Droit, Arnaud; Hunter, Joanna M; Rouleau, Michèle; Ethier, Chantal; Picard-Cloutier, Aude; Bourgais, David; Poirier, Guy G
2007-01-01
Background: In the "post-genome" era, mass spectrometry (MS) has become an important method for the analysis of proteins, and the rapid advancement of this technique, in combination with other proteomics methods, results in an increasing amount of proteome data. This data must be archived and analysed using specialized bioinformatics tools. Description: We herein describe "PARPs database," a data analysis and management pipeline for liquid chromatography tandem mass spectrometry (LC-MS/MS) proteomics. PARPs database is a web-based tool whose features include experiment annotation, protein database searching, protein sequence management, as well as data-mining of the peptides and proteins identified. Conclusion: Using this pipeline, we have successfully identified several interactions of biological significance between PARP-1 and other proteins, namely RFC-1, 2, 3, 4 and 5. PMID:18093328
Hermjakob, Henning; Montecchi-Palazzi, Luisa; Bader, Gary; Wojcik, Jérôme; Salwinski, Lukasz; Ceol, Arnaud; Moore, Susan; Orchard, Sandra; Sarkans, Ugis; von Mering, Christian; Roechert, Bernd; Poux, Sylvain; Jung, Eva; Mersch, Henning; Kersey, Paul; Lappe, Michael; Li, Yixue; Zeng, Rong; Rana, Debashis; Nikolski, Macha; Husi, Holger; Brun, Christine; Shanker, K; Grant, Seth G N; Sander, Chris; Bork, Peer; Zhu, Weimin; Pandey, Akhilesh; Brazma, Alvis; Jacq, Bernard; Vidal, Marc; Sherman, David; Legrain, Pierre; Cesareni, Gianni; Xenarios, Ioannis; Eisenberg, David; Steipe, Boris; Hogue, Chris; Apweiler, Rolf
2004-02-01
A major goal of proteomics is the complete description of the protein interaction network underlying cell physiology. A large number of small scale and, more recently, large-scale experiments have contributed to expanding our understanding of the nature of the interaction network. However, the necessary data integration across experiments is currently hampered by the fragmentation of publicly available protein interaction data, which exists in different formats in databases, on authors' websites or sometimes only in print publications. Here, we propose a community standard data model for the representation and exchange of protein interaction data. This data model has been jointly developed by members of the Proteomics Standards Initiative (PSI), a work group of the Human Proteome Organization (HUPO), and is supported by major protein interaction data providers, in particular the Biomolecular Interaction Network Database (BIND), Cellzome (Heidelberg, Germany), the Database of Interacting Proteins (DIP), Dana Farber Cancer Institute (Boston, MA, USA), the Human Protein Reference Database (HPRD), Hybrigenics (Paris, France), the European Bioinformatics Institute's (EMBL-EBI, Hinxton, UK) IntAct, the Molecular Interactions (MINT, Rome, Italy) database, the Protein-Protein Interaction Database (PPID, Edinburgh, UK) and the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING, EMBL, Heidelberg, Germany).
Senachak, Jittisak; Cheevadhanarak, Supapon; Hongsthong, Apiradee
2015-07-29
Spirulina (Arthrospira) platensis is the only cyanobacterium that, in addition to being studied at the molecular level and subjected to gene manipulation, can also be mass cultivated in outdoor ponds for commercial use as a food supplement. Thus, encountering environmental changes, including temperature stresses, is common during the mass production of Spirulina. The use of cyanobacteria as an experimental platform, especially for photosynthetic gene manipulation in plants and bacteria, is becoming increasingly important. Understanding the mechanisms and protein-protein interaction networks that underlie low- and high-temperature responses is relevant to Spirulina mass production. To accomplish this goal, high-throughput techniques such as OMICs analyses are used. Thus, large datasets must be collected, managed and subjected to information extraction. Therefore, databases including (i) proteomic analysis and protein-protein interaction (PPI) data and (ii) domain/motif visualization tools are required for potential use in temperature response models for plant chloroplasts and photosynthetic bacteria. A web-based repository was developed including an embedded database, SpirPro, and tools for network visualization. Proteome data were analyzed and integrated with protein-protein interactions and/or metabolic pathways from KEGG. The repository provides a range of information, from raw data (2D-gel images) to associated results, such as data from interaction and/or pathway analyses. This integration allows in silico analyses of protein-protein interactions affected at the metabolic level and, particularly, analyses of interactions between and within the affected metabolic pathways under temperature stresses for comparative proteomic analysis. The developed tool, which is coded in HTML with CSS/JavaScript and depicted in Scalable Vector Graphics (SVG), is designed for interactive analysis and exploration of the constructed network.
SpirPro is publicly available on the web at http://spirpro.sbi.kmutt.ac.th . SpirPro is an analysis platform containing an integrated proteome and PPI database that provides the most comprehensive data on this cyanobacterium at the systematic level. As an integrated database, SpirPro can be applied in various analyses, such as temperature stress response networking analysis in cyanobacterial models and interacting domain-domain analysis between proteins of interest.
Computer applications making rapid advances in high throughput microbial proteomics (HTMP).
Anandkumar, Balakrishna; Haga, Steve W; Wu, Hui-Fen
2014-02-01
The last few decades have seen the rise of widely available proteomics tools. From new data acquisition devices, such as MALDI-MS and 2DE, to new database-searching software, these products have paved the way for high throughput microbial proteomics (HTMP). These tools are enabling researchers to gain new insights into microbial metabolism, and are opening up new areas of study, such as the discovery of protein-protein interactions (interactomics). Computer software is a key part of these emerging fields. This review considers: 1) software tools for identifying the proteome, such as MASCOT or PDQuest, 2) online databases of proteomes, such as SWISS-PROT, Proteome Web, or the Proteomics Facility of the Pathogen Functional Genomics Resource Center, and 3) software tools for applying proteomic data, such as PSI-BLAST or VESPA. These tools allow for research in network biology, protein identification, functional annotation, target identification/validation, protein expression, protein structural analysis, metabolic pathway engineering and drug discovery.
SAFE Software and FED Database to Uncover Protein-Protein Interactions using Gene Fusion Analysis
Tsagrasoulis, Dimosthenis; Danos, Vasilis; Kissa, Maria; Trimpalis, Philip; Koumandou, V. Lila; Karagouni, Amalia D.; Tsakalidis, Athanasios; Kossida, Sophia
2012-01-01
Domain Fusion Analysis takes advantage of the fact that certain proteins in a given proteome A are found to have statistically significant similarity with two separate proteins in another proteome B. In other words, the result of a fusion event between two separate proteins in proteome B is a specific full-length protein in proteome A. In such a case, it can be safely concluded that the protein pair has a common biological function or even interacts physically. In this paper, we present the Fusion Events Database (FED), a database for the maintenance and retrieval of fusion data in both prokaryotic and eukaryotic organisms, and the Software for the Analysis of Fusion Events (SAFE), a computational platform implemented for the automated detection, filtering and visualization of fusion events (both available at: http://www.bioacademy.gr/bioinformatics/projects/ProteinFusion/index.htm). Finally, we analyze the proteomes of three microorganisms using these tools in order to demonstrate their functionality. PMID:22267904
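The fusion criterion described in this abstract can be illustrated with a minimal Python sketch. It is not the SAFE implementation: the input format (tuples of composite protein, component protein, and aligned start/end positions on the composite), the function name `find_fusion_candidates`, and the rule that the two aligned regions must not overlap are all assumptions made for illustration. The hits are assumed to have already passed a significance filter in an upstream similarity search.

```python
from collections import defaultdict
from itertools import combinations


def find_fusion_candidates(hits, max_overlap=0):
    """Return (composite, component_1, component_2) fusion candidates.

    hits: iterable of (composite_id, component_id, start, end), where
    start/end are 1-based positions on the composite protein (proteome A)
    covered by a significant alignment to a component protein (proteome B).
    Two components are reported when their aligned regions on the same
    composite overlap by at most max_overlap residues (hypothetical rule).
    """
    by_composite = defaultdict(list)
    for composite, component, start, end in hits:
        by_composite[composite].append((component, start, end))

    candidates = []
    for composite, regions in by_composite.items():
        # Sort by start position so pairs are reported left to right.
        ordered = sorted(regions, key=lambda r: r[1])
        for (c1, s1, e1), (c2, s2, e2) in combinations(ordered, 2):
            if c1 == c2:
                continue  # same component aligned twice is not a fusion
            overlap = min(e1, e2) - max(s1, s2) + 1
            if overlap <= max_overlap:
                candidates.append((composite, c1, c2))
    return candidates
```

With hits such as `("A1", "B1", 1, 200)` and `("A1", "B2", 210, 400)`, the pair B1/B2 is reported as a candidate because the two proteins map to distinct regions of A1; two hits covering largely the same region of a composite are skipped.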
Perez-Riverol, Yasset; Alpi, Emanuele; Wang, Rui; Hermjakob, Henning; Vizcaíno, Juan Antonio
2015-01-01
Compared to other data-intensive disciplines such as genomics, public deposition and storage of MS-based proteomics data are still less developed due to, among other reasons, the inherent complexity of the data and the variety of data types and experimental workflows. In order to address this need, several public repositories for MS proteomics experiments have been developed, each with different purposes in mind. The most established resources are the Global Proteome Machine Database (GPMDB), PeptideAtlas, and the PRIDE database. Additionally, there are other useful (in many cases recently developed) resources such as ProteomicsDB, Mass Spectrometry Interactive Virtual Environment (MassIVE), Chorus, MaxQB, PeptideAtlas SRM Experiment Library (PASSEL), Model Organism Protein Expression Database (MOPED), and the Human Proteinpedia. In addition, the ProteomeXchange consortium has been recently developed to enable better integration of public repositories and the coordinated sharing of proteomics information, maximizing its benefit to the scientific community. Here, we will review each of the major proteomics resources independently and some tools that enable the integration, mining and reuse of the data. We will also discuss some of the major challenges and current pitfalls in the integration and sharing of the data. PMID:25158685
P2P proteomics -- data sharing for enhanced protein identification
2012-01-01
Background: In order to tackle the important and challenging problem in proteomics of identifying known and new protein sequences using high-throughput methods, we propose a data-sharing platform that uses fully distributed P2P technologies to share specifications of peer-interaction protocols and service components. By using such a platform, information to be searched is no longer centralised in a few repositories but gathered from experiments in peer proteomics laboratories, which can subsequently be searched by fellow researchers. Methods: The system distributively runs a data-sharing protocol specified in the Lightweight Communication Calculus underlying the system through which researchers interact via message passing. For this, researchers interact with the system through particular components that link to database querying systems based on BLAST and/or OMSSA and GUI-based visualisation environments. We have tested the proposed platform with data drawn from preexisting MS/MS data reservoirs from the 2006 ABRF (Association of Biomolecular Resource Facilities) test sample, which was extensively tested during the ABRF Proteomics Standards Research Group 2006 worldwide survey. In particular we have taken the data available from a subset of proteomics laboratories of Spain's National Institute for Proteomics, ProteoRed, a network for the coordination, integration and development of the Spanish proteomics facilities. Results and Discussion: We performed queries against nine databases including seven ProteoRed proteomics laboratories, the NCBI Swiss-Prot database and the local database of the CSIC/UAB Proteomics Laboratory. A detailed analysis of the results indicated the presence of a protein that was supported by other NCBI matches and highly scored matches in several proteomics labs. The analysis clearly indicated that the protein was a relatively highly concentrated contaminant that could be present in the ABRF sample.
This fact is evident from the information that could be derived from the proposed P2P proteomics system; however, it is not straightforward to arrive at the same conclusion by conventional means, as it is difficult to discard organic contamination of samples. The actual presence of this contaminant was only stated after the ABRF study of all the identifications reported by the laboratories. PMID:22293032
Role for protein–protein interaction databases in human genetics
Pattin, Kristine A; Moore, Jason H
2010-01-01
Proteomics and the study of protein–protein interactions are becoming increasingly important in our effort to understand human diseases on a system-wide level. Thanks to the development and curation of protein-interaction databases, up-to-date information on these interaction networks is accessible and publicly available to the scientific community. As our knowledge of protein–protein interactions increases, it is important to give thought to the different ways that these resources can impact biomedical research. In this article, we highlight the importance of protein–protein interactions in human genetics and genetic epidemiology. Since protein–protein interactions demonstrate one of the strongest functional relationships between genes, combining genomic data with available proteomic data may provide us with a more in-depth understanding of common human diseases. In this review, we will discuss some of the fundamentals of protein interactions, the databases that are publicly available and how information from these databases can be used to facilitate genome-wide genetic studies. PMID:19929610
Hepatic SILAC proteomic data from PANDER transgenic model.
Athanason, Mark G; Stevens, Stanley M; Burkhardt, Brant R
2016-12-01
This article contains raw and processed data related to research published in "Quantitative Proteomic Profiling Reveals Hepatic Lipogenesis and Liver X Receptor Activation in the PANDER Transgenic Model" (M.G. Athanason, W.A. Ratliff, D. Chaput, C.B. MarElia, M.N. Kuehl, S.M. Stevens Jr., B.R. Burkhardt (2016)) [1], and was generated by "spike-in" SILAC-based proteomic analysis of livers obtained from the PANcreatic-Derived factor (PANDER) transgenic mouse (PANTG) under various metabolic conditions [1]. The mass spectrometry output of the PANTG and wild-type B6SJLF mouse liver tissue and the resulting proteome search from MaxQuant 1.2.2.5, employing the Andromeda search algorithm against the UniProtKB reference database for Mus musculus, have been deposited to the ProteomeXchange Consortium (http://www.proteomexchange.org) via the PRIDE partner repository with dataset identifiers PRIDE: PXD004171 and doi:10.6019/PXD004171. Protein ratio values representing PANTG/wild-type obtained by MaxQuant analysis were input into the Perseus processing suite to determine statistical significance using the Significance A outlier test (p<0.05). Proteins identified as differentially expressed by this approach were input into Ingenuity Pathway Analysis to determine the pathways and upstream regulators that were altered in PANTG mice.
Genome-Wide Prediction and Validation of Peptides That Bind Human Prosurvival Bcl-2 Proteins
DeBartolo, Joe; Taipale, Mikko; Keating, Amy E.
2014-01-01
Programmed cell death is regulated by interactions between pro-apoptotic and prosurvival members of the Bcl-2 family. Pro-apoptotic family members contain a weakly conserved BH3 motif that can adopt an alpha-helical structure and bind to a groove on prosurvival partners Bcl-xL, Bcl-w, Bcl-2, Mcl-1 and Bfl-1. Peptides corresponding to roughly 13 reported BH3 motifs have been verified to bind in this manner. Due to their short lengths and low sequence conservation, BH3 motifs are not detected using standard sequence-based bioinformatics approaches. Thus, it is possible that many additional proteins harbor BH3-like sequences that can mediate interactions with the Bcl-2 family. In this work, we used structure-based and data-based Bcl-2 interaction models to find new BH3-like peptides in the human proteome. We used peptide SPOT arrays to test candidate peptides for interaction with one or more of the prosurvival proteins Bcl-xL, Bcl-w, Bcl-2, Mcl-1 and Bfl-1. For the 36 most promising array candidates, we quantified binding to all five human receptors using direct and competition binding assays in solution. All 36 peptides showed evidence of interaction with at least one prosurvival protein, and 22 peptides bound at least one prosurvival protein with a dissociation constant between 1 and 500 nM; many peptides had specificity profiles not previously observed. We also screened the full-length parent proteins of a subset of array-tested peptides for binding to Bcl-xL and Mcl-1. Finally, we used the peptide binding data, in conjunction with previously reported interactions, to assess the affinity and specificity prediction performance of different models. PMID:24967846
Shrivastava, Amulya Nidhi; Redeker, Virginie; Fritz, Nicolas; Pieri, Laura; Almeida, Leandro G.; Spolidoro, Maria; Liebmann, Thomas; Bousset, Luc; Renner, Marianne; Léna, Clément; Aperia, Anita; Melki, Ronald; Triller, Antoine
2016-01-01
α-Synuclein (α-syn) is the principal component of Lewy bodies, the pathophysiological hallmark of individuals affected by Parkinson disease (PD). This neuropathologic form of α-syn contributes to PD progression and propagation of α-syn assemblies between neurons. The data we present here support the proteomic analysis used to identify neuronal proteins that specifically interact with extracellularly applied oligomeric or fibrillar α-syn assemblies (conditions 1 and 2, respectively) (doi: 10.15252/embj.201591397 [1]). α-syn assemblies and their cellular partner proteins were pulled down from neuronal cells lysed shortly after exposure to exogenous α-syn assemblies, and the associated proteins were identified by mass spectrometry using a shotgun proteomic-based approach. We also performed experiments on pure cultures of astrocytes to identify astrocyte-specific proteins interacting with oligomeric or fibrillar α-syn (conditions 3 and 4, respectively). For each condition, proteins interacting selectively with α-syn assemblies were identified by comparison to proteins pulled down from untreated cells used as controls. The mass spectrometry data, the database search and the peak lists have been deposited to the ProteomeXchange Consortium database via the PRIDE partner repository with the dataset identifiers PRIDE: PXD002256 to PRIDE: PXD002263 and doi: 10.6019/PXD002256 to 10.6019/PXD002263. PMID:26958642
Rice proteome analysis: a step toward functional analysis of the rice genome.
Komatsu, Setsuko; Tanaka, Naoki
2005-03-01
The technique of proteome analysis using 2-DE has the power to monitor global changes that occur in the protein complement of tissues and subcellular compartments. In this review, we describe construction of the rice proteome database, the cataloging of rice proteins, and the functional characterization of some of the proteins identified. Initially, proteins extracted from various tissues and organelles were separated by 2-DE and an image analyzer was used to construct a display or reference map of the proteins. The rice proteome database currently contains 23 reference maps based on 2-DE of proteins from different rice tissues and subcellular compartments. These reference maps comprise 13 129 rice proteins, and the amino acid sequences of 5092 of these proteins are entered in the database. Major proteins involved in growth or stress responses have been identified by using a proteomics approach and some of these proteins have unique functions. Furthermore, initial work has also begun on analyzing the phosphoproteome and protein-protein interactions in rice. The information obtained from the rice proteome database will aid in the molecular cloning of rice genes and in predicting the function of unknown proteins.
The Proteome of Seed Development in the Model Legume Lotus japonicus
Dam, Svend; Laursen, Brian S.; Ørnfelt, Jane H.; Jochimsen, Bjarne; Stærfeldt, Hans Henrik; Friis, Carsten; Nielsen, Kasper; Goffard, Nicolas; Besenbacher, Søren; Krusell, Lene; Sato, Shusei; Tabata, Satoshi; Thøgersen, Ida B.; Enghild, Jan J.; Stougaard, Jens
2009-01-01
We have characterized the development of seeds in the model legume Lotus japonicus. Like soybean (Glycine max) and pea (Pisum sativum), Lotus develops straight seed pods and each pod contains approximately 20 seeds that reach maturity within 40 days. Histological sections show the characteristic three developmental phases of legume seeds and the presence of embryo, endosperm, and seed coat in desiccated seeds. Furthermore, protein, oil, starch, phytic acid, and ash contents were determined, and this indicates that the composition of mature Lotus seed is more similar to soybean than to pea. In a first attempt to determine the seed proteome, both a two-dimensional polyacrylamide gel electrophoresis approach and a gel-based liquid chromatography-mass spectrometry approach were used. Globulins were analyzed by two-dimensional polyacrylamide gel electrophoresis, and five legumins, LLP1 to LLP5, and two convicilins, LCP1 and LCP2, were identified by matrix-assisted laser desorption ionization quadrupole/time-of-flight mass spectrometry. For two distinct developmental phases, seed filling and desiccation, a gel-based liquid chromatography-mass spectrometry approach was used, and 665 and 181 unique proteins corresponding to gene accession numbers were identified for the two phases, respectively. All of the proteome data, including the experimental data and mass spectrometry spectra peaks, were collected in a database that is available to the scientific community via a Web interface (http://www.cbs.dtu.dk/cgi-bin/lotus/db.cgi). This database establishes the basis for relating physiology, biochemistry, and regulation of seed development in Lotus. Together with a new Web interface (http://bioinfoserver.rsbs.anu.edu.au/utils/PathExpress4legumes/) collecting all protein identifications for Lotus, Medicago, and soybean seed proteomes, this database is a valuable resource for comparative seed proteomics and pathway analysis within and beyond the legume family. PMID:19129418
Proteome analysis of the Mycobacterium tuberculosis Beijing B0/W148 cluster
Bespyatykh, Julia; Shitikov, Egor; Butenko, Ivan; Altukhov, Ilya; Alexeev, Dmitry; Mokrousov, Igor; Dogonadze, Marine; Zhuravlev, Viacheslav; Yablonsky, Peter; Ilina, Elena; Govorun, Vadim
2016-01-01
Beijing B0/W148, a “successful” clone of Mycobacterium tuberculosis, is widespread in the Russian Federation and some countries of the former Soviet Union. Here, we used label-free gel-LC-MS/MS shotgun proteomics to discover features of Beijing B0/W148 strains that could explain their success. Qualitative and quantitative proteome analyses of Beijing B0/W148 strains allowed us to identify 1,868 proteins, including 266 that were differentially abundant compared with the control strain H37Rv. To predict the biological effects of the observed differences in protein abundances, we performed Gene Ontology analysis together with analysis of protein-DNA interactions using a gene regulatory network. Our results demonstrate that Beijing B0/W148 strains have increased levels of enzymes responsible for long-chain fatty acid biosynthesis, along with a coincident decrease in the abundance of proteins responsible for their degradation. Together with high levels of HsaA (Rv3570c) protein, involved in steroid degradation, these findings provide a possible explanation for the increased transmissibility of Beijing B0/W148 strains and their survival in host macrophages. Among other findings, we confirmed a very low level of the SseA (Rv3283) protein in Beijing B0/W148, characteristic of all “modern” Beijing strains, which could lead to increased DNA oxidative damage and accumulation of mutations, and potentially facilitate the development of drug resistance. PMID:27356881
The online Tabloid Proteome: an annotated database of protein associations
Turan, Demet; Tavernier, Jan
2018-01-01
A complete knowledge of the proteome can only be attained by determining the associations between proteins, along with the nature of these associations (e.g. physical contact in protein–protein interactions, participation in complex formation or different roles in the same pathway). Despite extensive efforts in elucidating direct protein interactions, our knowledge of the complete spectrum of protein associations remains limited. We therefore developed a new approach that detects protein associations from identifications obtained after re-processing of large-scale, public mass spectrometry-based proteomics data. Our approach infers protein association based on the co-occurrence of proteins across many different proteomics experiments, and provides information that is almost completely complementary to traditional direct protein interaction studies. We here present a web interface to query and explore the associations derived from this method, called the online Tabloid Proteome. The online Tabloid Proteome also integrates biological knowledge from several existing resources to annotate our derived protein associations. The online Tabloid Proteome is freely available through a user-friendly web interface, which provides intuitive navigation and data exploration options for the user at http://iomics.ugent.be/tabloidproteome. PMID:29040688
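The idea of inferring associations from co-occurrence across experiments can be sketched in a few lines of Python. The abstract does not state which statistic the Tabloid Proteome uses, so the Jaccard index below, the `min_shared` threshold, and the function name `cooccurrence_scores` are assumptions chosen only to make the principle concrete.

```python
from itertools import combinations


def cooccurrence_scores(experiments, min_shared=2):
    """Score protein pairs by how consistently they are identified together.

    experiments: list of sets, each holding the protein identifiers found
    in one experiment. Returns {(p, q): jaccard} for every pair seen
    together in at least min_shared experiments, where jaccard is
    |experiments with both| / |experiments with either|.
    """
    # Map each protein to the indices of the experiments that contain it.
    occurrences = {}
    for index, proteins in enumerate(experiments):
        for protein in proteins:
            occurrences.setdefault(protein, set()).add(index)

    scores = {}
    for p, q in combinations(sorted(occurrences), 2):
        shared = occurrences[p] & occurrences[q]
        if len(shared) >= min_shared:
            either = occurrences[p] | occurrences[q]
            scores[(p, q)] = len(shared) / len(either)
    return scores
```

A pair that appears together in every experiment where either protein is seen scores 1.0, while proteins that only occasionally coincide score close to 0. The all-pairs loop is quadratic in the number of proteins, so a real pipeline over thousands of experiments would need a more scalable formulation.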
DOE Office of Scientific and Technical Information (OSTI.GOV)
Weckwerth, Wolfram; Baginsky, Sacha; Van Wijk, Klass
2009-12-01
In the past 10 years, we have witnessed remarkable advances in the field of plant molecular biology. The rapid development of proteomic technologies and the speed with which these techniques have been applied to the field have altered our perception of how we can analyze proteins in complex systems. At nearly the same time, the complete genome for the model plant Arabidopsis thaliana was released; this effort provides an unsurpassed resource for the identification of proteins when researchers use MS to analyze plant samples. Recognizing the growth in this area, the Multinational Arabidopsis Steering Committee (MASC) established a subcommittee for A. thaliana proteomics in 2006 with the objective of consolidating databases, technique standards, and experimentally validated candidate genes and functions. Since the establishment of the Multinational Arabidopsis Steering Subcommittee for Proteomics (MASCP), many new approaches and resources have become available. Recently, the subcommittee established a webpage to consolidate this information (www.masc-proteomics.org). It includes links to plant proteomic databases, general information about proteomic techniques, meeting information, a summary of proteomic standards, and other relevant resources. Altogether, this website provides a useful resource for the Arabidopsis proteomics community. In the future, the website will host discussions and investigate the cross-linking of databases. The subcommittee members have extensive experience in Arabidopsis proteomics and collectively have produced some of the most extensive proteomics data sets for this model plant (Table S1 in the Supporting Information has a list of resources). The largest collection of proteomics data from a single study in A.
thaliana was assembled into an accessible database (AtProteome; http://fgcz-atproteome.unizh.ch/index.php) and was recently published by the Baginsky lab [1]. The database provides links to major Arabidopsis online resources, and raw data have been deposited in PRIDE and PRIDE BioMart. Included in this database is an Arabidopsis proteome map that provides evidence for the expression of approximately 50% of all predicted gene models, including several alternative gene models that are not represented in The Arabidopsis Information Resource (TAIR) protein database. A set of organ-specific biomarkers is provided, as well as organ-specific proteotypic peptides for 4105 proteins that can be used to facilitate targeted quantitative proteomic surveys. In the future, the AtProteome database will be linked to additional existing resources developed by MASCP members, such as PPDB, ProMEX, and SUBA. The most comprehensive study on the Arabidopsis chloroplast proteome, which includes information on chloroplast sorting signals, posttranslational modifications (PTMs), and protein abundances (analyzed by high-accuracy MS [Orbitrap]), was recently published by the van Wijk lab [2]. These and previous data are available via the plant proteome database (PPDB; http://ppdb.tc.cornell.edu) for A. thaliana and maize. PPDB provides genome-wide experimental and functional characterization of the A. thaliana and maize proteomes, including PTMs and subcellular localization information, with an emphasis on leaf and plastid proteins. Maize and Arabidopsis proteome entries are directly linked via internal BLAST alignments within PPDB. Direct links for each protein to TAIR, SUBA, ProMEX, and other resources are also provided.
Simons, Margaret; Saha, Rajib; Amiour, Nardjis; Kumar, Akhil; Guillard, Lenaïg; Clément, Gilles; Miquel, Martine; Li, Zhenni; Mouille, Gregory; Lea, Peter J.; Hirel, Bertrand; Maranas, Costas D.
2014-01-01
Maize (Zea mays) is an important C4 plant due to its widespread use as a cereal and energy crop. A second-generation genome-scale metabolic model for the maize leaf was created to capture C4 carbon fixation and investigate nitrogen (N) assimilation by modeling the interactions between the bundle sheath and mesophyll cells. The model contains gene-protein-reaction relationships, elemental and charge-balanced reactions, and incorporates experimental evidence pertaining to the biomass composition, compartmentalization, and flux constraints. Condition-specific biomass descriptions were introduced that account for amino acids, fatty acids, soluble sugars, proteins, chlorophyll, lignocellulose, and nucleic acids as experimentally measured biomass constituents. Compartmentalization of the model is based on proteomic/transcriptomic data and literature evidence. With the incorporation of information from the MetaCrop and MaizeCyc databases, this updated model spans 5,824 genes, 8,525 reactions, and 9,153 metabolites, an increase of approximately 4 times the size of the earlier iRS1563 model. Transcriptomic and proteomic data have also been used to introduce regulatory constraints in the model to simulate an N-limited condition and mutants deficient in glutamine synthetase, gln1-3 and gln1-4. Model-predicted results achieved 90% accuracy when comparing the wild type grown under an N-complete condition with the wild type grown under an N-deficient condition. PMID:25248718
Uddin, Reaz; Jamil, Faiza
2018-06-01
Pseudomonas aeruginosa is an opportunistic gram-negative bacterium that has the capability to acquire resistance under hostile conditions and become a threat worldwide. It is involved in nosocomial infections. In the current study, potential novel drug targets against P. aeruginosa have been identified using core proteomic analysis and Protein-Protein Interactions (PPIs) studies. The non-redundant reference proteome of 68 strains having complete genome and latest assembly version of P. aeruginosa were downloaded from ftp NCBI RefSeq server in October 2016. The standalone CD-HIT tool was used to cluster ortholog proteins (having >=80% amino acid identity) present in all strains. The pan-proteome was clustered in 12,380 Clusters of Orthologous Proteins (COPs). By using in-house shell scripts, 3252 common COPs were extracted out and designated as clusters of core proteome. The core proteome of PAO1 strain was selected by fetching PAO1's proteome from common COPs. As a result, 1212 proteins were shortlisted that are non-homologous to the human but essential for the survival of the pathogen. Among these 1212 proteins, 321 proteins are conserved hypothetical proteins. Considering their potential as drug target, those 321 hypothetical proteins were selected and their probable functions were characterized. Based on the druggability criteria, 18 proteins were shortlisted. The interacting partners were identified by investigating the PPIs network using STRING v10 database. Subsequently, 8 proteins were shortlisted as 'hub proteins' and proposed as potential novel drug targets against P. aeruginosa. The study is interesting for the scientific community working to identify novel drug targets against MDR pathogens particularly P. aeruginosa. Copyright © 2018 Elsevier Ltd. All rights reserved.
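The core-proteome step in this abstract (keeping only the clusters that have members in every strain) can be pictured as a subset check per cluster. The sketch below is only an illustration under stated assumptions: it is not the authors' in-house shell scripts, the function name `core_clusters` is hypothetical, and the cluster IDs and the three-strain example are made-up toy data rather than results from the study.

```python
from typing import Dict, List, Set


def core_clusters(cluster_strains: Dict[str, Set[str]],
                  all_strains: Set[str]) -> List[str]:
    """Return the IDs of clusters that contain a protein from every strain.

    cluster_strains maps a cluster ID to the set of strains that contribute
    at least one protein to that cluster (hypothetical input layout).
    """
    return sorted(
        cluster_id
        for cluster_id, present in cluster_strains.items()
        if all_strains <= present  # every strain is represented
    )


# Toy data: three strains, three clusters (illustrative only).
strains = {"PAO1", "strainB", "strainC"}
clusters = {
    "COP1": {"PAO1", "strainB", "strainC"},
    "COP2": {"PAO1", "strainB"},
    "COP3": {"PAO1", "strainB", "strainC"},
}
print(core_clusters(clusters, strains))
```

With real data, `cluster_strains` would be built by parsing the clustering output and recording which strain each member protein came from.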
A Community Standard Format for the Representation of Protein Affinity Reagents
Gloriam, David E.; Orchard, Sandra; Bertinetti, Daniela; Björling, Erik; Bongcam-Rudloff, Erik; Borrebaeck, Carl A. K.; Bourbeillon, Julie; Bradbury, Andrew R. M.; de Daruvar, Antoine; Dübel, Stefan; Frank, Ronald; Gibson, Toby J.; Gold, Larry; Haslam, Niall; Herberg, Friedrich W.; Hiltke, Tara; Hoheisel, Jörg D.; Kerrien, Samuel; Koegl, Manfred; Konthur, Zoltán; Korn, Bernhard; Landegren, Ulf; Montecchi-Palazzi, Luisa; Palcy, Sandrine; Rodriguez, Henry; Schweinsberg, Sonja; Sievert, Volker; Stoevesandt, Oda; Taussig, Michael J.; Ueffing, Marius; Uhlén, Mathias; van der Maarel, Silvère; Wingren, Christer; Woollard, Peter; Sherman, David J.; Hermjakob, Henning
2010-01-01
Protein affinity reagents (PARs), most commonly antibodies, are essential reagents for protein characterization in basic research, biotechnology, and diagnostics as well as the fastest growing class of therapeutics. Large numbers of PARs are available commercially; however, their quality is often uncertain. In addition, currently available PARs cover only a fraction of the human proteome, and their cost is prohibitive for proteome scale applications. This situation has triggered several initiatives involving large scale generation and validation of antibodies, for example the Swedish Human Protein Atlas and the German Antibody Factory. Antibodies targeting specific subproteomes are being pursued by members of Human Proteome Organisation (plasma and liver proteome projects) and the United States National Cancer Institute (cancer-associated antigens). ProteomeBinders, a European consortium, aims to set up a resource of consistently quality-controlled protein-binding reagents for the whole human proteome. An ultimate PAR database resource would allow consumers to visit one on-line warehouse and find all available affinity reagents from different providers together with documentation that facilitates easy comparison of their cost and quality. However, in contrast to, for example, nucleotide databases among which data are synchronized between the major data providers, current PAR producers, quality control centers, and commercial companies all use incompatible formats, hindering data exchange. Here we propose Proteomics Standards Initiative (PSI)-PAR as a global community standard format for the representation and exchange of protein affinity reagent data. The PSI-PAR format is maintained by the Human Proteome Organisation PSI and was developed within the context of ProteomeBinders by building on a mature proteomics standard format, PSI-molecular interaction, which is a widely accepted and established community standard for molecular interaction data. 
Further information and documentation are available on the PSI-PAR web site. PMID:19674966
Recent advances in proteomics of cereals.
Bansal, Monika; Sharma, Madhu; Kanwar, Priyanka; Goyal, Aakash
Cereals make up a major part of human nutrition and are an integral source of energy in human diets. With genomic databases already available for cereals such as rice, wheat, barley, and maize, the focus has now moved to proteome analysis. Proteomics studies involve developing suitable separation and purification protocols, building appropriate databases, identifying protein functions, and confirming functional networks using data already available from other sources. Tremendous progress has been made in the past decade in generating large data sets that cover interactions among proteins, the protein composition of various organs and organelles, and quantitative and qualitative analyses of proteins, and that characterize protein modulation during plant development and under biotic and abiotic stresses. Proteomics platforms have been used to identify various metabolic pathways and to improve our understanding of them. This article briefly reviews the efforts of different research groups in comparative, descriptive, and functional proteomics in cereal science so far.
NASA Astrophysics Data System (ADS)
Parviainen, Ville; Joenväärä, Sakari; Peltoniemi, Hannu; Mattila, Pirkko; Renkonen, Risto
2009-04-01
Mass spectrometry-based proteomic research has become one of the main methods in protein-protein interaction research. Several high throughput studies have established an interaction landscape of exponentially growing Baker's yeast culture. However, many of the protein-protein interactions are likely to change in different environmental conditions. In order to examine the dynamic nature of the protein interactions we isolated the protein complexes of mannose-1-phosphate guanyltransferase PSA1 from Saccharomyces cerevisiae at four different time points during batch cultivation. We used the tandem affinity purification (TAP)-method to purify the complexes and subjected the tryptic peptides to LC-MS/MS. The resulting peak lists were analyzed with two different methods: the database related protein identification program X!Tandem and the de novo sequencing program Lutefisk. We observed significant changes in the interactome of PSA1 during the batch cultivation and identified altogether 74 proteins interacting with PSA1 of which only six were found to interact during all time points. All the other proteins showed a more dynamic nature of binding activity. In this study we also demonstrate the benefit of using both database related and de novo methods in the protein interaction research to enhance both the quality and the quantity of observations.
A Database for Tracking Toxicogenomic Samples and Procedures with Genomic, Proteomic and Metabonomic Components
Wenjun Bao1, Jennifer Fostel2, Michael D. Waters2, B. Alex Merrick2, Drew Ekman3, Mitchell Kostich4, Judith Schmid1, David Dix1
Office of Research and Developmen...
Komatsu, Setsuko; Wang, Xin; Yin, Xiaojian; Nanjo, Yohei; Ohyanagi, Hajime; Sakata, Katsumi
2017-06-23
The Soybean Proteome Database (SPD) stores data on soybean proteins obtained with gel-based and gel-free proteomic techniques. The database was constructed to provide information on proteins for functional analyses. The majority of the data is focused on soybean (Glycine max 'Enrei'). The growth and yield of soybean are strongly affected by environmental stresses such as flooding. The database was originally constructed using data on soybean proteins separated by two-dimensional polyacrylamide gel electrophoresis, which is a gel-based proteomic technique. Since 2015, the database has been expanded to incorporate data obtained by label-free mass spectrometry-based quantitative proteomics, which is a gel-free proteomic technique. Here, the portions of the database consisting of gel-free proteomic data are described. The gel-free proteomic database contains 39,212 proteins identified in 63 sample sets, such as temporal and organ-specific samples of soybean plants grown under flooding stress or non-stressed conditions. In addition, data on organellar proteins identified in mitochondria, nuclei, and endoplasmic reticulum are stored. Furthermore, the database integrates multiple omics data such as genomics, transcriptomics, metabolomics, and proteomics. The SPD database is accessible at http://proteome.dc.affrc.go.jp/Soybean/. The Soybean Proteome Database stores data obtained from both gel-based and gel-free proteomic techniques. The gel-free proteomic database comprises 39,212 proteins identified in 63 sample sets, such as different organs of soybean plants grown under flooding stress or non-stressed conditions in a time-dependent manner. In addition, organellar proteins identified in mitochondria, nuclei, and endoplasmic reticulum are stored in the gel-free proteomics database. A total of 44,704 proteins, including 5490 proteins identified using a gel-based proteomic technique, are stored in the SPD. 
This accounts for approximately 80% of all proteins predicted from genome sequences, although some of the stored proteins overlap. Based on the demonstrated application of data stored in the database for functional analyses, it is suggested that these data will be useful for analyses of biological mechanisms in soybean. Furthermore, coupled with recent advances in information and communication technology, the usefulness of this database would increase in the analyses of biological mechanisms. Copyright © 2017 Elsevier B.V. All rights reserved.
MAPU: Max-Planck Unified database of organellar, cellular, tissue and body fluid proteomes
Zhang, Yanling; Zhang, Yong; Adachi, Jun; Olsen, Jesper V.; Shi, Rong; de Souza, Gustavo; Pasini, Erica; Foster, Leonard J.; Macek, Boris; Zougman, Alexandre; Kumar, Chanchal; Wiśniewski, Jacek R.; Jun, Wang; Mann, Matthias
2007-01-01
Mass spectrometry (MS)-based proteomics has become a powerful technology to map the protein composition of organelles, cell types and tissues. In our department, a large-scale effort to map these proteomes is complemented by the Max-Planck Unified (MAPU) proteome database. MAPU contains several body fluid proteomes; including plasma, urine, and cerebrospinal fluid. Cell lines have been mapped to a depth of several thousand proteins and the red blood cell proteome has also been analyzed in depth. The liver proteome is represented with 3200 proteins. By employing high resolution MS and stringent validation criteria, false positive identification rates in MAPU are lower than 1:1000. Thus MAPU datasets can serve as reference proteomes in biomarker discovery. MAPU contains the peptides identifying each protein, measured masses, scores and intensities and is freely available at http://www.mapuproteome.com using a clickable interface of cell or body parts. Proteome data can be queried across proteomes by protein name, accession number, sequence similarity, peptide sequence and annotation information. More than 4500 mouse and 2500 human proteins have already been identified in at least one proteome. Basic annotation information and links to other public databases are provided in MAPU and we plan to add further analysis tools. PMID:17090601
Zachut, M; Kra, G; Livshitz, L; Portnick, Y; Yakoby, S; Friedlander, G; Levin, Y
2017-03-31
Environmental heat stress and metabolic stress during transition from late gestation to lactation are main factors limiting production in dairy cattle, and there is a complex interaction between them. Many proteins expressed in adipose tissue are involved in metabolic responses to stress. We aimed to investigate the effects of seasonal heat stress on adipose proteome in late-pregnant cows, and to identify biomarkers of heat stress. Late pregnant cows during summer heat stress (S, n=18), or during the winter season (W, n=12) were used. Subcutaneous adipose tissue biopsies sampled 14 days prepartum from S (n=10) and W (n=8) were analyzed by intensity-based, label-free, quantitative shotgun proteomics (nano-LC-MS/MS). Plasma concentrations of malondialdehyde and cortisol were higher in S than in W cows. Proteomic analysis revealed that 107/1495 proteins were differentially abundant in S compared to W (P<0.05 and fold change of at least ±1.5). Top canonical pathways in S vs. W adipose were Nrf2-mediated oxidative stress response, acute-phase response, and FXR/RXR and LXR/RXR activation. Novel biomarkers of heat stress in adipose tissue were found. These findings indicate that seasonal heat stress has a unique effect on adipose tissue in late-pregnant cows. This work shows that seasonal heat stress increases plasma concentrations of the oxidative stress marker malondialdehyde and cortisol in transition dairy cows. As many proteins expressed in the adipose tissue are involved in metabolic responses to stress, we investigated the effects of heat stress on the proteome of adipose tissue from late-pregnant cows during summer or winter seasons. We demonstrated that heat stress enriches several stress-related pathways, such as the Nrf2-mediated oxidative stress response and the acute-phase response in adipose tissues.
Thus, environmental heat stress has a unique effect on adipose tissue in late-pregnant cows, as part of the regulatory adaptations to chronic heat load during the summer season. In addition, this study presents the widest available dataset of adipose tissue proteome in dairy cows, and revealed several novel biomarkers of heat stress in adipose tissue of dairy cows, the use of which awaits further validation. Copyright © 2017 Elsevier B.V. All rights reserved.
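The selection rule quoted in this abstract (P<0.05 and a fold change of at least ±1.5) can be written as a small filter. This is a hedged sketch, not the authors' analysis pipeline: it assumes that "±1.5" means a summer/winter abundance ratio of at least 1.5 or at most 1/1.5, and the class name `ProteinResult`, the function name `is_differential`, and the example values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProteinResult:
    name: str
    fold_change: float  # abundance ratio S/W, assumed to be > 0
    p_value: float


def is_differential(result: ProteinResult,
                    p_cutoff: float = 0.05,
                    fc_cutoff: float = 1.5) -> bool:
    """Apply both thresholds: significance and magnitude of change.

    Assumption: a "±1.5 fold change" is read as ratio >= 1.5 (higher in S)
    or ratio <= 1/1.5 (lower in S).
    """
    changed = (result.fold_change >= fc_cutoff
               or result.fold_change <= 1.0 / fc_cutoff)
    return result.p_value < p_cutoff and changed


# Made-up example values, not data from the study.
results = [
    ProteinResult("protein_a", 1.8, 0.01),   # up and significant
    ProteinResult("protein_b", 0.5, 0.04),   # down and significant
    ProteinResult("protein_c", 1.2, 0.001),  # significant, change too small
    ProteinResult("protein_d", 2.0, 0.20),   # large change, not significant
]
print([r.name for r in results if is_differential(r)])
```

Requiring both conditions keeps proteins whose change is both statistically supported and large enough to matter; either threshold alone would admit the last two example entries.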
Rice proteome database: a step toward functional analysis of the rice genome.
Komatsu, Setsuko
2005-09-01
The technique of proteome analysis using two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) has the power to monitor global changes that occur in the protein complement of tissues and subcellular compartments. In this study, the proteins of rice were cataloged, a rice proteome database was constructed, and a functional characterization of some of the identified proteins was undertaken. Proteins extracted from various tissues and subcellular compartments in rice were separated by 2D-PAGE and an image analyzer was used to construct a display of the proteins. The Rice Proteome Database contains 23 reference maps based on 2D-PAGE of proteins from various rice tissues and subcellular compartments. These reference maps comprise 13129 identified proteins, and the amino acid sequences of 5092 proteins are entered in the database. Major proteins involved in growth or stress responses were identified using the proteome approach. Some of these proteins, including a beta-tubulin, calreticulin, and ribulose-1,5-bisphosphate carboxylase/oxygenase activase in rice, have unexpected functions. The information obtained from the Rice Proteome Database will aid in cloning the genes for and predicting the function of unknown proteins.
MAPU: Max-Planck Unified database of organellar, cellular, tissue and body fluid proteomes.
Zhang, Yanling; Zhang, Yong; Adachi, Jun; Olsen, Jesper V; Shi, Rong; de Souza, Gustavo; Pasini, Erica; Foster, Leonard J; Macek, Boris; Zougman, Alexandre; Kumar, Chanchal; Wisniewski, Jacek R; Jun, Wang; Mann, Matthias
2007-01-01
Mass spectrometry (MS)-based proteomics has become a powerful technology to map the protein composition of organelles, cell types and tissues. In our department, a large-scale effort to map these proteomes is complemented by the Max-Planck Unified (MAPU) proteome database. MAPU contains several body fluid proteomes; including plasma, urine, and cerebrospinal fluid. Cell lines have been mapped to a depth of several thousand proteins and the red blood cell proteome has also been analyzed in depth. The liver proteome is represented with 3200 proteins. By employing high resolution MS and stringent validation criteria, false positive identification rates in MAPU are lower than 1:1000. Thus MAPU datasets can serve as reference proteomes in biomarker discovery. MAPU contains the peptides identifying each protein, measured masses, scores and intensities and is freely available at http://www.mapuproteome.com using a clickable interface of cell or body parts. Proteome data can be queried across proteomes by protein name, accession number, sequence similarity, peptide sequence and annotation information. More than 4500 mouse and 2500 human proteins have already been identified in at least one proteome. Basic annotation information and links to other public databases are provided in MAPU and we plan to add further analysis tools.
Proteomic analysis of the Theileria annulata schizont
Witschi, M.; Xia, D.; Sanderson, S.; Baumgartner, M.; Wastling, J.M.; Dobbelaere, D.A.E.
2013-01-01
The apicomplexan parasite, Theileria annulata, is the causative agent of tropical theileriosis, a devastating lymphoproliferative disease of cattle. The schizont stage transforms bovine leukocytes and provides an intriguing model to study host/pathogen interactions. The genome of T. annulata has been sequenced and transcriptomic data are rapidly accumulating. In contrast, little is known about the proteome of the schizont, the pathogenic, transforming life cycle stage of the parasite. Using one-dimensional (1-D) gel LC-MS/MS, a proteomic analysis of purified T. annulata schizonts was carried out. In whole parasite lysates, 645 proteins were identified. Proteins with transmembrane domains (TMDs) were under-represented and no proteins with more than four TMDs could be detected. To tackle this problem, Triton X-114 treatment was applied, which facilitates the extraction of membrane proteins, followed by 1-D gel LC-MS/MS. This resulted in the identification of an additional 153 proteins. Half of those had one or more TMD and 30 proteins with more than four TMDs were identified. This demonstrates that Triton X-114 treatment can provide a valuable additional tool for the identification of new membrane proteins in proteomic studies. With two exceptions, all proteins involved in glycolysis and the citric acid cycle were identified. For at least 29% of identified proteins, the corresponding transcripts were not present in the existing expressed sequence tag databases. The proteomics data were integrated into the publicly accessible database resource at EuPathDB (www.eupathdb.org) so that mass spectrometry-based protein expression evidence for T. annulata can be queried alongside transcriptional and other genomics data available for these parasites. PMID:23178997
Hall, Aaron Smalter; Shan, Yunfeng; Lushington, Gerald; Visvanathan, Mahesh
2016-01-01
Databases and exchange formats describing biological entities such as chemicals and proteins, along with their relationships, are a critical component of research in life sciences disciplines, including chemical biology wherein information about small molecule properties converges with cellular and molecular biology. Databases for storing biological entities are growing not only in size, but also in type, with many similarities between them and often subtle differences. The data formats available to describe and exchange these entities are numerous as well. In general, each format is optimized for a particular purpose or database, and hence some understanding of these formats is required when choosing one for research purposes. This paper reviews a selection of different databases and data formats with the goal of summarizing their purposes, features, and limitations. Databases are reviewed under the categories of 1) protein interactions, 2) metabolic pathways, 3) chemical interactions, and 4) drug discovery. Representation formats will be discussed according to those describing chemical structures, and those describing genomic/proteomic entities. PMID:22934944
Smalter Hall, Aaron; Shan, Yunfeng; Lushington, Gerald; Visvanathan, Mahesh
2013-03-01
Databases and exchange formats describing biological entities such as chemicals and proteins, along with their relationships, are a critical component of research in life sciences disciplines, including chemical biology wherein information about small molecule properties converges with cellular and molecular biology. Databases for storing biological entities are growing not only in size, but also in type, with many similarities between them and often subtle differences. The data formats available to describe and exchange these entities are numerous as well. In general, each format is optimized for a particular purpose or database, and hence some understanding of these formats is required when choosing one for research purposes. This paper reviews a selection of different databases and data formats with the goal of summarizing their purposes, features, and limitations. Databases are reviewed under the categories of 1) protein interactions, 2) metabolic pathways, 3) chemical interactions, and 4) drug discovery. Representation formats will be discussed according to those describing chemical structures, and those describing genomic/proteomic entities.
Wimmer, Helge; Gundacker, Nina C; Griss, Johannes; Haudek, Verena J; Stättner, Stefan; Mohr, Thomas; Zwickl, Hannes; Paulitschke, Verena; Baron, David M; Trittner, Wolfgang; Kubicek, Markus; Bayer, Editha; Slany, Astrid; Gerner, Christopher
2009-06-01
Interpretation of proteome data with a focus on biomarker discovery largely relies on comparative proteome analyses. Here, we introduce a database-assisted interpretation strategy based on proteome profiles of primary cells. Both 2-D-PAGE and shotgun proteomics are applied. We obtain high data concordance with these two different techniques. When applying mass analysis of tryptic spot digests from 2-D gels of cytoplasmic fractions, we typically identify several hundred proteins. Using the same protein fractions, we usually identify more than a thousand proteins by shotgun proteomics. The data consistency obtained when comparing these independent data sets exceeds 99% of the proteins identified in the 2-D gels. Many characteristic differences in protein expression of different cells can thus be independently confirmed. Our self-designed SQL database (CPL/MUW - database of the Clinical Proteomics Laboratories at the Medical University of Vienna accessible via www.meduniwien.ac.at/proteomics/database) facilitates (i) quality management of protein identification data, which are based on MS, (ii) the detection of cell type-specific proteins and (iii) of molecular signatures of specific functional cell states. Here, we demonstrate how the interpretation of proteome profiles obtained from human liver tissue and hepatocellular carcinoma tissue is assisted by the Clinical Proteomics Laboratories at the Medical University of Vienna-database. Therefore, we suggest that the use of reference experiments supported by a tailored database may substantially facilitate data interpretation of proteome profiling experiments.
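The consistency figure in this abstract is expressed relative to the proteins identified in the 2-D gels. One plausible way to compute such a number is the fraction of gel identifications that also appear in the shotgun data set; the sketch below assumes that reading. The function name `concordance` and the accession strings are hypothetical, and this is not the query logic of the CPL/MUW database.

```python
from typing import Iterable


def concordance(gel_ids: Iterable[str], shotgun_ids: Iterable[str]) -> float:
    """Fraction of 2-D gel identifications also found by shotgun proteomics."""
    gel = set(gel_ids)
    if not gel:
        raise ValueError("no 2-D gel identifications to compare")
    return len(gel & set(shotgun_ids)) / len(gel)


# Made-up accessions for illustration only.
gel_proteins = {"ACC1", "ACC2", "ACC3", "ACC4"}
shotgun_proteins = {"ACC1", "ACC2", "ACC3", "ACC9", "ACC10"}
print(concordance(gel_proteins, shotgun_proteins))
```

The measure is deliberately asymmetric: the shotgun set is expected to be larger, so the denominator is the smaller gel-based set.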
Xu, Huilei; Baroukh, Caroline; Dannenfelser, Ruth; Chen, Edward Y; Tan, Christopher M; Kou, Yan; Kim, Yujin E; Lemischka, Ihor R; Ma'ayan, Avi
2013-01-01
High content studies that profile mouse and human embryonic stem cells (m/hESCs) using various genome-wide technologies such as transcriptomics and proteomics are constantly being published. However, efforts to integrate such data to obtain a global view of the molecular circuitry in m/hESCs are lagging behind. Here, we present an m/hESC-centered database called Embryonic Stem Cell Atlas from Pluripotency Evidence integrating data from many recent diverse high-throughput studies including chromatin immunoprecipitation followed by deep sequencing, genome-wide inhibitory RNA screens, gene expression microarrays or RNA-seq after knockdown (KD) or overexpression of critical factors, immunoprecipitation followed by mass spectrometry proteomics and phosphoproteomics. The database provides web-based interactive search and visualization tools that can be used to build subnetworks and to identify known and novel regulatory interactions across various regulatory layers. The web-interface also includes tools to predict the effects of combinatorial KDs by additive effects controlled by sliders, or through simulation software implemented in MATLAB. Overall, the Embryonic Stem Cell Atlas from Pluripotency Evidence database is a comprehensive resource for the stem cell systems biology community. Database URL: http://www.maayanlab.net/ESCAPE
Biomarker Discovery and Mechanistic Studies of Prostate Cancer Using Targeted Proteomic Approaches
2012-07-01
1-0431 TITLE: Biomarker Discovery and Mechanistic Studies of Prostate Cancer Using Targeted Proteomic Approaches PRINCIPAL INVESTIGATOR...July 2012 2. REPORT TYPE Final 3. DATES COVERED (From - To) 1 July 2008 – 30 June 2012 4. TITLE AND SUBTITLE Biomarker Discovery and Mechanistic...Department of Defense Synergistic Idea Development Award W81XWH-08-1-0430 (to H.Z) and W81XWH-08-1-0431 (to N.K.), an NIH/NCRR COBRE grant 1P20RR020171 (to
FunRich proteomics software analysis, let the fun begin!
Benito-Martin, Alberto; Peinado, Héctor
2015-08-01
Protein MS analysis is the preferred method for unbiased protein identification. It is normally applied to a large number of both small-scale and high-throughput studies. However, user-friendly computational tools for protein analysis are still needed. In this issue, Mathivanan and colleagues (Proteomics 2015, 15, 2597-2601) report the development of FunRich software, an open-access software that facilitates the analysis of proteomics data, providing tools for functional enrichment and interaction network analysis of genes and proteins. FunRich is a reinterpretation of proteomic software, a standalone tool combining ease of use with customizable databases, free access, and graphical representations. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Construction of a nasopharyngeal carcinoma 2D/MS repository with Open Source XML database--Xindice.
Li, Feng; Li, Maoyu; Xiao, Zhiqiang; Zhang, Pengfei; Li, Jianling; Chen, Zhuchu
2006-01-11
Many proteomics initiatives require integration of all information with uniform criteria, from collection of samples and data display to publication of experimental results. Integrating and exchanging these data, which differ in format and structure, poses a great challenge. XML technology shows promise for handling this task because of its simplicity and flexibility. Nasopharyngeal carcinoma (NPC) is one of the most common cancers in southern China and Southeast Asia, and its incidence shows marked geographic and racial differences. Although some cancer proteome databases now exist, there is still no NPC proteome database. The raw NPC proteome experiment data were captured into one XML document with the Human Proteome Markup Language (HUP-ML) editor and imported into the native XML database Xindice. The 2D/MS repository of the NPC proteome was constructed with Apache, PHP and Xindice to provide access to the database via the Internet. On our website, two methods, keyword query and click query, are both provided to access the entries of the NPC proteome database. Our 2D/MS repository can be used to share the raw NPC proteomics data that are generated from gel-based proteomics experiments. The database, as well as the PHP source code for constructing users' own proteome repositories, can be accessed at http://www.xyproteomics.org/.
Proteome analysis of the fungus Aspergillus carbonarius under ochratoxin A producing conditions.
Crespo-Sempere, A; Gil, J V; Martínez-Culebras, P V
2011-06-30
Aspergillus carbonarius is an important ochratoxin A (OTA) producing fungus that is responsible for mycotoxin contamination of grapes and wine. In this study, the proteomes of highly (W04-40) and weakly (W04-46) OTA-producing A. carbonarius strains were compared to identify proteins that may be involved in OTA biosynthesis. Protein samples were extracted from two biological replicates and subjected to two-dimensional gel electrophoresis analysis and mass spectrometry. Expression profile comparison (PDQuest software) revealed 21 differential spots that were statistically significant and showed a two-fold or greater change in expression. Among these, nine protein spots were identified by MALDI-MS/MS and MASCOT database and twelve remain unidentified. Of the identified proteins, seven showed a higher expression in strain W04-40 (high OTA producer) and two in strain W04-46 (low OTA producer). Some of the identified amino acid sequences shared homology with proteins involved in regulation, amino acid metabolism, oxidative stress and sporulation. It is worth noting the presence of a protein with 126.5-fold higher abundance in strain W04-40 showing homology with protein CipC, a protein of unknown function that some authors have related to pathogenesis and mycotoxin production. Variations in protein expression were also further investigated at the mRNA level by real-time PCR analysis. The mRNA expression levels from three identified proteins including CipC showed correlation with protein expression levels. This study represents the first proteomic analysis for a comparison of two A. carbonarius strains with different OTA production and will contribute to a better understanding of the molecular events involved in OTA biosynthesis. Copyright © 2011 Elsevier B.V. All rights reserved.
Sys-BodyFluid: a systematical database for human body fluid proteome research
Li, Su-Jun; Peng, Mao; Li, Hong; Liu, Bo-Shu; Wang, Chuan; Wu, Jia-Rui; Li, Yi-Xue; Zeng, Rong
2009-01-01
Body fluids have recently become an important target for proteomic research, and proteomic studies have produced a growing volume of body fluid-related protein data. A database is needed to collect and analyze these proteome data. We therefore developed the web-based body fluid proteome database Sys-BodyFluid. It contains eleven kinds of body fluid proteomes: plasma/serum, urine, cerebrospinal fluid, saliva, bronchoalveolar lavage fluid, synovial fluid, nipple aspirate fluid, tear fluid, seminal fluid, human milk and amniotic fluid. Over 10,000 proteins are presented in Sys-BodyFluid. Sys-BodyFluid provides detailed protein annotations, including protein description, Gene Ontology, domain information, protein sequence and involved pathways. These proteome data can be retrieved by protein name, protein accession number and sequence similarity. In addition, users can query across the different body fluids to compare protein identification information. As a reference database, Sys-BodyFluid can facilitate body fluid proteomics and disease proteomics research. It is available at http://www.biosino.org/bodyfluid/. PMID:18978022
Bromilow, Sophie; Gethings, Lee A; Buckley, Mike; Bromley, Mike; Shewry, Peter R; Langridge, James I; Clare Mills, E N
2017-06-23
The unique physicochemical properties of wheat gluten enable a diverse range of food products to be manufactured. However, gluten triggers coeliac disease, a condition which is treated using a gluten-free diet. Analytical methods are required to confirm whether foods are gluten-free, but current immunoassay-based methods can be unreliable. Proteomic methods offer an alternative but require comprehensive and well-annotated sequence databases, which are lacking for gluten. A manually curated database (GluPro V1.0) of gluten proteins, comprising 630 discrete unique full-length protein sequences, has been compiled. It is representative of the different types of gliadin and glutenin components found in gluten. An in silico comparison of their coeliac toxicity was undertaken by analysing the distribution of coeliac toxic motifs. This demonstrated that whilst the α-gliadin proteins contained more toxic motifs, these were distributed across all gluten protein sub-types. Comparison of annotations observed using a discovery proteomics dataset acquired using ion mobility MS/MS showed that more reliable identifications were obtained using the GluPro V1.0 database compared to the complete reviewed Viridiplantae database. This highlights the value of a curated sequence database specifically designed to support proteomic workflows and the development of methods to detect and quantify gluten. We have constructed the first manually curated open-source wheat gluten protein sequence database (GluPro V1.0) in FASTA format to support the application of proteomic methods for gluten protein detection and quantification. We have also analysed the manually verified sequences to give the first comprehensive overview of the distribution of sequences able to elicit a reaction in coeliac disease, the prevalent form of gluten intolerance. 
Provision of this database will improve the reliability of gluten protein identification by proteomic analysis, and aid the development of targeted mass spectrometry methods in line with Codex Alimentarius Commission requirements for foods designed to meet the needs of gluten intolerant individuals. Copyright © 2017. Published by Elsevier B.V.
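The motif-distribution analysis described in this abstract can be illustrated with a minimal sketch. The abstract does not list the motifs or describe the counting procedure, so the motif strings, toy sequences, and function names below are hypothetical placeholders; they are not taken from GluPro V1.0. The sketch counts overlapping motif occurrences and sums them per protein sub-type.

```python
from collections import defaultdict


def count_motif(sequence, motif):
    """Count occurrences of motif in sequence, allowing overlaps."""
    count = 0
    pos = sequence.find(motif)
    while pos != -1:
        count += 1
        pos = sequence.find(motif, pos + 1)
    return count


def motif_distribution(entries, motifs):
    """Sum motif counts per protein sub-type.

    entries: iterable of (sub_type, sequence) pairs.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for sub_type, sequence in entries:
        for motif in motifs:
            totals[sub_type][motif] += count_motif(sequence, motif)
    return {sub_type: dict(counts) for sub_type, counts in totals.items()}


# Placeholder motifs and toy sequences (hypothetical, for illustration only).
MOTIFS = ["PQPQ", "QLPY"]
ENTRIES = [
    ("alpha-gliadin", "PQPQPQLPY"),
    ("gamma-gliadin", "AAPQPQAA"),
    ("gamma-gliadin", "QLPYQLPY"),
]

distribution = motif_distribution(ENTRIES, MOTIFS)
```

In practice the entries would be read from the FASTA file and the sub-type taken from the curated annotation; how GluPro encodes that annotation is not stated here, so it is left out of the sketch.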
Biomarker Discovery and Mechanistic Studies of Prostate Cancer Using Targeted Proteomic Approaches
2010-07-01
Report documentation page (fragmentary): ...1-0431; title: Biomarker Discovery and Mechanistic Studies of Prostate Cancer Using Targeted Proteomic Approaches; principal investigator: ...; ...June 2010; contract number: ...1-0430; W81XWH-08-1-0431. Grant sponsor: NIH/NCRR COBRE Grant; grant number: 1P20RR020171. Grant sponsor: NIH/NIDDK Grant; grant number: R01DK053525.
Proteome-wide Subcellular Topologies of E. coli Polypeptides Database (STEPdb)
Orfanoudaki, Georgia; Economou, Anastassios
2014-01-01
Cell compartmentalization serves both the isolation and the specialization of cell functions. After synthesis in the cytoplasm, over a third of all proteins are targeted to other subcellular compartments. Knowing how proteins are distributed within the cell and how they interact is a prerequisite for understanding it as a whole. Surface and secreted proteins are important pathogenicity determinants. Here we present the STEP database (STEPdb) that contains a comprehensive characterization of subcellular localization and topology of the complete proteome of Escherichia coli. Two widely used E. coli proteomes (K-12 and BL21) are presented organized into thirteen subcellular classes. STEPdb exploits the wealth of genetic, proteomic, biochemical, and functional information on protein localization, secretion, and targeting in E. coli, one of the best understood model organisms. Subcellular annotations were derived from a combination of bioinformatics prediction, proteomic, biochemical, functional, topological data and extensive literature re-examination that were refined through manual curation. Strong experimental support for the location of 1553 out of 4303 proteins was based on 426 articles and some experimental indications for another 526. Annotations were provided for another 320 proteins based on firm bioinformatic predictions. STEPdb is the first database that contains an extensive set of peripheral IM proteins (PIM proteins) and includes their graphical visualization into complexes, cellular functions, and interactions. It also summarizes all currently known protein export machineries of E. coli K-12 and pairs them, where available, with the secretory proteins that use them. It catalogs the Sec- and TAT-utilizing secretomes and summarizes their topological features such as signal peptides and transmembrane regions, transmembrane topologies and orientations. 
It also catalogs physicochemical and structural features that influence topology such as abundance, solubility, disorder, heat resistance, and structural domain families. Finally, STEPdb incorporates prediction tools for topology (TMHMM, SignalP, and Phobius) and disorder (IUPred) and implements the BLAST2STEP that performs protein homology searches against the STEPdb. PMID:25210196
Ghorab, Hamida; Lammi, Carmen; Arnoldi, Anna; Kabouche, Zahia; Aiello, Gilda
2018-01-15
An investigation of the proteome of the sweet kernel of apricot, based on equalisation with combinatorial peptide ligand libraries (CPLLs), SDS-PAGE, nLC-ESI-MS/MS, and database search, permitted the identification of 175 proteins. Gene ontology analysis indicated that their main molecular functions are nucleotide binding (20.9%), hydrolase activities (10.6%), kinase activities (7%), and catalytic activity (5.6%). A protein-protein association network analysis using STRING software made it possible to build an interactomic map of all detected proteins, characterised by 34 interactions. In order to forecast the potential health benefits deriving from the consumption of these proteins, the two most abundant, i.e. Prunin 1 and 2, were enzymatically digested in silico, predicting 10 and 14 peptides, respectively. Searching their sequences in the BIOPEP database suggested a variety of bioactivities, including dipeptidyl peptidase-IV (DPP-IV) and angiotensin converting enzyme I (ACE) inhibition, glucose uptake stimulation and antioxidant properties. Copyright © 2017 Elsevier Ltd. All rights reserved.
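The in silico digestion step mentioned in this abstract can be sketched as follows. The abstract does not name the enzyme, so this example assumes a trypsin-like rule (cleave after K or R unless the next residue is P); the function name `digest` and the toy sequence are hypothetical and are not Prunin sequences.

```python
def digest(sequence, min_length=1):
    """Split a protein sequence with a trypsin-like rule (an assumption here).

    Cleaves C-terminal to K or R, except when the following residue is P.
    Peptides shorter than min_length are discarded.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    # Keep the C-terminal remainder if the sequence does not end in K or R.
    if start < len(sequence):
        peptides.append(sequence[start:])
    return [p for p in peptides if len(p) >= min_length]


toy_sequence = "MAKRPLGKTTR"  # hypothetical sequence for illustration
peptides = digest(toy_sequence)
```

The resulting peptide strings are the kind of input one would then look up in a bioactive-peptide database such as BIOPEP; that lookup is not modelled here.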
Wheat proteomics: proteome modulation and abiotic stress acclimation
Komatsu, Setsuko; Kamal, Abu H. M.; Hossain, Zahed
2014-01-01
Cellular mechanisms of stress sensing and signaling represent the initial plant responses to adverse conditions. The development of high-throughput “Omics” techniques has initiated a new era in the study of plant molecular strategies for adapting to environmental changes. However, the elucidation of stress adaptation mechanisms in plants requires the accurate isolation and characterization of stress-responsive proteins. Because the functional part of the genome, namely the proteins and their post-translational modifications, is critical for plant stress responses, proteomic studies provide comprehensive information about the fine-tuning of cellular pathways that are primarily involved in stress mitigation. This review summarizes the major proteomic findings related to alterations in the wheat proteomic profile in response to abiotic stresses. Moreover, the strengths and weaknesses of different sample preparation techniques, including subcellular protein extraction protocols, are discussed in detail. The continued development of proteomic approaches in combination with rapidly evolving bioinformatics tools and interactive databases will facilitate understanding of the plant mechanisms underlying stress tolerance. PMID:25538718
Approaches for Defining the Hsp90-dependent Proteome
Hartson, Steven D.; Matts, Robert L.
2011-01-01
Hsp90 is the target of ongoing drug discovery studies seeking new compounds to treat cancer, neurodegenerative diseases, and protein folding disorders. To better understand Hsp90’s roles in cellular pathologies and in normal cells, numerous studies have utilized proteomics assays and related high-throughput tools to characterize its physical and functional protein partnerships. This review surveys these studies, and summarizes the strengths and limitations of the individual attacks. We also include downloadable spreadsheets compiling all of the Hsp90-interacting proteins identified in more than 23 studies. These tools include cross-references among gene aliases, human homologues of yeast Hsp90-interacting proteins, hyperlinks to database entries, summaries of canonical pathways that are enriched in the Hsp90 interactome, and additional bioinformatic annotations. In addition to summarizing Hsp90 proteomics studies performed to date and the insights they have provided, we identify gaps in our current understanding of Hsp90-mediated proteostasis. PMID:21906632
Content Is King: Databases Preserve the Collective Information of Science.
Yates, John R
2018-04-01
Databases store sequence information experimentally gathered to create resources that further science. In the last 20 years databases have become critical components of fields like proteomics where they provide the basis for large-scale and high-throughput proteomic informatics. Amos Bairoch, winner of the Association of Biomolecular Resource Facilities Frederick Sanger Award, has created some of the important databases proteomic research depends upon for accurate interpretation of data.
2011-01-01
Background Despite the successful eradication of smallpox by the WHO-led vaccination programme, pox virus infections remain a considerable health threat. The possible use of smallpox as a bioterrorism agent as well as the continuous occurrence of zoonotic pox virus infections underscore the need for a deeper understanding of virus-host interactions. Since the permissiveness of pox infections is independent of host surface receptors, but correlates with the ability of the virus to infiltrate the antiviral host response, it directly depends on the host's proteome. In this report the proteome of HEK293 cells infected with Vaccinia Virus strain IHD-W was analyzed by 2-dimensional gel electrophoresis and MALDI-PSD-TOF MS in a bottom-up approach. Results The cellular and viral proteomes of VACV IHD-W infected HEK293 cells, UV-inactivated VACV IHD-W-treated cells, and non-infected cells were compared. Derivatization of peptides with 4-sulfophenyl isothiocyanate (SPITC) carried out on ZipTipμ-C18 columns enabled protein identification via the peptides' primary sequence, providing improved s/n ratios as well as signal intensities of the PSD spectra. The expression of more than 24 human proteins was modulated by the viral infection. Effects of UV-inactivated and infectious viruses on the host proteome concerning energy metabolism and proteins associated with gene expression and protein biosynthesis were quite similar. These effects might therefore be attributed to virus entry and virion proteins. However, the modulation of proteins involved in apoptosis was clearly correlated with infectious viruses. Conclusions The proteome analysis of infected cells provides insight into apoptosis modulation, regulation of cellular gene expression and the regulation of energy metabolism. The confidence of protein identifications was clearly improved by derivatization of the peptides with SPITC on a solid phase support. 
Some of the identified proteins have not been described in the context of poxvirus infections before and need to be further characterised to establish their significance for apoptosis modulation and pathogenesis. PMID:21806805
Genes2Networks: connecting lists of gene symbols using mammalian protein interactions databases.
Berger, Seth I; Posner, Jeremy M; Ma'ayan, Avi
2007-10-04
In recent years, mammalian protein-protein interaction network databases have been developed. The interactions in these databases are either extracted manually from low-throughput experimental biomedical research literature, extracted automatically from literature using techniques such as natural language processing (NLP), generated experimentally using high-throughput methods such as yeast-2-hybrid screens, or predicted using an assortment of computational approaches. Genes or proteins identified as significantly changing in proteomic experiments, or identified as susceptibility disease genes in genomic studies, can be placed in the context of protein interaction networks in order to assign these genes and proteins to pathways and protein complexes. Genes2Networks is a software system that integrates the content of ten mammalian interaction network datasets. Filtering techniques to prune low-confidence interactions were implemented. Genes2Networks is delivered as a web-based service using AJAX. The system can be used to extract relevant subnetworks created from "seed" lists of human Entrez gene symbols. The output includes a dynamic, linkable three-color web-based network map, with a statistical analysis report that identifies significant intermediate nodes used to connect the seed list. Genes2Networks is powerful web-based software that can help experimental biologists to interpret lists of genes and proteins such as those commonly produced through genomic and proteomic experiments, as well as lists of genes and proteins associated with disease processes. This system can be used to find relationships between genes and proteins from seed lists, and predict additional genes or proteins that may play key roles in common pathways or protein complexes.
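The idea of connecting a seed list through intermediate nodes can be sketched in a few lines. This is a simplified illustration, not the Genes2Networks algorithm: it only reports direct seed-seed interactions and non-seed nodes that interact with at least two seeds, and it omits the confidence filtering and statistical scoring the abstract mentions. All gene names and function names are hypothetical.

```python
from collections import defaultdict


def build_graph(interactions):
    """Build an undirected adjacency map from (a, b) interaction pairs."""
    graph = defaultdict(set)
    for a, b in interactions:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def connect_seeds(interactions, seeds):
    """Return direct seed-seed edges and intermediates linking two or more seeds."""
    graph = build_graph(interactions)
    seeds = set(seeds)
    direct = {
        tuple(sorted((a, b)))
        for a in seeds
        for b in graph.get(a, set())
        if b in seeds and a != b
    }
    intermediates = {}
    for node, neighbours in graph.items():
        if node in seeds:
            continue
        linked = sorted(neighbours & seeds)
        if len(linked) >= 2:
            intermediates[node] = linked
    return direct, intermediates


# Toy interaction list with placeholder gene names (hypothetical).
INTERACTIONS = [
    ("GENE_A", "GENE_X"),
    ("GENE_B", "GENE_X"),
    ("GENE_A", "GENE_B"),
    ("GENE_C", "GENE_Y"),
    ("GENE_B", "GENE_Z"),
]
direct, intermediates = connect_seeds(INTERACTIONS, ["GENE_A", "GENE_B", "GENE_C"])
```

A fuller treatment would rank each intermediate by how unlikely its number of seed links is given its overall degree in the background network, which is the role of the statistical report described in the abstract.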
Tiered Human Integrated Sequence Search Databases for Shotgun Proteomics
Deutsch, Eric W.; Sun, Zhi; Campbell, David S.; Binz, Pierre-Alain; Farrah, Terry; Shteynberg, David; Mendoza, Luis; Omenn, Gilbert S.; Moritz, Robert L.
2016-11-04
The results of analysis of shotgun proteomics mass spectrometry data can be greatly affected by the selection of the reference protein sequence database against which the spectra are matched. For many species there are multiple sources from which somewhat different sequence sets can be obtained. This can lead to confusion about which database is best in which circumstances, a problem especially acute in human sample analysis. All sequence databases are genome-based, with sequences for the predicted gene and their protein translation products compiled. Our goal is to create a set of primary sequence databases that comprise the union of sequences from many of the different available sources and make the result easily available to the community. We have compiled a set of four sequence databases of varying sizes, from a small database consisting of only the ~20,000 primary isoforms plus contaminants to a very large database that includes almost all non-redundant protein sequences from several sources. This set of tiered, increasingly complete human protein sequence databases suitable for mass spectrometry proteomics sequence database searching is called the Tiered Human Integrated Search Proteome set. In order to evaluate the utility of these databases, we have analyzed two different data sets, one from the HeLa cell line and the other from normal human liver tissue, with each of the four tiers of database complexity. The result is that approximately 0.8%, 1.1%, and 1.5% additional peptides can be identified for Tiers 2, 3, and 4, respectively, as compared with the Tier 1 database, at substantially increasing computational cost. This increase in computational cost may be worth bearing if the identification of sequence variants or the discovery of sequences that are not present in the reviewed knowledge base entries is an important goal of the study. We find that it is useful to search a data set against a simpler database, and then check the uniqueness of the discovered peptides against a more complex database. We have set up an automated system that downloads all the source databases on the first of each month and automatically generates a new set of search databases and makes them available for download at http://www.peptideatlas.org/thisp/. PMID:27577934
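The recommendation to search against a simpler database and then check peptide uniqueness against a more complex one can be sketched as follows. This is an illustrative simplification under stated assumptions: databases are plain dictionaries of protein name to sequence, matching is exact substring search, and isoleucine/leucine ambiguity is ignored. The protein names, sequences, and function names are hypothetical.

```python
def proteins_containing(peptide, database):
    """Return sorted names of all proteins whose sequence contains the peptide."""
    return sorted(name for name, seq in database.items() if peptide in seq)


def uniqueness_report(peptides, small_db, large_db):
    """Map each peptide to the proteins it matches in each database tier."""
    return {
        pep: {
            "small": proteins_containing(pep, small_db),
            "large": proteins_containing(pep, large_db),
        }
        for pep in peptides
    }


def still_unique(report):
    """Peptides that map to exactly one protein in the larger database."""
    return [pep for pep, hits in report.items() if len(hits["large"]) == 1]


# Toy tiers (hypothetical): the large tier adds a variant and an extra protein.
SMALL_DB = {"P1": "MKTAYIAKQR", "P2": "GSHMLEDPVR"}
LARGE_DB = dict(SMALL_DB, P1_variant="MKTAYIAKQRLL", P3="AAAGGGTTT")

report = uniqueness_report(["TAYIAK", "LEDPVR"], SMALL_DB, LARGE_DB)
unique_peptides = still_unique(report)
```

Here `TAYIAK` looks unique in the small tier but also matches the variant entry in the large tier, which is the kind of case the second check is meant to catch.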
Proteomic and Bioinformatic Profile of Primary Human Oral Epithelial Cells
Ghosh, Santosh K.; Yohannes, Elizabeth; Bebek, Gurkan; Weinberg, Aaron; Jiang, Bin; Willard, Belinda; Chance, Mark R.; Kinter, Michael T.; McCormick, Thomas S.
2012-01-01
Wounding of the oral mucosa occurs frequently in a highly septic environment. Remarkably, these wounds heal quickly and the oral cavity, for the most part, remains healthy. Deciphering the normal human oral epithelial cell (NHOEC) proteome is critical for understanding the mechanism(s) of protection elicited when the mucosal barrier is intact, as well as when it is breached. Combining 2D gel electrophoresis with shotgun proteomics resulted in identification of 1662 NHOEC proteins. Proteome annotations were performed based on protein classes, molecular functions, disease association and membership in canonical and metabolic signaling pathways. Comparing the NHOEC proteome with a database of innate immunity-relevant interactions (InnateDB) identified 64 common proteins associated with innate immunity. Comparison with published salivary proteomes revealed that 738/1662 NHOEC proteins were common, suggesting that significant numbers of salivary proteins are of epithelial origin. Gene ontology analysis showed similarities in the distributions of NHOEC and saliva proteomes with regard to biological processes, and molecular functions. We also assessed the inter-individual variability of the NHOEC proteome and observed it to be comparable with other primary cells. The baseline proteome described in this study should serve as a resource for proteome studies of the oral mucosa, especially in relation to disease processes. PMID:23035736
Proteomics in the investigation of HIV-1 interactions with host proteins.
Li, Ming
2015-02-01
Productive HIV-1 infection depends on host machinery, including a broad array of cellular proteins. Proteomics has played a significant role in the discovery of HIV-1 host proteins. In this review, after a brief survey of the HIV-1 host proteins that were discovered by proteomic analyses, I focus on analyzing the interactions between the virion and host proteins, as well as the technologies and strategies used in those proteomic studies. With the help of proteomics, the identification and characterization of HIV-1 host proteins can be translated into novel antiretroviral therapeutics. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
HTAPP: High-Throughput Autonomous Proteomic Pipeline
Yu, Kebing; Salomon, Arthur R.
2011-01-01
Recent advances in the speed and sensitivity of mass spectrometers and in analytical methods, the exponential acceleration of computer processing speeds, and the availability of genomic databases from an array of species and protein information databases have led to a deluge of proteomic data. The development of a lab-based automated proteomic software platform for the automated collection, processing, storage, and visualization of expansive proteomic datasets is critically important. The high-throughput autonomous proteomic pipeline (HTAPP) described here is designed from the ground up to provide critically important flexibility for diverse proteomic workflows and to streamline the total analysis of a complex proteomic sample. This tool is comprised of software that controls the acquisition of mass spectral data along with automation of post-acquisition tasks such as peptide quantification, clustered MS/MS spectral database searching, statistical validation, and data exploration within a user-configurable lab-based relational database. The software design of HTAPP focuses on accommodating diverse workflows and providing missing software functionality to a wide range of proteomic researchers to accelerate the extraction of biological meaning from immense proteomic data sets. Although individual software modules in our integrated technology platform may have some similarities to existing tools, the true novelty of the approach described here is in the synergistic and flexible combination of these tools to provide an integrated and efficient analysis of proteomic samples. PMID:20336676
A comprehensive and scalable database search system for metaproteomics.
Chatterjee, Sandip; Stupp, Gregory S; Park, Sung Kyu Robin; Ducom, Jean-Christophe; Yates, John R; Su, Andrew I; Wolan, Dennis W
2016-08-16
Mass spectrometry-based shotgun proteomics experiments rely on accurate matching of experimental spectra against a database of protein sequences. Existing computational analysis methods are limited in the size of their sequence databases, which severely restricts the proteomic sequencing depth and functional analysis of highly complex samples. The growing amount of public high-throughput sequencing data will only exacerbate this problem. We designed a broadly applicable metaproteomic analysis method (ComPIL) that addresses protein database size limitations. Our approach to overcome this significant limitation in metaproteomics was to design a scalable set of sequence databases assembled for optimal library querying speeds. ComPIL was integrated with a modified version of the search engine ProLuCID (termed "Blazmass") to permit rapid matching of experimental spectra. Proof-of-principle analysis of human HEK293 lysate with a ComPIL database derived from high-quality genomic libraries was able to detect nearly all of the same peptides as a search with a human database (~500x fewer peptides in the database), with a small reduction in sensitivity. We were also able to detect proteins from the adenovirus used to immortalize these cells. We applied our method to a set of healthy human gut microbiome proteomic samples and showed a substantial increase in the number of identified peptides and proteins compared to previous metaproteomic analyses, while retaining a high degree of protein identification accuracy and allowing for a more in-depth characterization of the functional landscape of the samples. The combination of ComPIL with Blazmass allows proteomic searches to be performed with database sizes much larger than previously possible. These large database searches can be applied to complex meta-samples with unknown composition or proteomic samples where unexpected proteins may be identified. 
The protein database, proteomic search engine, and the proteomic data files for the 5 microbiome samples characterized and discussed herein are open source and available for use and additional analysis.
Yu, Kebing; Salomon, Arthur R
2009-12-01
Recently, dramatic progress has been achieved in expanding the sensitivity, resolution, mass accuracy, and scan rate of mass spectrometers able to fragment and identify peptides through MS/MS. Unfortunately, this enhanced ability to acquire proteomic data has not been accompanied by a concomitant increase in the availability of flexible tools allowing users to rapidly assimilate, explore, and analyze this data and adapt to various experimental workflows with minimal user intervention. Here we fill this critical gap by providing a flexible relational database called PeptideDepot for organization of expansive proteomic data sets, collation of proteomic data with available protein information resources, and visual comparison of multiple quantitative proteomic experiments. Our software design, built upon the synergistic combination of a MySQL database for safe warehousing of proteomic data with a FileMaker-driven graphical user interface for flexible adaptation to diverse workflows, enables proteomic end-users to directly tailor the presentation of proteomic data to the unique analysis requirements of the individual proteomics lab. PeptideDepot may be deployed as an independent software tool or integrated directly with our high throughput autonomous proteomic pipeline used in the automated acquisition and post-acquisition analysis of proteomic data.
Cell death proteomics database: consolidating proteomics data on cell death.
Arntzen, Magnus Ø; Bull, Vibeke H; Thiede, Bernd
2013-05-03
Programmed cell death is a ubiquitous process of utmost importance for the development and maintenance of multicellular organisms. More than 10 different forms of programmed cell death have been discovered. Several proteomics analyses have been performed to gain insight into the proteins involved in the different forms of programmed cell death. To consolidate these studies, we have developed the cell death proteomics (CDP) database, which comprises data from apoptosis, autophagy, cytotoxic granule-mediated cell death, excitotoxicity, mitotic catastrophe, paraptosis, pyroptosis, and Wallerian degeneration. The CDP database is available as a web-based database to compare protein identifications and quantitative information across different experimental setups. The proteomics data of 73 publications were integrated and unified with protein annotations from UniProt-KB and gene ontology (GO). Currently, more than 6,500 records of more than 3,700 proteins are included in the CDP. Comparing apoptosis and autophagy using overrepresentation analysis of GO terms, the majority of enriched processes were found in both, but some clear differences were also observed. Furthermore, the analysis revealed differences and similarities of the proteome between autophagosomal and overall autophagy. The CDP database represents a useful tool to consolidate data from proteome analyses of programmed cell death and is available at http://celldeathproteomics.uio.no.
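Overrepresentation analysis of GO terms is commonly done with a hypergeometric (one-sided Fisher) test. The abstract does not state which test was used, so the following is a minimal sketch under that assumption; the function name and the example counts are hypothetical. It computes the probability of seeing at least `k` annotated proteins in a list of `n` drawn from a background of `N` proteins, `K` of which carry the term.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) for a hypergeometric draw.

    k: annotated proteins observed in the study list
    K: annotated proteins in the background
    n: size of the study list
    N: size of the background
    """
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible terms drop out of the sum automatically.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Toy numbers (hypothetical): 3 of 10 background proteins carry the term,
# and 2 of the 4 proteins in the study list do.
p_value = hypergeom_upper_tail(k=2, K=3, n=4, N=10)
```

When many GO terms are tested at once, the resulting p-values would normally be corrected for multiple testing; that step is left out of the sketch.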
mzResults: An Interactive Viewer for Interrogation and Distribution of Proteomics Results
Webber, James T.; Askenazi, Manor; Marto, Jarrod A.
2011-01-01
The growing use of mass spectrometry in the context of biomedical research has been accompanied by an increased demand for distribution of results in a format that facilitates rapid and efficient validation of claims by reviewers and other interested parties. However, the continued evolution of mass spectrometry hardware, sample preparation methods, and peptide identification algorithms complicates standardization and creates hurdles related to compliance with journal submission requirements. Moreover, the recently announced Philadelphia Guidelines (1, 2) suggest that authors provide native mass spectrometry data files in support of their peer-reviewed research articles. These trends highlight the need for data viewers and other tools that work independently of manufacturers' proprietary data systems and seamlessly connect proteomics results with original data files to support user-driven data validation and review. Based upon our recently described API-based framework for mass spectrometry data analysis (3, 4), we created an interactive viewer (mzResults) that is built on established database standards and enables efficient distribution and interrogation of results associated with proteomics experiments, while also providing a convenient mechanism for authors to comply with data submission standards as described in the Philadelphia Guidelines. In addition, the architecture of mzResults supports in-depth queries of the native mass spectrometry files through our multiplierz software environment. We use phosphoproteomics data to illustrate the features and capabilities of mzResults. PMID:21266631
Analysis of high accuracy, quantitative proteomics data in the MaxQB database.
Schaab, Christoph; Geiger, Tamar; Stoehr, Gabriele; Cox, Juergen; Mann, Matthias
2012-03-01
MS-based proteomics generates rapidly increasing amounts of precise and quantitative information. Analysis of individual proteomic experiments has made great strides, but the crucial ability to compare and store information across different proteome measurements still presents many challenges. For example, it has been difficult to avoid contamination of databases with low quality peptide identifications, to control for the inflation in false positive identifications when combining data sets, and to integrate quantitative data. Although, for example, the contamination with low quality identifications has been addressed by joint analysis of deposited raw data in some public repositories, we reasoned that there should be a role for a database specifically designed for high resolution and quantitative data. Here we describe a novel database termed MaxQB that stores and displays collections of large proteomics projects and allows joint analysis and comparison. We demonstrate the analysis tools of MaxQB using proteome data of 11 different human cell lines and 28 mouse tissues. The database-wide false discovery rate is controlled by adjusting the project specific cutoff scores for the combined data sets. The 11 cell line proteomes together identify proteins expressed from more than half of all human genes. For each protein of interest, expression levels estimated by label-free quantification can be visualized across the cell lines. Similarly, the expression rank order and estimated amount of each protein within each proteome are plotted. We used MaxQB to calculate the signal reproducibility of the detected peptides for the same proteins across different proteomes. Spearman rank correlation between peptide intensity and detection probability of identified proteins was greater than 0.8 for 64% of the proteome, whereas a minority of proteins have negative correlation. 
This information can be used to pinpoint false protein identifications, independently of peptide database scores. The information contained in MaxQB, including high resolution fragment spectra, is accessible to the community via a user-friendly web interface at http://www.biochem.mpg.de/maxqb.
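The Spearman rank correlation quoted in this abstract is the Pearson correlation computed on ranks. The sketch below shows that calculation with average ranks for ties; it is an illustration only, not MaxQB code, and the function names and example values are hypothetical.

```python
def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

For one protein, `x` could hold the intensities of its peptides and `y` the fraction of proteomes in which each peptide was detected; a value near 1 means the most intense peptides are also the most consistently detected ones.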
A systems biology-led insight into the role of the proteome in neurodegenerative diseases.
Fasano, Mauro; Monti, Chiara; Alberio, Tiziana
2016-09-01
Multifactorial disorders are the result of nonlinear interactions of several factors; therefore, a reductionist approach does not appear to be appropriate. Proteomics is a global approach that can be efficiently used to investigate pathogenetic mechanisms of neurodegenerative diseases. Here, we report a general introduction about the systems biology approach and mechanistic insights recently obtained by over-representation analysis of proteomics data of cellular and animal models of Alzheimer's disease, Parkinson's disease and other neurodegenerative disorders, as well as of affected human tissues. Expert commentary: As an inductive method, proteomics is based on unbiased observations that further require validation of generated hypotheses. Pathway databases and over-representation analysis tools allow researchers to assign an expectation value to pathogenetic mechanisms linked to neurodegenerative diseases. The systems biology approach based on omics data may be the key to unravel the complex mechanisms underlying neurodegeneration.
A Bioinformatics Workflow for Variant Peptide Detection in Shotgun Proteomics
Li, Jing; Su, Zengliu; Ma, Ze-Qiang; Slebos, Robbert J. C.; Halvey, Patrick; Tabb, David L.; Liebler, Daniel C.; Pao, William; Zhang, Bing
2011-01-01
Shotgun proteomics data analysis usually relies on database search. However, commonly used protein sequence databases do not contain information on protein variants and thus prevent variant peptides and proteins from being identified. Including known coding variations in protein sequence databases could help alleviate this problem. Based on our recently published human Cancer Proteome Variation Database, we have created a protein sequence database that comprehensively annotates thousands of cancer-related coding variants collected in the Cancer Proteome Variation Database as well as noncancer-specific ones from the Single Nucleotide Polymorphism Database (dbSNP). Using this database, we then developed a data analysis workflow for variant peptide identification in shotgun proteomics. The high risk of false positive variant identifications was addressed by a modified false discovery rate estimation method. Analysis of colorectal cancer cell lines SW480, RKO, and HCT-116 revealed a total of 81 peptides that contain either noncancer-specific or cancer-related variations. Twenty-three out of 26 variants randomly selected from the 81 were confirmed by genomic sequencing. We further applied the workflow to data sets from three individual colorectal tumor specimens. A total of 204 distinct variant peptides were detected, and five carried known cancer-related mutations. Each individual showed a specific pattern of cancer-related mutations, suggesting potential use of this type of information for personalized medicine. Compatibility of the workflow has been tested with four popular database search engines including Sequest, Mascot, X!Tandem, and MyriMatch. In summary, we have developed a workflow that effectively uses existing genomic data to enable variant peptide detection in proteomics. PMID:21389108
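To illustrate why coding variants must be present in the search database, the sketch below applies a single amino acid substitution to a protein sequence and lists the tryptic peptides that exist only in the variant form. It is a simplified illustration under stated assumptions (cleavage after K or R unless followed by P, no missed cleavages), not the authors' workflow; the sequence and function names are hypothetical.

```python
def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def apply_substitution(sequence, position, ref, alt):
    """Replace the residue at a 1-based position, checking the reference residue."""
    if sequence[position - 1] != ref:
        raise ValueError("reference residue does not match the sequence")
    return sequence[:position - 1] + alt + sequence[position:]


def variant_peptides(sequence, position, ref, alt):
    """Return tryptic peptides present in the variant but not in the reference."""
    reference = set(tryptic_digest(sequence))
    mutated = tryptic_digest(apply_substitution(sequence, position, ref, alt))
    return [peptide for peptide in mutated if peptide not in reference]
```

For the made-up sequence `MAGKTEYKLVR`, `variant_peptides("MAGKTEYKLVR", 3, "G", "D")` returns `["MADK"]`, a peptide that a search against the reference sequence alone could not match. A substitution that creates or removes K or R changes the cleavage pattern and can yield more than one new peptide.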
Park, Gun Wook; Hwang, Heeyoun; Kim, Kwang Hoe; Lee, Ju Yeon; Lee, Hyun Kyoung; Park, Ji Yeong; Ji, Eun Sun; Park, Sung-Kyu Robin; Yates, John R; Kwon, Kyung-Hoon; Park, Young Mok; Lee, Hyoung-Joo; Paik, Young-Ki; Kim, Jin Young; Yoo, Jong Shin
2016-11-04
In the Chromosome-Centric Human Proteome Project (C-HPP), false-positive identification by peptide spectrum matches (PSMs) after database searches is a major issue for proteogenomic studies using liquid-chromatography and mass-spectrometry-based large proteomic profiling. Here we developed a simple strategy for protein identification, with a controlled false discovery rate (FDR) at the protein level, using an integrated proteomic pipeline (IPP) that consists of four engrailed steps as follows. First, using three different search engines, SEQUEST, MASCOT, and MS-GF+, individual proteomic searches were performed against the neXtProt database. Second, the search results from the PSMs were combined using statistical evaluation tools including DTASelect and Percolator. Third, the peptide search scores were converted into E-scores normalized using an in-house program. Last, ProteinInferencer was used to filter the proteins containing two or more peptides with a controlled FDR of 1.0% at the protein level. Finally, we compared the performance of the IPP to a conventional proteomic pipeline (CPP) for protein identification using a controlled FDR of <1% at the protein level. Using the IPP, a total of 5756 proteins (vs 4453 using the CPP) including 477 alternative splicing variants (vs 182 using the CPP) were identified from human hippocampal tissue. In addition, a total of 10 missing proteins (vs 7 using the CPP) were identified with two or more unique peptides, and their tryptic peptides were validated using MS/MS spectral pattern from a repository database or their corresponding synthetic peptides. This study shows that the IPP effectively improved the identification of proteins, including alternative splicing variants and missing proteins, in human hippocampal tissues for the C-HPP. All RAW files used in this study were deposited in ProteomeXchange (PXD000395).
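The final filtering step described above (proteins with two or more peptides, FDR of 1% at the protein level) can be illustrated with a generic target-decoy calculation. The abstract does not say how ProteinInferencer estimates FDR, so the decoy/target ratio used here is an assumption, and the tuple layout, function name, and example entries are hypothetical.

```python
def filter_proteins(proteins, max_fdr=0.01, min_peptides=2):
    """Keep target proteins above the lowest score that satisfies the FDR limit.

    proteins: iterable of (accession, score, n_peptides, is_decoy) tuples,
              where a higher score means a more confident identification.
    FDR at a given rank is estimated as decoys / targets among the entries
    ranked at or above it (an assumed, commonly used estimator).
    """
    eligible = sorted(
        (p for p in proteins if p[2] >= min_peptides),
        key=lambda p: p[1],
        reverse=True,
    )
    targets = decoys = 0
    cutoff = 0  # number of top-ranked entries to keep
    for rank, (_, _, _, is_decoy) in enumerate(eligible, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = rank
    return [p[0] for p in eligible[:cutoff] if not p[3]]
```

With the hypothetical entries `("P1", 10, 3, False)`, `("P2", 9, 2, False)`, `("D1", 8, 2, True)`, `("P3", 7, 1, False)` and `("P4", 6, 2, False)`, the default 1% limit keeps `P1` and `P2`: `P3` is dropped for having a single peptide, and every cutoff that includes the decoy `D1` exceeds the limit.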
Proteomics: Protein Identification Using Online Databases
ERIC Educational Resources Information Center
Eurich, Chris; Fields, Peter A.; Rice, Elizabeth
2012-01-01
Proteomics is an emerging area of systems biology that allows simultaneous study of thousands of proteins expressed in cells, tissues, or whole organisms. We have developed this activity to enable high school or college students to explore proteomic databases using mass spectrometry data files generated from yeast proteins in a college laboratory…
Arntzen, Magnus Ø.; Thiede, Bernd
2012-01-01
Apoptosis is the most commonly described form of programmed cell death, and its dysfunction is implicated in a large number of human diseases. Many quantitative proteome analyses of apoptosis have been performed to gain insight into the proteins involved in the process. This resulted in large and complex data sets that are difficult to evaluate. Therefore, we developed the ApoptoProteomics database for storage, browsing, and analysis of the outcome of large scale proteome analyses of apoptosis derived from human, mouse, and rat. The proteomics data of 52 publications were integrated and unified with protein annotations from UniProt-KB, the caspase substrate database homepage (CASBAH), and gene ontology. Currently, more than 2300 records of more than 1500 unique proteins are included, covering a large proportion of the core signaling pathways of apoptosis. Analysis of the data set revealed a high level of agreement between the directionality of changes reported in proteomics studies and the expected apoptosis-related function and may disclose proteins without a currently recognized involvement in apoptosis based on gene ontology. Comparison between induction of apoptosis by the intrinsic and the extrinsic apoptotic signaling pathway revealed slight differences. Furthermore, proteomics has significantly contributed to the field of apoptosis in identifying hundreds of caspase substrates. The database is available at http://apoptoproteomics.uio.no. PMID:22067098
TrSDB: a proteome database of transcription factors
Hermoso, Antoni; Aguilar, Daniel; Aviles, Francesc X.; Querol, Enrique
2004-01-01
TrSDB—TranScout Database—(http://ibb.uab.es/trsdb) is a proteome database of eukaryotic transcription factors based upon predicted motifs by TranScout and data sources such as InterPro and Gene Ontology Annotation. Nine eukaryotic proteomes are included in the current version. Extensive and diverse information for each database entry, different analyses considering TranScout classification and similarity relationships are offered for research on transcription factors or gene expression. PMID:14681387
Yu, Kebing; Salomon, Arthur R.
2010-01-01
Recently, dramatic progress has been achieved in expanding the sensitivity, resolution, mass accuracy, and scan rate of mass spectrometers able to fragment and identify peptides through tandem mass spectrometry (MS/MS). Unfortunately, this enhanced ability to acquire proteomic data has not been accompanied by a concomitant increase in the availability of flexible tools allowing users to rapidly assimilate, explore, and analyze this data and adapt to a variety of experimental workflows with minimal user intervention. Here we fill this critical gap by providing a flexible relational database called PeptideDepot for organization of expansive proteomic data sets, collation of proteomic data with available protein information resources, and visual comparison of multiple quantitative proteomic experiments. Our software design, built upon the synergistic combination of a MySQL database for safe warehousing of proteomic data with a FileMaker-driven graphical user interface for flexible adaptation to diverse workflows, enables proteomic end-users to directly tailor the presentation of proteomic data to the unique analysis requirements of the individual proteomics lab. PeptideDepot may be deployed as an independent software tool or integrated directly with our High Throughput Autonomous Proteomic Pipeline (HTAPP) used in the automated acquisition and post-acquisition analysis of proteomic data. PMID:19834895
Quan, Sheng; Yang, Pingfang; Cassin-Ross, Gaëlle; Kaur, Navneet; Switzenberg, Robert; Aung, Kyaw; Li, Jiying; Hu, Jianping
2013-01-01
Plant peroxisomes are highly dynamic organelles that mediate a suite of metabolic processes crucial to development. Peroxisomes in seeds/dark-grown seedlings and in photosynthetic tissues constitute two major subtypes of plant peroxisomes, which had been postulated to contain distinct primary biochemical properties. Multiple in-depth proteomic analyses had been performed on leaf peroxisomes, yet the major makeup of peroxisomes in seeds or dark-grown seedlings remained unclear. To compare the metabolic pathways of the two dominant plant peroxisomal subtypes and discover new peroxisomal proteins that function specifically during seed germination, we performed proteomic analysis of peroxisomes from etiolated Arabidopsis (Arabidopsis thaliana) seedlings. The detection of 77 peroxisomal proteins allowed us to perform comparative analysis with the peroxisomal proteome of green leaves, which revealed a large overlap between these two primary peroxisomal variants. Subcellular targeting analysis by fluorescence microscopy validated around 10 new peroxisomal proteins in Arabidopsis. Mutant analysis suggested the role of the cysteine protease RESPONSE TO DROUGHT21A-LIKE1 in β-oxidation, seed germination, and growth. This work provides a much-needed road map of a major type of plant peroxisome and has established a basis for future investigations of peroxisomal proteolytic processes to understand their roles in development and in plant interaction with the environment. PMID:24130194
Consolidation of proteomics data in the Cancer Proteomics database.
Arntzen, Magnus Ø; Boddie, Paul; Frick, Rahel; Koehler, Christian J; Thiede, Bernd
2015-11-01
Cancer is a class of diseases characterized by abnormal cell growth and one of the major causes of human death. Proteins are involved in the molecular mechanisms leading to cancer; furthermore, they are affected by anti-cancer drugs, and protein biomarkers can be used to diagnose certain cancer types. Therefore, it is important to explore the proteomics background of cancer. In this report, we developed the Cancer Proteomics database to re-interrogate published proteome studies investigating cancer. The database is divided into three sections related to cancer processes, cancer types, and anti-cancer drugs. Currently, the Cancer Proteomics database contains 9778 entries of 4118 proteins extracted from 143 scientific articles covering all three sections: cell death (cancer process), prostate cancer (cancer type) and platinum-based anti-cancer drugs including carboplatin, cisplatin, and oxaliplatin (anti-cancer drugs). The detailed information extracted from the literature includes basic information about the articles (e.g., PubMed ID, authors, journal name, publication year), information about the samples (type, study/reference, prognosis factor), and the proteomics workflow (subcellular fractionation, protein and peptide separation, mass spectrometry, quantification). Useful annotations such as hyperlinks to UniProt and PubMed were included. In addition, many filtering options were established as well as export functions. The database is freely available at http://cancerproteomics.uio.no.
Sheynkman, Gloria M.; Shortreed, Michael R.; Cesnik, Anthony J.; Smith, Lloyd M.
2016-01-01
Mass spectrometry–based proteomics has emerged as the leading method for detection, quantification, and characterization of proteins. Nearly all proteomic workflows rely on proteomic databases to identify peptides and proteins, but these databases typically contain a generic set of proteins that lack variations unique to a given sample, precluding their detection. Fortunately, proteogenomics enables the detection of such proteomic variations and can be defined, broadly, as the use of nucleotide sequences to generate candidate protein sequences for mass spectrometry database searching. Proteogenomics is experiencing heightened significance due to two developments: (a) advances in DNA sequencing technologies that have made complete sequencing of human genomes and transcriptomes routine, and (b) the unveiling of the tremendous complexity of the human proteome as expressed at the levels of genes, cells, tissues, individuals, and populations. We review here the field of human proteogenomics, with an emphasis on its history, current implementations, the types of proteomic variations it reveals, and several important applications. PMID:27049631
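The definition of proteogenomics given here, using nucleotide sequences to generate candidate protein sequences for database searching, can be sketched as a three-frame translation of one strand. This is a minimal illustration under stated assumptions (standard genetic code, forward strand only, uppercase or lowercase DNA input); the function names and the `min_length` parameter are hypothetical and do not come from the review.

```python
from itertools import product

# Standard genetic code, codons enumerated with base order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate one reading frame; unknown codons become 'X', stops '*'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )


def candidate_proteins(dna, min_length=2):
    """Collect stop-free segments from the three forward reading frames."""
    candidates = set()
    for frame in range(3):
        for segment in translate(dna, frame).split("*"):
            if len(segment) >= min_length:
                candidates.add(segment)
    return candidates
```

A full six-frame translation would repeat the same steps on the reverse complement. The resulting segments would then be written to a FASTA file and searched alongside the reference proteome, which is where false discovery control becomes important because the search space grows.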
Bowler, Russell P; Wendt, Chris H; Fessler, Michael B; Foster, Matthew W; Kelly, Rachel S; Lasky-Su, Jessica; Rogers, Angela J; Stringer, Kathleen A; Winston, Brent W
2017-12-01
This document presents the proceedings from the workshop entitled, "New Strategies and Challenges in Lung Proteomics and Metabolomics" held February 4th-5th, 2016, in Denver, Colorado. It was sponsored by the National Heart Lung Blood Institute, the American Thoracic Society, the Colorado Biological Mass Spectrometry Society, and National Jewish Health. The goal of this workshop was to convene, for the first time, relevant experts in lung proteomics and metabolomics to discuss and overcome specific challenges in these fields that are unique to the lung. The main objectives of this workshop were to identify, review, and/or understand: (1) emerging technologies in metabolomics and proteomics as applied to the study of the lung; (2) the unique composition and challenges of lung-specific biological specimens for metabolomic and proteomic analysis; (3) the diverse informatics approaches and databases unique to metabolomics and proteomics, with special emphasis on the lung; (4) integrative platforms across genetic and genomic databases that can be applied to lung-related metabolomic and proteomic studies; and (5) the clinical applications of proteomics and metabolomics. The major findings and conclusions of this workshop are summarized at the end of the report, and outline the progress and challenges that face these rapidly advancing fields.
Colangelo, Christopher M.; Shifman, Mark; Cheung, Kei-Hoi; Stone, Kathryn L.; Carriero, Nicholas J.; Gulcicek, Erol E.; Lam, TuKiet T.; Wu, Terence; Bjornson, Robert D.; Bruce, Can; Nairn, Angus C.; Rinehart, Jesse; Miller, Perry L.; Williams, Kenneth R.
2015-01-01
We report a significantly enhanced bioinformatics suite and database for proteomics research called Yale Protein Expression Database (YPED) that is used by investigators at more than 300 institutions worldwide. YPED meets the data management, archival, and analysis needs of high-throughput mass spectrometry-based proteomics research, ranging from a single laboratory or a group of laboratories within and beyond an institution to the entire proteomics community. The current version is a significant improvement over the first version in that it contains new modules for liquid chromatography–tandem mass spectrometry (LC–MS/MS) database search results, label and label-free quantitative proteomic analysis, and several scoring outputs for phosphopeptide site localization. In addition, we have added both peptide and protein comparative analysis tools to enable pairwise analysis of distinct peptides/proteins in each sample and of overlapping peptides/proteins between all samples in multiple datasets. We have also implemented a targeted proteomics module for automated multiple reaction monitoring (MRM)/selective reaction monitoring (SRM) assay development. We have linked YPED’s database search results and both label-based and label-free fold-change analysis to the Skyline Panorama repository for online spectra visualization. In addition, we have built enhanced functionality to curate peptide identifications into an MS/MS peptide spectral library for all of our protein database search identification results. PMID:25712262
The Escherichia coli Peripheral Inner Membrane Proteome
Papanastasiou, Malvina; Orfanoudaki, Georgia; Koukaki, Marina; Kountourakis, Nikos; Sardis, Marios Frantzeskos; Aivaliotis, Michalis; Karamanou, Spyridoula; Economou, Anastassios
2013-01-01
Biological membranes are essential for cell viability. Their functional characteristics strongly depend on their protein content, which consists of transmembrane (integral) and peripherally associated membrane proteins. Both integral and peripheral inner membrane proteins mediate a plethora of biological processes. Whereas transmembrane proteins have characteristic hydrophobic stretches and can be predicted using bioinformatics approaches, peripheral inner membrane proteins are hydrophilic, exist in equilibria with soluble pools, and carry no discernible membrane targeting signals. We experimentally determined the cytoplasmic peripheral inner membrane proteome of the model organism Escherichia coli using a multidisciplinary approach. Initially, we extensively re-annotated the theoretical proteome regarding subcellular localization using literature searches, manual curation, and multi-combinatorial bioinformatics searches of the available databases. Next we used sequential biochemical fractionations coupled to direct identification of individual proteins and protein complexes using high resolution mass spectrometry. We determined that the proposed cytoplasmic peripheral inner membrane proteome occupies a previously unsuspected ∼19% of the basic E. coli BL21(DE3) proteome, and the detected peripheral inner membrane proteome occupies ∼25% of the estimated expressed proteome of this cell grown in LB medium to mid-log phase. This value might increase when fleeting interactions, not studied here, are taken into account. Several proteins previously regarded as exclusively cytoplasmic bind membranes avidly. Many of these proteins are organized in functional or/and structural oligomeric complexes that bind to the membrane with multiple interactions. Identified proteins cover the full spectrum of biological activities, and more than half of them are essential. 
Our data suggest that the cytoplasmic proteome displays remarkably dynamic and extensive communication with biological membrane surfaces that we are only beginning to decipher. PMID:23230279
Garland, Donita L.; Fernandez-Godino, Rosario; Kaur, Inderjeet; Speicher, Kaye D.; Harnly, James M.; Lambris, John D.; Speicher, David W.; Pierce, Eric A.
2014-01-01
Macular degenerations, inherited and age related, are important causes of vision loss. Human genetic studies have suggested perturbation of the complement system is important in the pathogenesis of age-related macular degeneration. The mechanisms underlying the involvement of the complement system are not understood, although complement and inflammation have been implicated in drusen formation. Drusen are an early clinical hallmark of inherited and age-related forms of macular degeneration. We studied one of the earliest stages of macular degeneration which precedes and leads to the formation of drusen, i.e. the formation of basal deposits. The studies were done using a mouse model of the inherited macular dystrophy Doyne Honeycomb Retinal Dystrophy/Malattia Leventinese (DHRD/ML) which is caused by a p.Arg345Trp mutation in EFEMP1. The hallmark of DHRD/ML is the formation of drusen at an early age, and gene targeted Efemp1R345W/R345W mice develop extensive basal deposits. Proteomic analyses of Bruch's membrane/choroid and Bruch's membrane in the Efemp1R345W/R345W mice indicate that the basal deposits comprise normal extracellular matrix (ECM) components present in abnormal amounts. The proteomic analyses also identified significant changes in proteins with immune-related function, including complement components, in the diseased tissue samples. Genetic ablation of the complement response via generation of Efemp1R345W/R345W:C3−/− double-mutant mice inhibited the formation of basal deposits. The results demonstrate a critical role for the complement system in basal deposit formation, and suggest that complement-mediated recognition of abnormal ECM may participate in basal deposit formation in DHRD/ML and perhaps other macular degenerations. PMID:23943789
Sankarasubramanian, Jagadesan; Vishnu, Udayakumar S; Dinakaran, Vasudevan; Sridhar, Jayavel; Gunasekaran, Paramasamy; Rajendhran, Jeyaprakash
2016-01-01
Brucella spp. are facultative intracellular pathogens that cause brucellosis in various mammals including humans. Brucella survive inside the host cells by forming vacuoles and subverting host defence systems. This study aimed to predict the secretion systems and the secretomes of Brucella spp. from 39 complete genome sequences available in the databases. Furthermore, an attempt was made to identify the type IV secretion effectors and their interactions with host proteins. We predicted the secretion systems of Brucella by the KEGG pathway and SecReT4. Brucella secretomes and type IV effectors (T4SEs) were predicted through genome-wide screening using JVirGel and S4TE, respectively. Protein-protein interactions of Brucella T4SEs with their hosts were analyzed by HPIDB 2.0. Genes coding for the Sec and Tat secretion pathways and the type I (T1SS), type IV (T4SS) and type V (T5SS) secretion systems were identified, and they are conserved in all the species of Brucella. In addition to the well-known VirB operon coding for the type IV secretion system (T4SS), we have identified the presence of additional genes showing homology with T4SS of other organisms. On the whole, 10.26 to 14.94% of total proteomes were found to be either secreted (secretome) or membrane associated (membrane proteome). Approximately 1.7 to 3.0% of total proteomes were identified as type IV secretion effectors (T4SEs). Prediction of protein-protein interactions showed 29 and 36 host-pathogen specific interactions between Bos taurus (cattle)-B. abortus and Ovis aries (sheep)-B. melitensis, respectively. Functional characterization of the predicted T4SEs and their interactions with their respective hosts may reveal the secrets of host specificity of Brucella.
Hegedűs, Tamás; Chaubey, Pururawa Mayank; Várady, György; Szabó, Edit; Sarankó, Hajnalka; Hofstetter, Lia; Roschitzki, Bernd; Sarkadi, Balázs
2015-01-01
Based on recent results, the determination of the easily accessible red blood cell (RBC) membrane proteins may provide new diagnostic possibilities for assessing mutations, polymorphisms or regulatory alterations in diseases. However, the analysis of the current mass spectrometry-based proteomics datasets and other major databases indicates inconsistencies: the results show large scattering and only a limited overlap for the identified RBC membrane proteins. Here, we applied membrane-specific proteomics studies in human RBC, compared these results with the data in the literature, and generated a comprehensive and expandable database using all available data sources. The integrated web database now refers to proteomic, genetic and medical databases as well, and contains an unexpectedly large number of validated membrane proteins previously thought to be specific for other tissues and/or related to major human diseases. Since the determination of protein expression in RBC provides a method to indicate pathological alterations, our database should facilitate the development of RBC membrane biomarker platforms and provide a unique resource to aid related further research and diagnostics. Database URL: http://rbcc.hegelab.org PMID:26078478
Verma, Nisha; Pink, Mario; Petrat, Frank; Rettenmeier, Albert W; Schmitz-Spanke, Simone
2015-01-02
A proteomic analysis of the interaction among multiprotein complexes involved in 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD)-mediated toxicity in urinary bladder epithelial RT4 cells was performed using two-dimensional blue native SDS-PAGE (2D BN/SDS-PAGE). To enrich the protein complexes, unexposed and TCDD-exposed cells were fractionated. BN/SDS-PAGE of the resulting fractions led to an effective separation of proteins and protein complexes of various origins, including cell membrane, mitochondria, and other intracellular compartments. Major differences between the proteome of control and exposed cells involved the alteration of many calcium-regulated proteins (calmodulin, protein S100-A2, annexin A5, annexin A10, gelsolin isoform b) and iron-regulated proteins (ferritin, heme-binding protein 2, transferrin). On the basis of these findings, the intracellular calcium concentration was determined, revealing a significant increase after 24 h of exposure to TCDD. Moreover, the concentration of the labile iron pool (LIP) was also significantly elevated in TCDD-exposed cells. This increase was strongly inhibited by the calmodulin (CaM) antagonist W-7, which pointed toward a possible interaction between iron and calcium signaling. Because nitric oxide (NO) production was significantly enhanced in TCDD-exposed cells and was also inhibited by W-7, we hypothesize that alterations in calcium and iron homeostasis upon exposure to TCDD may be linked through NO generated by CaM-activated nitric oxide synthase. In our model, we propose that NO produced upon TCDD exposure interacts with the iron centers of iron-regulatory proteins (IRPs) that modulate the alteration of ferritin and transferrin, resulting in an augmented cellular LIP and, hence, increased toxicity.
Prinsi, Bhakti; Negri, Alfredo S; Quattrocchio, Francesca M; Koes, Ronald E; Espen, Luca
2016-01-10
The Petunia hybrida ANTHOCYANIN1 (AN1) gene encodes a transcription factor that regulates both the expression of genes involved in anthocyanin synthesis and the acidification of the vacuolar lumen in corolla epidermal cells. In this work, the comparison between the red flowers of the R27 line with the white flowers of the isogenic an1 mutant line W225 showed that the AN1 gene has further pleiotropic effects on flavonoid biosynthesis as well as on distant physiological traits. The proteomic profiling showed that the an1 mutation was associated to changes in accumulation of several proteins, affecting both anthocyanin synthesis and primary metabolism. The flavonoid composition study confirmed that the an1 mutation provoked a broad attenuation of the entire flavonoid pathway, probably by indirect biochemical events. Moreover, proteomic changes and variation of biochemical parameters revealed that the an1 mutation induced a delay in the onset of flower senescence in W225, as supported by the enhanced longevity of the W225 flowers in planta and the loss of sensitivity of cut flowers to sugar. This study suggests that AN1 is possibly involved in the perception and/or transduction of ethylene signal during flower senescence. Copyright © 2015 Elsevier B.V. All rights reserved.
USDA-ARS's Scientific Manuscript database
The use of swine in biomedical research has increased dramatically in the last decade. Diverse genomic- and proteomic databases have been developed to facilitate research using human and rodent models. Current porcine gene databases, however, lack the robust annotation to study pig models that are...
Proteome of Caulobacter crescentus cell cycle publicly accessible on SWICZ server.
Vohradsky, Jiri; Janda, Ivan; Grünenfelder, Björn; Berndt, Peter; Röder, Daniel; Langen, Hanno; Weiser, Jaroslav; Jenal, Urs
2003-10-01
Here we present the Swiss-Czech Proteomics Server (SWICZ), which hosts the proteomic database summarizing information about the cell cycle of the aquatic bacterium Caulobacter crescentus. The database provides a searchable tool for easy access of global protein synthesis and protein stability data as examined during the C. crescentus cell cycle. Protein synthesis data collected from five different cell cycle stages were determined for each protein spot as a relative value of the total amount of [(35)S]methionine incorporation. Protein stability of pulse-labeled extracts were measured during a chase period equivalent to one cell cycle unit. Quantitative information for individual proteins together with descriptive data such as protein identities, apparent molecular masses and isoelectric points, were combined with information on protein function, genomic context, and the cell cycle stage, and were then assembled in a relational database with a world wide web interface (http://proteom.biomed.cas.cz), which allows the database records to be searched and displays the recovered information. A total of 1250 protein spots were reproducibly detected on two-dimensional gel electropherograms, 295 of which were identified by mass spectroscopy. The database is accessible either through clickable two-dimensional gel electrophoretic maps or by means of a set of dedicated search engines. Basic characterization of the experimental procedures, data processing, and a comprehensive description of the web site are presented. In its current state, the SWICZ proteome database provides a platform for the incorporation of new data emerging from extended functional studies on the C. crescentus proteome.
Tandem Mass Spectrum Sequencing: An Alternative to Database Search Engines in Shotgun Proteomics.
Muth, Thilo; Rapp, Erdmann; Berven, Frode S; Barsnes, Harald; Vaudel, Marc
2016-01-01
Protein identification via database searches has become the gold standard in mass spectrometry based shotgun proteomics. However, as the quality of tandem mass spectra improves, direct mass spectrum sequencing gains interest as a database-independent alternative. In this chapter, the general principle of this so-called de novo sequencing is introduced along with pitfalls and challenges of the technique. The main tools available are presented with a focus on user friendly open source software which can be directly applied in everyday proteomic workflows.
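The principle of de novo sequencing mentioned above can be illustrated with a toy sketch: in an idealized fragment ion ladder, the mass difference between consecutive peaks corresponds to one amino acid residue. The code below is an illustration only and is not taken from the chapter or from any of the tools it reviews. The function names `b_ions` and `read_ladder` are hypothetical, the residue table is deliberately incomplete (isobaric or near-isobaric residues such as I and Q are left out), and real spectra contain noise, missing peaks and mixed ion series that this sketch ignores.

```python
# Monoisotopic residue masses in Da (standard values, subset only).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "N": 114.04293,
    "D": 115.02694, "K": 128.09496, "E": 129.04259, "F": 147.06841,
    "R": 156.10111,
}
PROTON = 1.00728  # mass of a proton in Da


def b_ions(peptide):
    """Return singly charged b-ion m/z values b1..b(n-1) for a peptide."""
    masses, total = [], PROTON
    for aa in peptide[:-1]:
        total += RESIDUE_MASS[aa]
        masses.append(total)
    return masses


def read_ladder(peaks, tol=0.01):
    """Read residues from mass differences between consecutive peaks.

    Gaps that match no residue within `tol` Da are reported as '?'.
    """
    peaks = sorted(peaks)
    residues = []
    for low, high in zip(peaks, peaks[1:]):
        delta = high - low
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(m - delta) <= tol]
        residues.append(matches[0] if matches else "?")
    return "".join(residues)


if __name__ == "__main__":
    # The ladder b1..b4 of ELVSK has three gaps, read back as L, V, S.
    print(read_ladder(b_ions("ELVSK")))
```

The first residue is not recovered here because the ladder starts at b1; a practical tool would combine several ion series and score competing candidate sequences rather than take the first matching residue.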
[Proteome analysis on interaction between Anoectochilus roxburghii and Mycorrhizal fungus].
Gao, Chuan; Guo, Shun-Xing; Zhang, Jing; Chen, Juan; Zhang, Li-Chun
2012-12-01
The aim was to study the mechanism by which a mycorrhizal fungus promotes plant growth by comparing proteomes. The differential proteomes of Anoectochilus roxburghii uninoculated and inoculated with the endophytic fungus Epulorhiza sp. were analyzed by two-dimensional gel electrophoresis and MALDI-TOF/TOF mass spectrometry. Twenty-seven protein spots were analyzed by matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF-MS). Twenty-two candidate proteins were identified by database comparisons. These proteins were mostly involved in signal transduction, metabolic regulation, photosynthesis and substance metabolism. The results indicate that the regulatory control system of the plant is influenced by the fungus, and that the positive regulation improves substance metabolism and photosynthesis, resulting in stronger plants with higher resistance. It is also deduced that silent genes may exist in endosymbiotic plants.
Proteomic characterization of hempseed (Cannabis sativa L.).
Aiello, Gilda; Fasoli, Elisa; Boschin, Giovanna; Lammi, Carmen; Zanoni, Chiara; Citterio, Attilio; Arnoldi, Anna
2016-09-16
This paper presents an investigation on hempseed proteome. The experimental approach, based on combinatorial peptide ligand libraries (CPLLs), SDS-PAGE separation, nLC-ESI-MS/MS identification, and database search, permitted identifying in total 181 expressed proteins. This very large number of identifications was achieved by searching in two databases: Cannabis sativa L. (56 gene products identified) and Arabidopsis thaliana (125 gene products identified). By performing a protein-protein association network analysis using the STRING software, it was possible to build the first interactomic map of all detected proteins, characterized by 137 nodes and 410 interactions. Finally, a Gene Ontology analysis of the identified species permitted to classify their molecular functions: the great majority is involved in the seed metabolic processes (41%), responses to stimulus (8%), and biological process (7%). Hempseed is an underexploited non-legume protein-rich seed. Although its protein is well known for its digestibility, essential amino acid composition, and useful techno-functional properties, a comprehensive proteome characterization is still lacking. The objective of this work was to fill this knowledge gap and provide information useful for a better exploitation of this seed in different food products. Copyright © 2016 Elsevier B.V. All rights reserved.
Assembling proteomics data as a prerequisite for the analysis of large scale experiments
Schmidt, Frank; Schmid, Monika; Thiede, Bernd; Pleißner, Klaus-Peter; Böhme, Martina; Jungblut, Peter R
2009-01-01
Background Despite the complete determination of the genome sequence of a huge number of bacteria, their proteomes remain relatively poorly defined. Besides new methods to increase the number of identified proteins, new database applications are necessary to store and present results of large-scale proteomics experiments. Results In the present study, a database concept has been developed to address these issues and to offer complete information via a web interface. In our concept, the Oracle based data repository system SQL-LIMS plays the central role in the proteomics workflow and was applied to the proteomes of Mycobacterium tuberculosis, Helicobacter pylori, Salmonella typhimurium and protein complexes such as 20S proteasome. Technical operations of our proteomics labs were used as the standard for SQL-LIMS template creation. By means of a Java based data parser, post-processed data of different approaches, such as LC/ESI-MS, MALDI-MS and 2-D gel electrophoresis (2-DE), were stored in SQL-LIMS. A minimum set of the proteomics data were transferred into our public 2D-PAGE database using a Java based interface (Data Transfer Tool) with the requirements of the PEDRo standardization. Furthermore, the stored proteomics data were extractable out of SQL-LIMS via XML. Conclusion The Oracle based data repository system SQL-LIMS played the central role in the proteomics workflow concept. Technical operations of our proteomics labs were used as standards for SQL-LIMS templates. Using a Java based parser, post-processed data of different approaches such as LC/ESI-MS, MALDI-MS and 1-DE and 2-DE were stored in SQL-LIMS. Thus, unique data formats of different instruments were unified and stored in SQL-LIMS tables. Moreover, a unique submission identifier allowed fast access to all experimental data. This was the main advantage compared to multi software solutions, especially if personnel fluctuations are high.
Moreover, large scale and high-throughput experiments must be managed in a comprehensive repository system such as SQL-LIMS, to query results in a systematic manner. On the other hand, these database systems are expensive and require at least one full time administrator and specialized lab manager. Moreover, the high technical dynamics in proteomics may cause problems to adjust new data formats. To summarize, SQL-LIMS met the requirements of proteomics data handling especially in skilled processes such as gel-electrophoresis or mass spectrometry and fulfilled the PSI standardization criteria. The data transfer into a public domain via DTT facilitated validation of proteomics data. Additionally, evaluation of mass spectra by post-processing using MS-Screener improved the reliability of mass analysis and prevented storage of data junk. PMID:19166578
The path to enlightenment: making sense of genomic and proteomic information.
Maurer, Martin H
2004-05-01
Whereas genomics describes the study of genome, mainly represented by its gene expression on the DNA or RNA level, the term proteomics denotes the study of the proteome, which is the protein complement encoded by the genome. In recent years, the number of proteomic experiments increased tremendously. While all fields of proteomics have made major technological advances, the biggest step was seen in bioinformatics. Biological information management relies on sequence and structure databases and powerful software tools to translate experimental results into meaningful biological hypotheses and answers. In this resource article, I provide a collection of databases and software available on the Internet that are useful to interpret genomic and proteomic data. The article is a toolbox for researchers who have genomic or proteomic datasets and need to put their findings into a biological context.
Elguoshy, Amr; Hirao, Yoshitoshi; Xu, Bo; Saito, Suguru; Quadery, Ali F; Yamamoto, Keiko; Mitsui, Toshiaki; Yamamoto, Tadashi
2017-12-01
In an attempt to complete the Human Proteome Project (HPP), the Chromosome-Centric Human Proteome Project (C-HPP) launched the journey of missing protein (MP) investigation in 2012. However, 2579 and 572 protein entries in neXtProt (2017-1) are still considered as missing and uncertain proteins, respectively. Thus, in this study, we proposed a pipeline to analyze, identify, and validate human missing and uncertain proteins in open-access transcriptomics and proteomics databases. Analysis of the RNA expression pattern for missing proteins in the Human Protein Atlas showed that 28% of them, such as Olfactory receptor 1I1 (O60431), had no RNA expression, suggesting the necessity to consider uncommon tissues for transcriptomic and proteomic studies. Interestingly, 21% had elevated expression level in a particular tissue (tissue-enriched proteins), indicating the importance of targeting such proteins in their elevated tissues. Additionally, the analysis of RNA expression level for missing proteins showed that 95% had no or low expression level (0-10 transcripts per million), indicating that low abundance is one of the major obstacles facing the detection of missing proteins. Moreover, missing proteins are predicted to generate fewer unique tryptic peptides than the identified proteins. Searching for these predicted unique tryptic peptides that correspond to missing and uncertain proteins in the experimental peptide list of open-access MS-based databases (PA, GPM) resulted in the detection of 402 missing and 19 uncertain proteins with at least two unique peptides (≥9 aa) at <(5 × 10⁻⁴)% FDR. Finally, matching the native spectra for the experimentally detected peptides with their SRMAtlas synthetic counterparts at three transition sources (QQQ, QTOF, QTRAP) gave us an opportunity to validate 41 missing proteins by ≥2 proteotypic peptides.
Identification of lactoferricin B intracellular targets using an Escherichia coli proteome chip.
Tu, Yu-Hsuan; Ho, Yu-Hsuan; Chuang, Ying-Chih; Chen, Po-Chung; Chen, Chien-Sheng
2011-01-01
Lactoferricin B (LfcinB) is a well-known antimicrobial peptide. Several studies have indicated that it can inhibit bacteria by affecting intracellular activities, but the intracellular targets of this antimicrobial peptide have not been identified. Therefore, we used E. coli proteome chips to identify the intracellular target proteins of LfcinB in a high-throughput manner. We probed LfcinB with E. coli proteome chips and further conducted normalization and Gene Ontology (GO) analyses. The results of the GO analyses showed that the identified proteins were associated with metabolic processes. Moreover, we validated the interactions between LfcinB and chip assay-identified proteins with fluorescence polarization (FP) assays. Sixteen proteins were identified, and an E. coli interaction database (EcID) analysis revealed that the majority of the proteins that interact with these 16 proteins affected the tricarboxylic acid (TCA) cycle. Knockout assays were conducted to further validate the FP assay results. These results showed that phosphoenolpyruvate carboxylase was a target of LfcinB, indicating that one of its mechanisms of action may be associated with pyruvate metabolism. Thus, we used pyruvate assays to conduct an in vivo validation of the relationship between LfcinB and pyruvate level in E. coli. These results showed that E. coli exposed to LfcinB had abnormal pyruvate amounts, indicating that LfcinB caused an accumulation of pyruvate. In conclusion, this study successfully revealed the intracellular targets of LfcinB using an E. coli proteome chip approach.
Identification of Lactoferricin B Intracellular Targets Using an Escherichia coli Proteome Chip
Chen, Po-Chung; Chen, Chien-Sheng
2011-01-01
Lactoferricin B (LfcinB) is a well-known antimicrobial peptide. Several studies have indicated that it can inhibit bacteria by affecting intracellular activities, but the intracellular targets of this antimicrobial peptide have not been identified. Therefore, we used E. coli proteome chips to identify the intracellular target proteins of LfcinB in a high-throughput manner. We probed LfcinB with E. coli proteome chips and further conducted normalization and Gene Ontology (GO) analyses. The results of the GO analyses showed that the identified proteins were associated with metabolic processes. Moreover, we validated the interactions between LfcinB and chip assay-identified proteins with fluorescence polarization (FP) assays. Sixteen proteins were identified, and an E. coli interaction database (EcID) analysis revealed that the majority of the proteins that interact with these 16 proteins affected the tricarboxylic acid (TCA) cycle. Knockout assays were conducted to further validate the FP assay results. These results showed that phosphoenolpyruvate carboxylase was a target of LfcinB, indicating that one of its mechanisms of action may be associated with pyruvate metabolism. Thus, we used pyruvate assays to conduct an in vivo validation of the relationship between LfcinB and pyruvate level in E. coli. These results showed that E. coli exposed to LfcinB had abnormal pyruvate amounts, indicating that LfcinB caused an accumulation of pyruvate. In conclusion, this study successfully revealed the intracellular targets of LfcinB using an E. coli proteome chip approach. PMID:22164243
Mapping protein-protein interactions with phage-displayed combinatorial peptide libraries.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kay, B. K.; Castagnoli, L.; Biosciences Division
This unit describes the process and analysis of affinity selecting bacteriophage M13 from libraries displaying combinatorial peptides fused to either a minor or major capsid protein. Direct affinity selection uses target protein bound to a microtiter plate followed by purification of selected phage by ELISA. Alternatively, there is a bead-based affinity selection method. These methods allow one to readily isolate peptide ligands that bind to a protein target of interest and use the consensus sequence to search proteomic databases for putative interacting proteins.
Comet: an open-source MS/MS sequence database search tool.
Eng, Jimmy K; Jahan, Tahmina A; Hoopmann, Michael R
2013-01-01
Proteomics research routinely involves identifying peptides and proteins via MS/MS sequence database search. Thus the database search engine is an integral tool in many proteomics research groups. Here, we introduce the Comet search engine to the existing landscape of commercial and open-source database search tools. Comet is open source, freely available, and based on one of the original sequence database search tools that has been widely used for many years. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Mancone, Carmine; Grimaldi, Alessio; Refolo, Giulia; Abbate, Isabella; Rozera, Gabriella; Benelli, Dario; Fimia, Gian Maria; Barnaba, Vincenzo; Tripodi, Marco; Piacentini, Mauro; Ciccosanti, Fabiola
2017-01-01
Changes in iron metabolism frequently accompany HIV-1 infection. However, while many clinical and in vitro studies report that iron overload exacerbates the development of infection, many others have found no correlation. Therefore, the multi-faceted role of iron in HIV-1 infection remains enigmatic. RT-qPCR targeting the LTR region, gag, Tat and Rev were performed to measure the levels of viral RNAs in response to iron overload. Spike-in SILAC proteomics comparing i) iron-treated, ii) HIV-1-infected and iii) HIV-1-infected/iron treated T lymphocytes was performed to define modifications in the host cell proteome. Data from quantitative proteomics were integrated with the HIV-1 Human Interaction Database for assessing any viral cofactors modulated by iron overload in infected T lymphocytes. Here, we demonstrate that the iron overload down-regulates HIV-1 gene expression by decreasing the levels of viral RNAs. In addition, we found that iron overload modulates the expression of many viral cofactors. Among them, the downregulation of the REV cofactor eIF5A may correlate with the iron-induced inhibition of HIV-1 gene expression. Therefore, we demonstrated that eIF5A downregulation by shRNA resulted in a significant decrease of Nef levels, thus hampering HIV-1 replication. Our study indicates that HIV-1 cofactors influenced by iron metabolism represent potential targets for antiretroviral therapy and suggests eIF5A as a selective target for drug development.
Using the Proteomics Identifications Database (PRIDE).
Martens, Lennart; Jones, Phil; Côté, Richard
2008-03-01
The Proteomics Identifications Database (PRIDE) is a public data repository designed to store, disseminate, and analyze mass spectrometry based proteomics datasets. The PRIDE database can accommodate any level of detailed metadata about the submitted results, which can be queried, explored, viewed, or downloaded via the PRIDE Web interface. The PRIDE database also provides a simple, yet powerful, access control mechanism that fully supports confidential peer-reviewing of data related to a manuscript, ensuring that these results remain invisible to the general public while allowing referees and journal editors anonymized access to the data. This unit describes in detail the functionality that PRIDE provides with regards to searching, viewing, and comparing the available data, as well as different options for submitting data to PRIDE.
The unique peptidome: Taxon-specific tryptic peptides as biomarkers for targeted metaproteomics.
Mesuere, Bart; Van der Jeugt, Felix; Devreese, Bart; Vandamme, Peter; Dawyndt, Peter
2016-09-01
The Unique Peptide Finder (http://unipept.ugent.be/peptidefinder) is an interactive web application to quickly hunt for tryptic peptides that are unique to a particular species, genus, or any other taxon. Biodiversity within the target taxon is represented by a set of proteomes selected from a monthly updated list of complete and nonredundant UniProt proteomes, supplemented with proprietary proteomes loaded into persistent local browser storage. The software computes and visualizes pan and core peptidomes as unions and intersections of tryptic peptides occurring in the selected proteomes. In addition, it also computes and displays unique peptidomes as the set of all tryptic peptides that occur in all selected proteomes but not in any UniProt record not assigned to the target taxon. As a result, the unique peptides can serve as robust biomarkers for the target taxon, for example, in targeted metaproteomics studies. Computations are extremely fast since they are underpinned by the Unipept database, the lowest common ancestor algorithm implemented in Unipept and modern web technologies that facilitate in-browser data storage and parallel processing. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
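The set operations described above (pan peptidome as a union, core peptidome as an intersection, unique peptidome as the core minus everything seen outside the target taxon) can be sketched in a few lines. This is a minimal illustration and not the Unipept implementation: the function names, the cleavage rule (cut after K or R unless followed by P) and the minimum peptide length of 5 are assumptions made for the example, and the toy sequences are invented.

```python
import re


def tryptic_peptides(protein, min_len=5):
    """In silico digest: cleave after K or R unless followed by P (assumed rule)."""
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return {p for p in pieces if len(p) >= min_len}


def proteome_peptides(proteome):
    """Collect the tryptic peptides of every protein in one proteome."""
    peptides = set()
    for protein in proteome:
        peptides |= tryptic_peptides(protein)
    return peptides


def peptidomes(target_proteomes, outgroup_proteomes):
    """Return (pan, core, unique) peptide sets for a target taxon.

    pan    = union of peptides over the target proteomes
    core   = intersection of peptides over the target proteomes
    unique = core peptides absent from every outgroup proteome
    """
    target_sets = [proteome_peptides(p) for p in target_proteomes]
    pan = set.union(*target_sets)
    core = set.intersection(*target_sets)
    outside = set().union(*(proteome_peptides(p) for p in outgroup_proteomes))
    return pan, core, core - outside


if __name__ == "__main__":
    # Toy proteomes (invented sequences, two target strains and one outgroup).
    strain_a = ["AAAAAKCCCCCR", "DDDDDK"]
    strain_b = ["AAAAAKEEEEER", "DDDDDK"]
    outgroup = [["DDDDDKFFFFFR"]]
    pan, core, unique = peptidomes([strain_a, strain_b], outgroup)
    print(sorted(pan), sorted(core), sorted(unique))
```

In this toy case the peptide shared by both strains but also present in the outgroup drops out of the unique set, which is the property that makes unique peptides usable as taxon-specific markers.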
Natural variation in floral nectar proteins of two Nicotiana attenuata accessions.
Seo, Pil Joon; Wielsch, Natalie; Kessler, Danny; Svatos, Ales; Park, Chung-Mo; Baldwin, Ian T; Kim, Sang-Gyu
2013-07-13
Floral nectar (FN) contains not only energy-rich compounds to attract pollinators, but also defense chemicals and several proteins. However, proteomic analysis of FN has been hampered by the lack of publically available sequence information from nectar-producing plants. Here we used next-generation sequencing and advanced proteomics to profile FN proteins in the opportunistic outcrossing wild tobacco, Nicotiana attenuata. We constructed a transcriptome database of N. attenuata and characterized its nectar proteome using LC-MS/MS. The FN proteins of N. attenuata included nectarins, sugar-cleaving enzymes (glucosidase, galactosidase, and xylosidase), RNases, pathogen-related proteins, and lipid transfer proteins. Natural variation in FN proteins of eleven N. attenuata accessions revealed a negative relationship between the accumulation of two abundant proteins, nectarin1b and nectarin5. In addition, microarray analysis of nectary tissues revealed that protein accumulation in FN is not simply correlated with the accumulation of transcripts encoding FN proteins and identified a group of genes that were specifically expressed in the nectary. Natural variation of identified FN proteins in the ecological model plant N. attenuata suggests that nectar chemistry may have a complex function in plant-pollinator-microbe interactions.
Natural variation in floral nectar proteins of two Nicotiana attenuata accessions
2013-01-01
Background Floral nectar (FN) contains not only energy-rich compounds to attract pollinators, but also defense chemicals and several proteins. However, proteomic analysis of FN has been hampered by the lack of publically available sequence information from nectar-producing plants. Here we used next-generation sequencing and advanced proteomics to profile FN proteins in the opportunistic outcrossing wild tobacco, Nicotiana attenuata. Results We constructed a transcriptome database of N. attenuata and characterized its nectar proteome using LC-MS/MS. The FN proteins of N. attenuata included nectarins, sugar-cleaving enzymes (glucosidase, galactosidase, and xylosidase), RNases, pathogen-related proteins, and lipid transfer proteins. Natural variation in FN proteins of eleven N. attenuata accessions revealed a negative relationship between the accumulation of two abundant proteins, nectarin1b and nectarin5. In addition, microarray analysis of nectary tissues revealed that protein accumulation in FN is not simply correlated with the accumulation of transcripts encoding FN proteins and identified a group of genes that were specifically expressed in the nectary. Conclusions Natural variation of identified FN proteins in the ecological model plant N. attenuata suggests that nectar chemistry may have a complex function in plant-pollinator-microbe interactions. PMID:23848992
Proteomic Analyses of NF1-Interacting Proteins in Keratinocytes
2015-04-01
and knockout mice further confirmed the interactions suggested by the proteomic analyses. In relation to the development of psoriasis-like symptoms...in the NF1 null epidermis, we analyzed NF1 expression in a mouse model of psoriasis (imiquimod-induced psoriasis-like skin inflammation) and...knockout of epidermal NF1 to elucidate the molecular underpinnings of psoriasis. 15. SUBJECT TERMS neurofibromin-1 (NF1), psoriasis, inflammation
de Jong, Luitzen; de Koning, Edward A; Roseboom, Winfried; Buncherd, Hansuk; Wanner, Martin J; Dapic, Irena; Jansen, Petra J; van Maarseveen, Jan H; Corthals, Garry L; Lewis, Peter J; Hamoen, Leendert W; de Koster, Chris G
2017-07-07
Identification of dynamic protein-protein interactions at the peptide level on a proteomic scale is a challenging approach that is still in its infancy. We have developed a system to cross-link cells directly in culture with the special lysine cross-linker bis(succinimidyl)-3-azidomethyl-glutarate (BAMG). We used the Gram-positive model bacterium Bacillus subtilis as an exemplar system. Within 5 min extensive intracellular cross-linking was detected, while intracellular cross-linking in a Gram-negative species, Escherichia coli, was still undetectable after 30 min, in agreement with the low permeability in this organism for lipophilic compounds like BAMG. We were able to identify 82 unique interprotein cross-linked peptides with <1% false discovery rate by mass spectrometry and genome-wide database searching. Nearly 60% of the interprotein cross-links occur in assemblies involved in transcription and translation. Several of these interactions are new, and we identified a binding site between the δ and β' subunit of RNA polymerase close to the downstream DNA channel, providing a clue into how δ might regulate promoter selectivity and promote RNA polymerase recycling. Our methodology opens new avenues to investigate the functional dynamic organization of complex protein assemblies involved in bacterial growth. Data are available via ProteomeXchange with identifier PXD006287.
Metaproteomics as a Complementary Approach to Gut Microbiota in Health and Disease
NASA Astrophysics Data System (ADS)
Petriz, Bernardo A.; Franco, Octávio L.
2017-01-01
Classic studies on phylotype profiling are limited to the identification of microbial constituents, where information is lacking about the molecular interaction of these bacterial communities with the host genome and the possible outcomes in host biology. A range of OMICs approaches have provided great progress linking the microbiota to health and disease. However, the investigation of this context through proteomic mass spectrometry-based tools is still being improved. Therefore, metaproteomics or community proteogenomics has emerged as a complementary approach to metagenomic data, as a field in proteomics aiming to perform large-scale characterization of proteins from environmental microbiota such as the human gut. The advances in molecular separation methods coupled with mass spectrometry (e.g. LC-MS/MS) and proteome bioinformatics have been fundamental in these novel large-scale metaproteomic studies, which have further been performed in a wide range of samples including soil, plant and human environments. Metaproteomic studies will make major progress if a comprehensive database covering the genes and expressed proteins from all gut microbial species is developed. To this end, we here present some of the main limitations of metaproteomic studies in complex microbiota environments such as the gut, also addressing the up-to-date pipelines in sample preparation prior to fractionation/separation and mass spectrometry analysis. In addition, a novel approach to the limitations of metagenomic databases is also discussed. Finally, prospects are addressed regarding the application of metaproteomic analysis using a unified host-microbiome gene database and other meta-OMICs platforms.
Ling, Xueping; Guo, Jing; Zheng, Chuqiang; Ye, Chiming; Lu, Yinghua; Pan, Xueshan; Chen, Zhengqi; Ng, I-Son
2015-12-01
Polyunsaturated fatty acids (PUFAs) are valuable ingredients in food and pharmaceutical products due to their beneficial influence on human health. Most studies have focused on the production of PUFAs from oleaginous micro-organisms, but seldom on the comparative proteomics of the cells. In this study, three methods for removing lipids from crude protein extracts (i.e., cold shock, acetone precipitation and ethanol precipitation) were applied to different PUFA-producing micro-organisms. Among the selected strains, Schizochytrium was used as an oleaginous strain with a high lipid content of 60.3% (w/w) of biomass. Mortierella alpina and Cunninghamella echinulata were chosen as low-lipid-content strains, with 25.8% (w/w) and 21.8% (w/w) lipid in biomass, respectively. Cold shock proved to be the most effective method for lipid removal and thus yielded a higher protein amount for Schizochytrium. Moreover, comparative proteomics of the three PUFA-producing strains revealed more significantly up- or down-regulated proteins under cold shock treatment. As a result, essential proteins (i.e., polyunsaturated fatty acid synthase) and regulatory proteins were observed. In conclusion, this study provides a valuable and practical approach for the analysis of high-PUFA-producing strains at the proteomics level, and would further accelerate the understanding of the metabolic flux in oleaginous micro-organisms.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Denef, Vincent; Shah, Manesh B; Verberkmoes, Nathan C
The recent surge in microbial genomic sequencing, combined with the development of high-throughput liquid chromatography-mass-spectrometry-based (LC/LC-MS/MS) proteomics, has raised the question of the extent to which genomic information of one strain or environmental sample can be used to profile proteomes of related strains or samples. Even with decreasing sequencing costs, it remains impractical to obtain genomic sequence for every strain or sample analyzed. Here, we evaluate how shotgun proteomics is affected by amino acid divergence between the sample and the genomic database using a probability-based model and a random mutation simulation model constrained by experimental data. To assess the effects of nonrandom distribution of mutations, we also evaluated identification levels using in silico peptide data from sequenced isolates with average amino acid identities (AAI) varying between 76 and 98%. We compared the predictions to experimental protein identification levels for a sample that was evaluated using a database that included genomic information for the dominant organism and for a closely related variant (95% AAI). The range of models set the boundaries at which half of the proteins in a proteomic experiment can be identified to be 77-92% AAI between orthologs in the sample and database. Consistent with this prediction, experimental data indicated loss of half the identifiable proteins at 90% AAI. Additional analysis indicated a 6.4% reduction of the initial protein coverage per 1% amino acid divergence and total identification loss at 86% AAI. Consequently, shotgun proteomics is capable of cross-strain identifications but avoids most cross-species false positives.
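One way to see why identification rates fall so quickly with divergence is a back-of-the-envelope calculation. The sketch below is not the authors' model; it assumes that every residue is conserved independently with probability equal to the AAI, so a peptide of length n still matches the database exactly with probability AAI^n. The function names, the example peptide length, and the two-peptide identification rule are illustrative assumptions.

```python
from math import comb


def peptide_match_probability(aai: float, peptide_length: int) -> float:
    """Probability that a peptide carries no substitution.

    Assumes (hypothetically) that each residue is conserved independently
    with probability `aai`, given as a fraction between 0 and 1.
    """
    if not 0.0 <= aai <= 1.0:
        raise ValueError("aai must be a fraction between 0 and 1")
    return aai ** peptide_length


def protein_identification_probability(
    aai: float, peptide_length: int, n_peptides: int, min_matches: int = 2
) -> float:
    """Probability that at least `min_matches` of `n_peptides` stay unchanged.

    Uses a binomial model with equal-length, independent peptides; both are
    simplifying assumptions made only for illustration.
    """
    p = peptide_match_probability(aai, peptide_length)
    return sum(
        comb(n_peptides, k) * p ** k * (1.0 - p) ** (n_peptides - k)
        for k in range(min_matches, n_peptides + 1)
    )


if __name__ == "__main__":
    # Compare a few identity levels for a protein with ten 12-residue peptides.
    for aai in (0.98, 0.95, 0.90, 0.86, 0.77):
        prob = protein_identification_probability(aai, 12, 10)
        print(f"AAI {aai:.2f}: P(identified) = {prob:.3f}")
```

Under these assumptions a small per-residue divergence compounds over the length of each peptide, which is qualitatively in line with the steep loss described in the abstract. The quantitative thresholds reported there come from the study's own models and experimental data, not from this simplification.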
2016 update of the PRIDE database and its related tools
Vizcaíno, Juan Antonio; Csordas, Attila; del-Toro, Noemi; Dianes, José A.; Griss, Johannes; Lavidas, Ilias; Mayer, Gerhard; Perez-Riverol, Yasset; Reisinger, Florian; Ternent, Tobias; Xu, Qing-Wei; Wang, Rui; Hermjakob, Henning
2016-01-01
The PRoteomics IDEntifications (PRIDE) database is one of the world-leading data repositories of mass spectrometry (MS)-based proteomics data. Since the beginning of 2014, PRIDE Archive (http://www.ebi.ac.uk/pride/archive/) is the new PRIDE archival system, replacing the original PRIDE database. Here we summarize the developments in PRIDE resources and related tools since the previous update manuscript in the Database Issue in 2013. PRIDE Archive constitutes a complete redevelopment of the original PRIDE, comprising a new storage backend, data submission system and web interface, among other components. PRIDE Archive supports the most-widely used PSI (Proteomics Standards Initiative) data standard formats (mzML and mzIdentML) and implements the data requirements and guidelines of the ProteomeXchange Consortium. The wide adoption of ProteomeXchange within the community has triggered an unprecedented increase in the number of submitted data sets (around 150 data sets per month). We outline some statistics on the current PRIDE Archive data contents. We also report on the status of the PRIDE related stand-alone tools: PRIDE Inspector, PRIDE Converter 2 and the ProteomeXchange submission tool. Finally, we will give a brief update on the resources under development ‘PRIDE Cluster’ and ‘PRIDE Proteomes’, which provide a complementary view and quality-scored information of the peptide and protein identification data available in PRIDE Archive. PMID:26527722
Listeriomics: an Interactive Web Platform for Systems Biology of Listeria
Koutero, Mikael; Tchitchek, Nicolas; Cerutti, Franck; Lechat, Pierre; Maillet, Nicolas; Hoede, Claire; Chiapello, Hélène; Gaspin, Christine
2017-01-01
ABSTRACT As for many model organisms, the amount of Listeria omics data produced has recently increased exponentially. There are now >80 published complete Listeria genomes, around 350 different transcriptomic data sets, and 25 proteomic data sets available. The analysis of these data sets through a systems biology approach and the generation of tools for biologists to browse these various data are a challenge for bioinformaticians. We have developed a web-based platform, named Listeriomics, that integrates different tools for omics data analyses, i.e., (i) an interactive genome viewer to display gene expression arrays, tiling arrays, and sequencing data sets along with proteomics and genomics data sets; (ii) an expression and protein atlas that connects every gene, small RNA, antisense RNA, or protein with the most relevant omics data; (iii) a specific tool for exploring protein conservation through the Listeria phylogenomic tree; and (iv) a coexpression network tool for the discovery of potential new regulations. Our platform integrates all the complete Listeria species genomes, transcriptomes, and proteomes published to date. This website allows navigation among all these data sets with enriched metadata in a user-friendly format and can be used as a central database for systems biology analysis. IMPORTANCE In the last decades, Listeria has become a key model organism for the study of host-pathogen interactions, noncoding RNA regulation, and bacterial adaptation to stress. To study these mechanisms, several genomics, transcriptomics, and proteomics data sets have been produced. We have developed Listeriomics, an interactive web platform to browse and correlate these heterogeneous sources of information. Our website will allow listeriologists and microbiologists to decipher key regulation mechanism by using a systems biology approach. PMID:28317029
Data Independent Acquisition analysis in ProHits 4.0.
Liu, Guomin; Knight, James D R; Zhang, Jian Ping; Tsou, Chih-Chiang; Wang, Jian; Lambert, Jean-Philippe; Larsen, Brett; Tyers, Mike; Raught, Brian; Bandeira, Nuno; Nesvizhskii, Alexey I; Choi, Hyungwon; Gingras, Anne-Claude
2016-10-21
Affinity purification coupled with mass spectrometry (AP-MS) is a powerful technique for the identification and quantification of physical interactions. AP-MS requires careful experimental design, appropriate control selection and quantitative workflows to successfully identify bona fide interactors amongst a large background of contaminants. We previously introduced ProHits, a Laboratory Information Management System for interaction proteomics, which tracks all samples in a mass spectrometry facility, initiates database searches and provides visualization tools for spectral counting-based AP-MS approaches. More recently, we implemented Significance Analysis of INTeractome (SAINT) within ProHits to provide scoring of interactions based on spectral counts. Here, we provide an update to ProHits to support Data Independent Acquisition (DIA) with identification software (DIA-Umpire and MSPLIT-DIA), quantification tools (through DIA-Umpire, or externally via targeted extraction), and assessment of quantitative enrichment (through mapDIA) and scoring of interactions (through SAINT-intensity). With additional improvements, notably support of the iProphet pipeline, facilitated deposition into ProteomeXchange repositories and enhanced export and viewing functions, ProHits 4.0 offers a comprehensive suite of tools to facilitate affinity proteomics studies. It remains challenging to score, annotate and analyze proteomics data in a transparent manner. ProHits was previously introduced as a LIMS to enable storing, tracking and analysis of standard AP-MS data. In this revised version, we expand ProHits to include integration with a number of identification and quantification tools based on Data-Independent Acquisition (DIA). ProHits 4.0 also facilitates data deposition into public repositories, and the transfer of data to new visualization tools. Copyright © 2016 Elsevier B.V. All rights reserved.
Lee, Ji-Hyun; You, Sungyong; Hyeon, Do Young; Kang, Byeongsoo; Kim, Hyerim; Park, Kyoung Mii; Han, Byungwoo; Hwang, Daehee; Kim, Sunghoon
2015-01-01
Mammalian cells have cytoplasmic and mitochondrial aminoacyl-tRNA synthetases (ARSs) that catalyze aminoacylation of tRNAs during protein synthesis. Despite their housekeeping functions in protein synthesis, ARSs and ARS-interacting multifunctional proteins (AIMPs) have recently been shown to play important roles in disease pathogenesis through their interactions with disease-related molecules. However, there is a lack of data resources and analytical tools that can be used to examine disease associations of ARS/AIMPs. Here, we developed an Integrated Database for ARSs (IDA), a resource database including cancer genomic/proteomic and interaction data of ARS/AIMPs. IDA includes mRNA expression, somatic mutation, copy number variation and phosphorylation data of ARS/AIMPs and their interacting proteins in various cancers. IDA further includes an array of analytical tools for exploration of disease association of ARS/AIMPs, identification of disease-associated ARS/AIMP interactors and reconstruction of ARS-dependent disease-perturbed network models. Therefore, IDA provides both comprehensive data resources and analytical tools for understanding potential roles of ARS/AIMPs in cancers. Database URL: http://ida.biocon.re.kr/, http://ars.biocon.re.kr/ PMID:25824651
SASD: the Synthetic Alternative Splicing Database for identifying novel isoform from proteomics
2013-01-01
Background Alternative splicing is an important and widespread mechanism for generating protein diversity and regulating protein expression. High-throughput identification and analysis of alternative splicing in the protein level has more advantages than in the mRNA level. The combination of alternative splicing database and tandem mass spectrometry provides a powerful technique for identification, analysis and characterization of potential novel alternative splicing protein isoforms from proteomics. Therefore, based on the peptidomic database of human protein isoforms for proteomics experiments, our objective is to design a new alternative splicing database to 1) provide more coverage of genes, transcripts and alternative splicing, 2) exclusively focus on the alternative splicing, and 3) perform context-specific alternative splicing analysis. Results We used a three-step pipeline to create a synthetic alternative splicing database (SASD) to identify novel alternative splicing isoforms and interpret them at the context of pathway, disease, drug and organ specificity or custom gene set with maximum coverage and exclusive focus on alternative splicing. First, we extracted information on gene structures of all genes in the Ensembl Genes 71 database and incorporated the Integrated Pathway Analysis Database. Then, we compiled artificial splicing transcripts. Lastly, we translated the artificial transcripts into alternative splicing peptides. The SASD is a comprehensive database containing 56,630 genes (Ensembl gene IDs), 95,260 transcripts (Ensembl transcript IDs), and 11,919,779 Alternative Splicing peptides, and also covering about 1,956 pathways, 6,704 diseases, 5,615 drugs, and 52 organs. The database has a web-based user interface that allows users to search, display and download a single gene/transcript/protein, custom gene set, pathway, disease, drug, organ related alternative splicing. 
Moreover, the quality of the database was validated with comparison to other known databases and two case studies: 1) in liver cancer and 2) in breast cancer. Conclusions The SASD provides the scientific community with an efficient means to identify, analyze, and characterize novel Exon Skipping and Intron Retention protein isoforms from mass spectrometry and interpret them at the context of pathway, disease, drug and organ specificity or custom gene set with maximum coverage and exclusive focus on alternative splicing. PMID:24267658
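The three-step pipeline described above (gene structures, artificial splicing transcripts, translated peptides) can be illustrated with a toy example. The following sketch is a simplification under stated assumptions rather than the SASD implementation: it only models skipping of a single internal exon, translates in reading frame 0 with the standard genetic code, and uses made-up exon sequences. All function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def exon_skipping_transcripts(exons):
    """Yield (skipped_index, transcript) for every single internal exon skip.

    The first and last exons are always retained (illustrative assumption).
    """
    for skipped in range(1, len(exons) - 1):
        kept = exons[:skipped] + exons[skipped + 1:]
        yield skipped, "".join(kept)


def translate(transcript):
    """Translate frame 0 of a DNA-alphabet transcript until a stop codon."""
    residues = []
    for i in range(0, len(transcript) - 2, 3):
        aa = CODON_TABLE[transcript[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


if __name__ == "__main__":
    toy_exons = ["ATGGCT", "GAAAAA", "TGGTAA"]  # invented sequences
    print("canonical:", translate("".join(toy_exons)))
    for index, transcript in exon_skipping_transcripts(toy_exons):
        print(f"skip exon {index}:", translate(transcript))
```

A real pipeline would also need to handle frame shifts, intron retention, and the extraction of junction-spanning peptides, none of which is attempted here.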
Vizcaíno, Juan Antonio; Foster, Joseph M.; Martens, Lennart
2010-01-01
Although data deposition is not yet common practice in the field of proteomics, several mass spectrometry (MS)-based proteomics repositories are publicly available to the scientific community. The main existing resources are: the Global Proteome Machine Database (GPMDB), PeptideAtlas, the PRoteomics IDEntifications database (PRIDE), Tranche, and NCBI Peptidome. In this review the capabilities of each of these will be described, paying special attention to four key properties: data types stored, applicable data submission strategies, supported formats, and available data mining and visualization tools. Additionally, the data contents from model organisms will be enumerated for each resource. There are other valuable smaller and/or more specialized repositories but they will not be covered in this review. Finally, the concept behind the ProteomeXchange consortium, a collaborative effort among the main resources in the field, will be introduced. PMID:20615486
Mohamed, Mohamed R; Rahman, Masmudur M; Lanchbury, Jerry S; Shattuck, Donna; Neff, Chris; Dufford, Max; van Buuren, Nick; Fagan, Katharine; Barry, Michele; Smith, Scott; Damon, Inger; McFadden, Grant
2009-06-02
Identification of the binary interactions between viral and host proteins has become a valuable tool for investigating viral tropism and pathogenesis. Here, we present the first systematic protein interaction screening of the unique variola virus proteome by using yeast 2-hybrid screening against a variety of human cDNA libraries. Several protein-protein interactions were identified, including an interaction between variola G1R, an ankyrin/F-box-containing protein, and human nuclear factor kappa-B1 (NF-kappaB1)/p105. This represents the first direct interaction between a pathogen-encoded protein and NF-kappaB1/p105. Orthologs of G1R are present in a variety of pathogenic orthopoxviruses, but not in vaccinia virus, and expression of any one of these viral proteins blocks NF-kappaB signaling in human cells. Thus, proteomic screening of variola virus has the potential to uncover modulators of the human innate antiviral responses.
The Role of Central Metabolism in Prostate Cancer Progression
2012-07-01
Award Number: W81XWH-08-1-0694. Title: The Role of Central Metabolism in Prostate Cancer Progression. ...examine the excised prostate tissue for differences in tumor growth, proteomes and intermediates in polyunsaturated fatty acid (PUFA) metabolism. The...
A semantic proteomics dashboard (SemPoD) for data management in translational research.
Jayapandian, Catherine P; Zhao, Meng; Ewing, Rob M; Zhang, Guo-Qiang; Sahoo, Satya S
2012-01-01
One of the primary challenges in translational research data management is breaking down the barriers between the multiple data silos and the integration of 'omics data with clinical information to complete the cycle from the bench to the bedside. The role of contextual metadata, also called provenance information, is a key factor in effective data integration, reproducibility of results, correct attribution of original source, and answering research queries involving "What", "Where", "When", "Which", "Who", "How", and "Why" (also known as the W7 model). However, at present there is limited or no effective approach to managing and leveraging provenance information for integrating data across studies or projects. Hence, there is an urgent need for a paradigm shift in creating a "provenance-aware" informatics platform to address this challenge. We introduce an ontology-driven, intuitive Semantic Proteomics Dashboard (SemPoD) that uses provenance together with domain information (semantic provenance) to enable researchers to query, compare, and correlate different types of data across multiple projects, and allow integration with legacy data to support their ongoing research. The SemPoD platform, currently in use at the Case Center for Proteomics and Bioinformatics (CPB), consists of three components: (a) Ontology-driven Visual Query Composer, (b) Result Explorer, and (c) Query Manager. Currently, SemPoD allows provenance-aware querying of 1153 mass-spectrometry experiments from 20 different projects. SemPoD uses the systems molecular biology provenance ontology (SysPro) to support a dynamic query composition interface, which automatically updates the components of the query interface based on previous user selections and efficiently prunes the result set using a "smart filtering" approach. 
The SysPro ontology re-uses terms from the PROV-ontology (PROV-O) being developed by the World Wide Web Consortium (W3C) provenance working group, the minimum information required for reporting a molecular interaction experiment (MIMIx), and the minimum information about a proteomics experiment (MIAPE) guidelines. SemPoD was evaluated in terms of both user feedback and system scalability. SemPoD is an intuitive and powerful provenance ontology-driven data access and query platform that uses the MIAPE and MIMIx metadata guidelines to create an integrated view over large-scale systems molecular biology datasets. SemPoD leverages the SysPro ontology to create an intuitive dashboard for biologists to compose queries, explore the results, and use a query manager for storing queries for later use. SemPoD can be deployed over many existing database applications storing 'omics data, including, as illustrated here, the LabKey data-management system. The initial user feedback evaluating the usability and functionality of SemPoD has been very positive and it is being considered for wider deployment beyond the proteomics domain, and in other 'omics' centers.
Proteomics data exchange and storage: the need for common standards and public repositories.
Jiménez, Rafael C; Vizcaíno, Juan Antonio
2013-01-01
Both the existence of data standards and public databases or repositories have been key factors behind the development of the existing "omics" approaches. In this book chapter we first review the main existing mass spectrometry (MS)-based proteomics resources: PRIDE, PeptideAtlas, GPMDB, and Tranche. Second, we report on the current status of the different proteomics data standards developed by the Proteomics Standards Initiative (PSI): the formats mzML, mzIdentML, mzQuantML, TraML, and PSI-MI XML are then reviewed. Finally, we present an easy way to query and access MS proteomics data in the PRIDE database, as a representative of the existing repositories, using the workflow management system (WMS) tool Taverna. Two different publicly available workflows are explained and described.
Alonso-López, Diego; Gutiérrez, Miguel A.; Lopes, Katia P.; Prieto, Carlos; Santamaría, Rodrigo; De Las Rivas, Javier
2016-01-01
APID (Agile Protein Interactomes DataServer) is an interactive web server that provides unified generation and delivery of protein interactomes mapped to their respective proteomes. This resource is a new, fully redesigned server that includes a comprehensive collection of protein interactomes for more than 400 organisms (25 of which include more than 500 interactions) produced by the integration of only experimentally validated protein–protein physical interactions. For each protein–protein interaction (PPI) the server includes currently reported information about its experimental validation to allow selection and filtering at different quality levels. As a whole, it provides easy access to the interactomes from specific species and includes a global uniform compendium of 90,379 distinct proteins and 678,441 singular interactions. APID integrates and unifies PPIs from major primary databases of molecular interactions, from other specific repositories and also from experimentally resolved 3D structures of protein complexes where more than two proteins were identified. For this purpose, a collection of 8,388 structures were analyzed to identify specific PPIs. APID also includes a new graph tool (based on Cytoscape.js) for visualization and interactive analyses of PPI networks. The server does not require registration and it is freely available for use at http://apid.dep.usal.es. PMID:27131791
Genic insights from integrated human proteomics in GeneCards.
Fishilevich, Simon; Zimmerman, Shahar; Kohn, Asher; Iny Stein, Tsippi; Olender, Tsviya; Kolker, Eugene; Safran, Marilyn; Lancet, Doron
2016-01-01
GeneCards is a one-stop shop for searchable human gene annotations (http://www.genecards.org/). Data are automatically mined from ∼120 sources and presented in an integrated web card for every human gene. We report the application of recent advances in proteomics to enhance gene annotation and classification in GeneCards. First, we constructed the Human Integrated Protein Expression Database (HIPED), a unified database of protein abundance in human tissues, based on the publically available mass spectrometry (MS)-based proteomics sources ProteomicsDB, Multi-Omics Profiling Expression Database, Protein Abundance Across Organisms and The MaxQuant DataBase. The integrated database, residing within GeneCards, compares favourably with its individual sources, covering nearly 90% of human protein-coding genes. For gene annotation and comparisons, we first defined a protein expression vector for each gene, based on normalized abundances in 69 normal human tissues. This vector is portrayed in the GeneCards expression section as a bar graph, allowing visual inspection and comparison. These data are juxtaposed with transcriptome bar graphs. Using the protein expression vectors, we further defined a pairwise metric that helps assess expression-based pairwise proximity. This new metric for finding functional partners complements eight others, including sharing of pathways, gene ontology (GO) terms and domains, implemented in the GeneCards Suite. In parallel, we calculated proteome-based differential expression, highlighting a subset of tissues that overexpress a gene and subserving gene classification. This textual annotation allows users of VarElect, the suite's next-generation phenotyper, to more effectively discover causative disease variants. Finally, we define the protein-RNA expression ratio and correlation as yet another attribute of every gene in each tissue, adding further annotative information. 
The results constitute a significant enhancement of several GeneCards sections and help promote and organize the genome-wide structural and functional knowledge of the human proteome. Database URL:http://www.genecards.org/. © The Author(s) 2016. Published by Oxford University Press.
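The abstract above does not name the pairwise metric or give the exact formula for the protein-RNA ratio, so the following is only a minimal sketch of the general idea. It assumes Pearson correlation as a stand-in proximity measure between two per-tissue expression vectors, and a plain element-wise ratio for protein versus RNA abundance. The function names and the toy vectors are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


def protein_rna_ratio(protein, rna):
    """Per-tissue protein/RNA ratio; None where the RNA value is zero."""
    if len(protein) != len(rna):
        raise ValueError("vectors must have the same length")
    return [p / r if r != 0 else None for p, r in zip(protein, rna)]


if __name__ == "__main__":
    # Toy vectors over four tissues; real vectors would span 69 tissues.
    gene_a = [0.1, 0.5, 0.9, 0.2]
    gene_b = [0.2, 0.6, 0.8, 0.1]
    print("proximity (Pearson):", round(pearson(gene_a, gene_b), 3))
    print("protein/RNA ratio:", protein_rna_ratio(gene_a, [0.2, 0.5, 0.0, 0.4]))
```

Any other similarity measure (for example cosine similarity or a rank correlation) could be substituted in `pearson` without changing the surrounding bookkeeping.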
Libraries of Peptide Fragmentation Mass Spectra Database
National Institute of Standards and Technology Data Gateway
SRD 1C NIST Libraries of Peptide Fragmentation Mass Spectra Database (Web, free access) The purpose of the library is to provide peptide reference data for laboratories employing mass spectrometry-based proteomics methods for protein analysis. Mass spectral libraries identify these compounds in a more sensitive and robust manner than alternative methods. These databases are freely available for testing and development of new applications.
The Pfam protein families database: towards a more sustainable future.
Finn, Robert D; Coggill, Penelope; Eberhardt, Ruth Y; Eddy, Sean R; Mistry, Jaina; Mitchell, Alex L; Potter, Simon C; Punta, Marco; Qureshi, Matloob; Sangrador-Vegas, Amaia; Salazar, Gustavo A; Tate, John; Bateman, Alex
2016-01-04
In the last two years the Pfam database (http://pfam.xfam.org) has undergone a substantial reorganisation to reduce the effort involved in making a release, thereby permitting more frequent releases. Arguably the most significant of these changes is that Pfam is now primarily based on the UniProtKB reference proteomes, with the counts of matched sequences and species reported on the website restricted to this smaller set. Building families on reference proteomes sequences brings greater stability, which decreases the amount of manual curation required to maintain them. It also reduces the number of sequences displayed on the website, whilst still providing access to many important model organisms. Matches to the full UniProtKB database are, however, still available and Pfam annotations for individual UniProtKB sequences can still be retrieved. Some Pfam entries (1.6%) which have no matches to reference proteomes remain; we are working with UniProt to see if sequences from them can be incorporated into reference proteomes. Pfam-B, the automatically-generated supplement to Pfam, has been removed. The current release (Pfam 29.0) includes 16 295 entries and 559 clans. The facility to view the relationship between families within a clan has been improved by the introduction of a new tool. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Linking the proteins--elucidation of proteome-scale networks using mass spectrometry.
Pflieger, Delphine; Gonnet, Florence; de la Fuente van Bentem, Sergio; Hirt, Heribert; de la Fuente, Alberto
2011-01-01
Proteomes are intricate. Typically, thousands of proteins interact through physical association and post-translational modifications (PTMs) to give rise to the emergent functions of cells. Understanding these functions requires one to study proteomes as "systems" rather than collections of individual protein molecules. The abstraction of the interacting proteome to "protein networks" has recently gained much attention, as networks are effective representations that lose specific molecular details but provide the ability to see the proteome as a whole. Mostly two aspects of the proteome have been represented by network models: proteome-wide physical protein-protein-binding interactions organized into Protein Interaction Networks (PINs), and proteome-wide PTM relations organized into Protein Signaling Networks (PSNs). Mass spectrometry (MS) techniques have been shown to be essential to reveal both of these aspects on a proteome-wide scale. Techniques such as affinity purification followed by MS have been used to elucidate protein-protein interactions, and MS-based quantitative phosphoproteomics is critical to understand the structure and dynamics of signaling through the proteome. We here review the current state-of-the-art MS-based analytical pipelines for characterizing proteome-scale networks. Copyright © 2010 Wiley Periodicals, Inc.
Literature Mining of Pathogenesis-Related Proteins in Human Pathogens for Database Annotation
2009-10-01
...submission and for literature mining result display with automatically tagged abstracts. I. Literature data sets for machine learning algorithm training...mass spectrometry) proteomics data from Burkholderia strains. • Task 1 (M13-15): Preliminary analysis of the Burkholderia proteomic space
ERIC Educational Resources Information Center
Brown, Cecelia
2003-01-01
Discusses the growth in use and acceptance of Web-based genomic and proteomic databases (GPD) in scholarly communication. Confirms the role of GPD in the scientific literature cycle, suggests GPD are a storage and retrieval mechanism for molecular biology information, and recommends that existing models of scientific communication be updated to…
Protein profile of mouse ovarian follicles grown in vitro.
Anastácio, Amandine; Rodriguez-Wallberg, Kenny A; Chardonnet, Solenne; Pionneau, Cédric; Fédérici, Christian; Almeida Santos, Teresa; Poirot, Catherine
2017-12-01
Could the follicle proteome be mapped by identifying specific proteins that are common or differ between three developmental stages from the secondary follicle (SF) to the antrum-like stage? From a total of 1401 proteins identified in the follicles, 609 were common to the three developmental stages investigated and 444 were found uniquely at one of the stages. The importance of the follicle as a functional structure has been recognized; however, to date the proteome of the whole follicle has not been described. A few studies using proteomics have previously reported on either isolated fully-grown oocytes before or after meiosis resumption or cumulus cells. The experimental design included a validated mouse model for isolation and individual culture of SFs. The system was chosen as it allows continuous evaluation of follicle growth and selection of follicles for analysis at pre-determined developmental stages: SF, complete Slavjanski membrane rupture (SMR) and antrum-like cavity (AF). The experiments were repeated 13 times independently to acquire the material that was analyzed by proteomics. SFs (n = 2166) were isolated from B6CBA/F1 female mice (n = 42), 12 days old, from 15 l. About half of the follicles isolated as SF were analyzed as such (n = 1143) and pooled to obtain 139 μg of extracted protein. Both SMR (n = 359) and AF (n = 124) were obtained after individual culture of 1023 follicles in a microdrop system under oil, selected for analysis and pooled, to obtain 339 μg and 170 μg of protein, respectively. The follicle proteome was analyzed combining isoelectric focusing (IEF) fractionation with 1D and 2D LC-MS/MS analysis to enhance protein identification. The three protein lists were submitted to the 'Compare gene list' tool in the PANTHER website to gain insights on the Gene Ontology Biological processes present and to Ingenuity Pathway Analysis to highlight protein networks. 
A label-free quantification was performed with 1D LC-MS/MS analyses to emphasize proteins with different expression profiles between the three follicular stages. Supplementary western blot analysis (using new biological replicates) was performed to confirm the expression variations of three proteins during follicle development in vitro. It was found that 609 out of 1401 identified proteins were common to the three follicle developmental stages investigated. Some proteins were identified uniquely at one stage: 71 of the 775 identified proteins in SF, 181 of 1092 in SMR and 192 of 1100 in AF. Additional qualitative and quantitative analysis highlighted 44 biological processes over-represented in our samples compared to the Mus musculus gene database. In particular, it was possible to identify proteins implicated in the cell cycle, calcium ion binding and glycolysis, with specific expressions and abundance, throughout in vitro follicle development. Data are available via ProteomeXchange with identifier PXD006227. The proteome analyses described in this study were performed after in vitro development. Despite fractionation of the samples before LC-MS/MS, proteomic approaches are not exhaustive, thus proteins that are not identified in a group are not necessarily absent from that group, although they are likely to be less abundant. This study allowed a general view of proteins implicated in follicle development in vitro and it represents the most complete catalog of the whole follicle proteome available so far. Not only were well known proteins of the oocyte identified but also proteins that are probably expressed only in granulosa cells. This study was supported by the Portuguese Foundation for Science and Technology, FCT (PhD fellowship SFRH/BD/65299/2009 to A.A.), the Swedish Childhood Cancer Foundation (PR 2014-0144 to K.A.R-.W.) and Stockholm County Council to K.A.R-.W. The authors of the study have no conflict of interest to report. © The Author 2017. 
Published by Oxford University Press on behalf of the European Society of Human Reproduction and Embryology.
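The counts of shared and stage-specific proteins reported above are the kind of result that set operations on identification lists produce (note that 71 + 181 + 192 = 444, matching the number of proteins found uniquely at one stage). The sketch below shows that bookkeeping on toy data. The protein names and the function name are placeholders, not data or code from the study.

```python
def compare_stages(stages):
    """Return proteins common to all stages and those unique to each stage.

    `stages` maps a stage label to a set of protein identifiers.
    """
    all_sets = list(stages.values())
    common = set.intersection(*all_sets)
    unique = {}
    for label, proteins in stages.items():
        # Union of every other stage's identifications.
        others = set().union(*(s for other, s in stages.items() if other != label))
        unique[label] = proteins - others
    return common, unique


if __name__ == "__main__":
    # Placeholder identifiers standing in for the SF, SMR and AF protein lists.
    toy = {
        "SF": {"P1", "P2", "P3"},
        "SMR": {"P1", "P2", "P4", "P5"},
        "AF": {"P1", "P5", "P6"},
    }
    common, unique = compare_stages(toy)
    print("common to all stages:", sorted(common))
    for label, proteins in unique.items():
        print(f"unique to {label}:", sorted(proteins))
```

As the abstract itself cautions, a protein missing from one list is not necessarily absent from that stage, so set differences of this kind describe detection rather than true presence or absence.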
Xia, Kai; Dong, Dong; Han, Jing-Dong J
2006-01-01
Background Although protein-protein interaction (PPI) networks have been explored by various experimental methods, the maps so built are still limited in coverage and accuracy. To further expand the PPI network and to extract more accurate information from existing maps, studies have been carried out to integrate various types of functional relationship data. A frequently updated database of computationally analyzed potential PPIs to provide biological researchers with rapid and easy access to analyze original data as a biological network is still lacking. Results By applying a probabilistic model, we integrated 27 heterogeneous genomic, proteomic and functional annotation datasets to predict PPI networks in human. In addition to previously studied data types, we show that phenotypic distances and genetic interactions can also be integrated to predict PPIs. We further built an easy-to-use, updatable integrated PPI database, the Integrated Network Database (IntNetDB) online, to provide automatic prediction and visualization of PPI network among genes of interest. The networks can be visualized in SVG (Scalable Vector Graphics) format for zooming in or out. IntNetDB also provides a tool to extract topologically highly connected network neighborhoods from a specific network for further exploration and research. Using the MCODE (Molecular Complex Detections) algorithm, 190 such neighborhoods were detected among all the predicted interactions. The predicted PPIs can also be mapped to worm, fly and mouse interologs. Conclusion IntNetDB includes 180,010 predicted protein-protein interactions among 9,901 human proteins and represents a useful resource for the research community. Our study has increased prediction coverage by five-fold. IntNetDB also provides easy-to-use network visualization and analysis tools that allow biological researchers unfamiliar with computational biology to access and analyze data over the internet. 
The web interface of IntNetDB is freely accessible at . Visualization requires Mozilla version 1.8 (or higher) or Internet Explorer with installation of SVGviewer. PMID:17112386
Detection of alternative splice variants at the proteome level in Aspergillus flavus.
Chang, Kung-Yen; Georgianna, D Ryan; Heber, Steffen; Payne, Gary A; Muddiman, David C
2010-03-05
Identification of proteins from proteolytic peptides or intact proteins plays an essential role in proteomics. Researchers use search engines to match the acquired peptide sequences to the target proteins. However, search engines depend on protein databases to provide candidates for consideration. Alternative splicing (AS), the mechanism by which the exons of pre-mRNAs can be spliced and rearranged to generate distinct mRNA and therefore protein variants, enables higher eukaryotic organisms, with only a limited number of genes, to have the requisite complexity and diversity at the proteome level. Multiple alternative isoforms from one gene often share common segments of sequence. However, many protein databases include only a limited number of isoforms to keep redundancy minimal. As a result, the database search might not identify a target protein even with high quality tandem MS data and accurate intact precursor ion mass. We computationally predicted an exhaustive list of putative isoforms of Aspergillus flavus proteins from 20 371 expressed sequence tags to investigate whether an alternative splicing protein database can assign a greater proportion of mass spectrometry data. The newly constructed AS database provided 9807 new alternatively spliced variants in addition to 12 832 previously annotated proteins. The searches of the existing tandem MS spectra data set using the AS database identified 29 new proteins encoded by 26 genes. Nine fungal genes appeared to have multiple protein isoforms. In addition to the discovery of splice variants, the AS database also showed potential to improve genome annotation. In summary, the introduction of an alternative splicing database helps identify more proteins and unveils more information about a proteome.
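The reason a database limited to canonical isoforms can miss good spectra can be illustrated with a small in-silico digest. The sketch below is not the authors' pipeline: the exon sequences are invented, and the cleavage rule is the commonly used simplification of trypsin (cut after K or R unless followed by P). It shows that an exon-skipping variant produces a junction-spanning peptide that is absent from the canonical protein, so a search engine can only match it if the variant is in the database.

```python
import re

def tryptic_peptides(sequence, min_length=6):
    """Return the set of fully tryptic peptides of a protein sequence.

    Simplified rule (an assumption of this sketch): cleave after K or R
    unless the next residue is P; no missed cleavages are considered.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return {p for p in pieces if len(p) >= min_length}

# Hypothetical exon translations, invented purely for illustration.
exon1 = "MASTEELKGVDLNA"
exon2 = "FTPEQIRWLDSGHK"
exon3 = "YVAEDLSKTPLLER"

canonical = exon1 + exon2 + exon3   # all three exons retained
variant = exon1 + exon3             # exon 2 skipped

# Peptides that only a database containing the variant could explain.
variant_only = tryptic_peptides(variant) - tryptic_peptides(canonical)
print(sorted(variant_only))
```

Only the peptide that spans the new exon 1/exon 3 junction distinguishes the variant; every other peptide is shared with the canonical isoform, which is why such variants are easy to miss.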
Yang, Zhi-Hong; Gordon, Scott M; Sviridov, Denis; Wang, Shuibang; Danner, Robert L; Pryor, Milton; Vaisman, Boris; Shichijo, Yuka; Doisaki, Nobushige; Remaley, Alan T
2017-07-01
Concentrated fish oils, containing a mixture of long-chain monounsaturated fatty acids (LCMUFA) with aliphatic chains longer than 18 C atoms (i.e., C20:1 and C22:1), have been shown to attenuate atherosclerosis development in mouse models. It is not clear, however, how individual LCMUFA isomers may act on atherosclerosis. In the present study, we used saury fish oil-derived concentrates enriched in either C20:1 or C22:1 isomer fractions to investigate their individual effect on atherosclerosis and lipoprotein metabolism. LDLR-deficient (LDLr -/- ) mice were fed a Western diet supplemented with 5% (w/w) of either C20:1 or C22:1 concentrate for 12 wk. Compared to the control Western diet with no supplement, both LCMUFA isomers increased hepatic levels of LCMUFA by 2∼3-fold (p < 0.05), and decreased atherosclerotic lesion areas by more than 40% (p < 0.05), although there were no major differences in plasma lipoproteins or hepatic lipid content. Both LCMUFA isomers significantly decreased plasma CRP levels, improved Abca1-dependent cholesterol efflux capacity of apoB-depleted plasma, and enhanced Ppar transcriptional activities in HepG2 cells. LC-MS/MS proteomic analysis of lipoproteins (HDL, LDL and VLDL) revealed that both LCMUFA isomer diets resulted in similar potentially beneficial alterations in proteins involved in complement activation, blood coagulation, and lipid metabolism. Several lipoprotein proteome changes were significantly correlated with atherosclerotic plaque reduction. Dietary supplementation with the LCMUFA isomers C20:1 or C22:1 was equally effective in reducing atherosclerosis in LDLr -/- mice and this may partly occur through activation of the Ppar signaling pathways and favorable alterations in the proteome of lipoproteins. Published by Elsevier B.V.
Estimation of the proteomic cancer co-expression sub networks by using association estimators.
Erdoğan, Cihat; Kurt, Zeyneb; Diri, Banu
2017-01-01
In this study, the association estimators, which strongly influence gene network inference methods and are used to determine molecular interactions, were examined within the co-expression network inference concept. Using proteomic data from five different cancer types, the hub genes/proteins within the disease-associated gene-gene/protein-protein interaction sub networks were identified. Proteomic data from various cancer types were collected from The Cancer Proteome Atlas (TCPA). Nine correlation- and mutual information (MI)-based association estimators that are commonly used in the literature were compared in this study. As the gold standard to measure the association estimators’ performance, a multi-layer data integration platform on gene-disease associations (DisGeNET) and the Molecular Signatures Database (MSigDB) were used. Fisher's exact test was used to evaluate the performance of the association estimators by comparing the created co-expression networks with the disease-associated pathways. It was observed that the MI-based estimators provided more successful results than the Pearson and Spearman correlation approaches, which are used in the estimation of biological networks in the weighted correlation network analysis (WGCNA) package. In correlation-based methods, the best average success rate for five cancer types was 60%, while in MI-based methods the average success rate was 71% for the James-Stein Shrinkage (Shrink) estimator and 64% for the Schurmann-Grassberger (SG) estimator. Moreover, the hub genes and the inferred sub networks are presented for the consideration of researchers and experimentalists. PMID:29145449
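The contrast between correlation-based and MI-based estimators, and the use of Fisher's exact test for evaluation, can be made concrete with a minimal sketch. It rests on several assumptions: the MI estimator here is the plain empirical (plug-in) estimator on equal-width bins, not the Shrink or SG estimators named in the abstract; the expression values are invented; and the enrichment test is a one-sided hypergeometric tail, which is one common way to apply Fisher's exact test to gene-set overlap.

```python
import math
from collections import Counter

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def discretize(values, bins):
    """Assign each value to one of `bins` equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(x, y, bins=3):
    """Empirical (plug-in) mutual information in nats."""
    bx, by = discretize(x, bins), discretize(y, bins)
    n = len(x)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum((c / n) * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())

def fisher_enrichment_p(overlap, n_hubs, n_disease, universe):
    """One-sided Fisher's exact p-value: P(overlap or more by chance)."""
    total = math.comb(universe, n_hubs)
    tail = sum(math.comb(n_disease, k) * math.comb(universe - n_disease, n_hubs - k)
               for k in range(overlap, min(n_hubs, n_disease) + 1))
    return tail / total

# Invented abundances with a non-linear (quadratic) dependence.
protein_a = [-3, -2, -1, 0, 1, 2, 3]
protein_b = [a * a for a in protein_a]

r = pearson(protein_a, protein_b)
mi = mutual_information(protein_a, protein_b)
p = fisher_enrichment_p(overlap=8, n_hubs=10, n_disease=20, universe=100)
print(r, mi, p)
```

In this toy case the Pearson coefficient is zero even though one profile is fully determined by the other, while the MI estimate is clearly positive; this is the kind of non-linear association that MI-based estimators can recover.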
NASA Astrophysics Data System (ADS)
Diaz, K. S.; Kim, E. H.; Jones, R. M.; de Leon, K. C.; Woodcroft, B. J.; Tyson, G. W.; Rich, V. I.
2014-12-01
The growing field of metaproteomics links microbial communities to their expressed functions by using mass spectrometry methods to characterize community proteins. Comparison of mass spectrometry protein search algorithms and their biases is crucial for maximizing the quality and amount of protein identifications in mass spectral data. Available algorithms employ different approaches when mapping mass spectra to peptides against a database. We compared mass spectra from four microbial proteomes derived from high-organic content soils searched with two search algorithms: 1) Sequest HT as packaged within Proteome Discoverer (v.1.4) and 2) X!Tandem as packaged in TransProteomicPipeline (v.4.7.1). Searches used matched metagenomes, and results were filtered to allow identification of high probability proteins. There was little overlap in proteins identified by both algorithms, on average just ~24% of the total. However, when adjusted for spectral abundance, the overlap improved to ~70%. Proteome Discoverer generally outperformed X!Tandem, identifying an average of 12.5% more proteins than X!Tandem, with X!Tandem identifying more proteins only in the first two proteomes. For spectrally-adjusted results, the algorithms were similar, with X!Tandem marginally outperforming Proteome Discoverer by an average of ~4%. We then assessed differences in heat shock protein (HSP) identification by the two algorithms by BLASTing identified proteins against the Heat Shock Protein Information Resource, because HSP hits typically account for the majority of the signal in proteomes, due to extraction protocols. Total HSP identifications for each of the 4 proteomes were ~15%, ~11%, ~17%, and ~19%, with ~14% for total HSPs with redundancies removed. Of the ~15% average of proteins from the 4 proteomes identified as HSPs, ~10% of proteins and spectra were identified by both algorithms. On average, Proteome Discoverer identified ~9% more HSPs than X!Tandem.
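The gap between the protein-level overlap and the spectrally adjusted overlap can be illustrated with a small sketch. The abstract does not define either measure, so the formulas below are assumptions: protein overlap is taken as shared identifications divided by the union, and the adjusted overlap as the share of all assigned spectra that belong to proteins found by both engines. The protein IDs and spectral counts are invented.

```python
def identification_overlap(hits_a, hits_b):
    """Compare two search-engine results.

    hits_a, hits_b: dict mapping protein ID -> number of spectra that the
    engine assigned to that protein.
    Returns (protein-level overlap, spectrum-weighted overlap).
    """
    shared = hits_a.keys() & hits_b.keys()
    union = hits_a.keys() | hits_b.keys()
    protein_overlap = len(shared) / len(union)

    total_spectra = sum(hits_a.values()) + sum(hits_b.values())
    shared_spectra = sum(hits_a[p] + hits_b[p] for p in shared)
    return protein_overlap, shared_spectra / total_spectra

# Invented example: both engines agree on the abundant proteins and
# disagree only on proteins supported by very few spectra.
engine_a = {"P1": 50, "P2": 40, "P3": 2, "P4": 1}
engine_b = {"P1": 45, "P2": 38, "P5": 3, "P6": 1, "P7": 1}

by_protein, by_spectra = identification_overlap(engine_a, engine_b)
print(f"protein overlap: {by_protein:.2f}, spectral overlap: {by_spectra:.2f}")
```

Under these assumptions, disagreement concentrated in low-abundance proteins yields a low protein-level overlap alongside a high spectrum-weighted overlap, the same qualitative pattern the abstract reports.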
Whetton, Anthony D; Azmi, Norhaida Che; Pearson, Stella; Jaworska, Ewa; Zhang, Liqun; Blance, Rognvald; Kendall, Alexandra C; Nicolaou, Anna; Taylor, Samuel; Williamson, Andrew J K; Pierce, Andrew
2016-03-08
The thrombopoietin receptor (MPL) has been shown to be mutated (MPL W515L) in myelofibrosis and thrombocytosis yet new approaches to treat this disorder are still required. We have previously shown that transcriptome and proteomic effects do not correlate well in oncogene-mediated leukemogenesis. We therefore investigated the effects of MPL W515L using proteomics. The consequences of MPL W515L expression on over 3300 nuclear and 3500 cytoplasmic proteins were assessed using relative quantification mass spectrometry. We demonstrate that MPL W515L expression markedly modulates the CXCL12/CXCR4/CD45 pathway associated with stem and progenitor cell chemotactic movement. We also demonstrated that MPL W515L expressing cells displayed increased chemokinesis which required the MPL W515L-mediated dysregulation of MYC expression via phosphorylation of the RNA transport protein THOC5 on tyrosine 225. In addition MPL W515L expression induced TGFβ secretion which is linked to sphingosine 1-phosphate production and the increased chemokinesis. These studies identify several pathways which offer potential targets for therapeutic intervention in the treatment of MPL W515L-driven malignancy. We validate our approach by showing that CD34+ cells from MPL W515L positive patients display increased chemokinesis and that treatment with a combination of MYC and sphingosine kinase inhibitors leads to the preferential killing of MPL W515L expressing cells. PMID:26919114
2013-01-01
Background Contemporary coral reef research has firmly established that a genomic approach is urgently needed to better understand the effects of anthropogenic environmental stress and global climate change on coral holobiont interactions. Here we present KEGG orthology-based annotation of the complete genome sequence of the scleractinian coral Acropora digitifera and provide the first comprehensive view of the genome of a reef-building coral by applying advanced bioinformatics. Description Sequences from the KEGG database of protein function were used to construct hidden Markov models. These models were used to search the predicted proteome of A. digitifera to establish complete genomic annotation. The annotated dataset is published in ZoophyteBase, an open access format with different options for searching the data. A particularly useful feature is the ability to use a Google-like search engine that links query words to protein attributes. We present features of the annotation that underpin the molecular structure of key processes of coral physiology that include (1) regulatory proteins of symbiosis, (2) planula and early developmental proteins, (3) neural messengers, receptors and sensory proteins, (4) calcification and Ca2+-signalling proteins, (5) plant-derived proteins, (6) proteins of nitrogen metabolism, (7) DNA repair proteins, (8) stress response proteins, (9) antioxidant and redox-protective proteins, (10) proteins of cellular apoptosis, (11) microbial symbioses and pathogenicity proteins, (12) proteins of viral pathogenicity, (13) toxins and venom, (14) proteins of the chemical defensome and (15) coral epigenetics. 
Conclusions We advocate that providing annotation in an open-access searchable database available to the public domain will give an unprecedented foundation to interrogate the fundamental molecular structure and interactions of coral symbiosis and allow critical questions to be addressed at the genomic level based on combined aspects of evolutionary, developmental, metabolic, and environmental perspectives. PMID:23889801
Gilany, Kambiz; Minai-Tehrani, Arash; Savadi-Shiraz, Elham; Rezadoost, Hassan; Lakpour, Niknam
2015-01-01
Human seminal fluid is a complex body fluid. It is not known how many proteins are expressed in the seminal plasma; however, by analogy with blood, it is possible that up to 10,000 proteins are expressed in the seminal plasma. Human seminal fluid is a rich source of potential biomarkers for male infertility and reproductive disorders. In this review, the ongoing list of proteins identified from human seminal fluid was collected. To date, 4188 redundant proteins of the seminal fluid have been identified using different proteomics technologies, including 2-DE, SDS-PAGE-LC-MS/MS and MudPIT. However, this was reduced to a database of 2168 non-redundant proteins using the UniProtKB/Swiss-Prot reviewed database. The core characteristics of the proteome were analyzed, including the pI, MW, amino acid, chromosome and PTM distributions in the human seminal plasma proteome. Additionally, the biological process, molecular function and KEGG pathway were investigated using DAVID software. Finally, the biomarkers identified so far in different male reproductive system disorders using proteomics platforms were investigated. In this study, an attempt was made to update the human seminal plasma proteome database. Our findings showed that human seminal plasma studies used to date seem to have converged on a set of proteins that are repeatedly identified in many studies and that represent only a small fraction of the entire human seminal plasma proteome.
Welker, F
2018-02-20
The study of ancient protein sequences is increasingly focused on the analysis of older samples, including those of ancient hominins. The analysis of such ancient proteomes thereby potentially suffers from "cross-species proteomic effects": the loss of peptide and protein identifications at increased evolutionary distances due to a larger number of protein sequence differences between the database sequence and the analyzed organism. Error-tolerant proteomic search algorithms should theoretically overcome this problem at both the peptide and protein level; however, this has not been demonstrated. If error-tolerant searches do not overcome the cross-species proteomic issue then there might be inherent biases in the identified proteomes. Here, a bioinformatics experiment is performed to test this using a set of modern human bone proteomes and three independent searches against sequence databases at increasing evolutionary distances: the human (0 Ma), chimpanzee (6-8 Ma) and orangutan (16-17 Ma) reference proteomes, respectively. Incorrectly suggested amino acid substitutions are absent when employing adequate filtering criteria for mutable Peptide Spectrum Matches (PSMs), but roughly half of the mutable PSMs were not recovered. As a result, peptide and protein identification rates are higher in error-tolerant mode compared to non-error-tolerant searches but did not recover protein identifications completely. Data indicates that peptide length and the number of mutations between the target and database sequences are the main factors influencing mutable PSM identification. The error-tolerant results suggest that the cross-species proteomics problem is not overcome at increasing evolutionary distances, even at the protein level. Peptide and protein loss has the potential to significantly impact divergence dating and proteome comparisons when using ancient samples as there is a bias towards the identification of conserved sequences and proteins. 
Effects are minimized between moderately divergent proteomes, as indicated by almost complete recovery of informative positions in the search against the chimpanzee proteome (≈90%, 6-8 Ma). This provides a bioinformatic background to future phylogenetic and proteomic analysis of ancient hominin proteomes, including the future description of novel hominin amino acid sequences, but also has negative implications for the study of fast-evolving proteins in hominins, non-hominin animals, and ancient bacterial proteins in evolutionary contexts.
Proteomic platform for the identification of proteins in olive (Olea europaea) pulp.
Capriotti, Anna Laura; Cavaliere, Chiara; Foglia, Patrizia; Piovesana, Susy; Samperi, Roberto; Stampachiacchiere, Serena; Laganà, Aldo
2013-10-24
The nutritional and cancer-protective properties of the oil extracted mechanically from the ripe fruits of Olea europaea trees are attracting constantly more attention worldwide. The preparation of high-quality protein samples from plant tissues for proteomic analysis poses many challenging problems. In this study we employed a proteomic platform based on two different extraction methods, SDS and CHAPS based protocols, followed by two precipitation protocols, TCA/acetone and MeOH precipitation, in order to increase the final number of identified proteins. The use of advanced MS techniques in combination with the Swissprot and NCBI Viridiplantae databases and TAIR10 Arabidopsis database allowed us to identify 1265 proteins, of which 22 belong to O. europaea. The application of this proteomic platform for protein extraction and identification will be useful also for other proteomic studies on recalcitrant plant/fruit tissues. Copyright © 2013. Published by Elsevier B.V.
Rescuing discarded spectra: Full comprehensive analysis of a minimal proteome.
Lluch-Senar, Maria; Mancuso, Francesco M; Climente-González, Héctor; Peña-Paz, Marcia I; Sabido, Eduard; Serrano, Luis
2016-02-01
A common problem encountered when performing large-scale MS proteome analysis is the loss of information due to the high percentage of unassigned spectra. To determine the causes behind this loss we have analyzed the proteome of one of the smallest living bacteria that can be grown axenically, Mycoplasma pneumoniae (729 ORFs). The proteome of M. pneumoniae cells, grown in defined media, was analyzed by MS. An initial search with both Mascot and a species-specific NCBInr database with common contaminants (NCBImpn), resulted in around 79% of the acquired spectra not having an assignment. The percentage of non-assigned spectra was reduced to 27% after re-analysis of the data with the PEAKS software, thereby increasing the proteome coverage of M. pneumoniae from the initial 60% to over 76%. Nonetheless, 33,413 spectra with assigned amino acid sequences could not be mapped to any NCBInr database protein sequence. Approximately, 1% of these unassigned peptides corresponded to PTMs and 4% to M. pneumoniae protein variants (deamidation and translation inaccuracies). The most abundant peptide sequence variants (Phe-Tyr and Ala-Ser) could be explained by alterations in the editing capacity of the corresponding tRNA synthases. About another 1% of the peptides not associated to any protein had repetitions of the same aromatic/hydrophobic amino acid at the N-terminus, or had Arg/Lys at the C-terminus. Thus, in a model system, we have maximized the number of assigned spectra to 73% (51,453 out of the 70,040 initial acquired spectra). All MS data have been deposited in the ProteomeXchange with identifier PXD002779 (http://proteomecentral.proteomexchange.org/dataset/PXD002779). © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Zhang, Yaoyang; Xu, Tao; Shan, Bing; Hart, Jonathan; Aslanian, Aaron; Han, Xuemei; Zong, Nobel; Li, Haomin; Choi, Howard; Wang, Dong; Acharya, Lipi; Du, Lisa; Vogt, Peter K; Ping, Peipei; Yates, John R
2015-11-03
Shotgun proteomics generates valuable information from large-scale and target protein characterizations, including protein expression, protein quantification, protein post-translational modifications (PTMs), protein localization, and protein-protein interactions. Typically, peptides derived from proteolytic digestion, rather than intact proteins, are analyzed by mass spectrometers because peptides are more readily separated, ionized and fragmented. The amino acid sequences of peptides can be interpreted by matching the observed tandem mass spectra to theoretical spectra derived from a protein sequence database. Identified peptides serve as surrogates for their proteins and are often used to establish what proteins were present in the original mixture and to quantify protein abundance. Two major issues exist for assigning peptides to their originating protein. The first issue is maintaining a desired false discovery rate (FDR) when comparing or combining multiple large datasets generated by shotgun analysis and the second issue is properly assigning peptides to proteins when homologous proteins are present in the database. Herein we demonstrate a new computational tool, ProteinInferencer, which can be used for protein inference with both small- or large-scale data sets to produce a well-controlled protein FDR. In addition, ProteinInferencer introduces confidence scoring for individual proteins, which makes protein identifications evaluable. This article is part of a Special Issue entitled: Computational Proteomics. Copyright © 2015. Published by Elsevier B.V.
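The idea of holding a protein-level FDR at a desired value can be sketched with the generic target-decoy approach. This is not ProteinInferencer's algorithm, which the abstract does not describe; it is a minimal illustration that assumes each protein has a single confidence score, that decoy proteins come from a reversed or shuffled database, and that FDR at a cutoff is estimated as decoys divided by targets above that cutoff. All scores are invented.

```python
def score_threshold_for_fdr(scored_proteins, max_fdr=0.01):
    """Find the most permissive score cutoff that keeps estimated FDR low.

    scored_proteins: iterable of (score, is_decoy) pairs, higher score = better.
    Returns the lowest score at which decoys / targets <= max_fdr when all
    proteins scoring at least that high are accepted, or None if no cutoff
    qualifies.
    """
    ranked = sorted(scored_proteins, key=lambda item: item[0], reverse=True)
    targets = decoys = 0
    best_cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among everything accepted down to this score.
        if targets and decoys / targets <= max_fdr:
            best_cutoff = score
    return best_cutoff

# Invented protein scores; True marks a decoy hit.
proteins = [
    (9.1, False), (8.7, False), (8.0, False), (7.5, False), (7.1, False),
    (6.9, True), (6.5, False), (6.0, False), (5.2, True), (5.0, True),
    (4.8, False),
]

cutoff = score_threshold_for_fdr(proteins, max_fdr=0.15)
print(cutoff)
```

When several datasets are combined, decoy hits accumulate faster than true targets, so a cutoff chosen per dataset tends to be too lenient for the merged list; recomputing the threshold on the combined ranking is one simple way to see why the first issue named in the abstract arises.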
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jacobs, Jon M.; Diamond, Deborah L.; Chan, Eric Y.
2005-06-01
The development of a reproducible model system for the study of Hepatitis C virus (HCV) infection has the potential to significantly enhance the study of virus-host interactions and provide future direction for modeling the pathogenesis of HCV. While there are studies describing global gene expression changes associated with HCV infection, changes in the proteome have not been characterized. We report the first large scale proteome analysis of the highly permissive Huh-7.5 cell line containing a full length HCV replicon. We detected > 4,400 proteins in this cell line, including HCV replicon proteins, using multidimensional liquid chromatographic (LC) separations coupled to mass spectrometry (MS). The set of Huh-7.5 proteins confidently identified is, to our knowledge, the most comprehensive yet reported for a human cell line. Consistent with the literature, a comparison of Huh-7.5 cells (+) and (-) the HCV replicon identified expression changes of proteins involved in lipid metabolism. We extended these analyses to liver biopsy material from HCV-infected patients where > 1,500 proteins were detected from 2 µg protein lysate using the Huh-7.5 protein database and the accurate mass and time (AMT) tag strategy. These findings demonstrate the utility of multidimensional proteome analysis of the HCV replicon model system for assisting the determination of proteins/pathways affected by HCV infection. Our ability to extend these analyses to the highly complex proteome of small liver biopsies with limiting protein yields offers the unique opportunity to begin evaluating the clinical significance of protein expression changes associated with HCV infection.
PIQMIe: a web server for semi-quantitative proteomics data management and analysis
Kuzniar, Arnold; Kanaar, Roland
2014-01-01
We present the Proteomics Identifications and Quantitations Data Management and Integration Service or PIQMIe that aids in reliable and scalable data management, analysis and visualization of semi-quantitative mass spectrometry based proteomics experiments. PIQMIe readily integrates peptide and (non-redundant) protein identifications and quantitations from multiple experiments with additional biological information on the protein entries, and makes the linked data available in the form of a light-weight relational database, which enables dedicated data analyses (e.g. in R) and user-driven queries. Using the web interface, users are presented with a concise summary of their proteomics experiments in numerical and graphical forms, as well as with a searchable protein grid and interactive visualization tools to aid in the rapid assessment of the experiments and in the identification of proteins of interest. The web server not only provides data access through a web interface but also supports programmatic access through RESTful web service. The web server is available at http://piqmie.semiqprot-emc.cloudlet.sara.nl or http://www.bioinformatics.nl/piqmie. This website is free and open to all users and there is no login requirement. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research. PMID:24861615
Informed-Proteomics: open-source software package for top-down proteomics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Park, Jungkap; Piehowski, Paul D.; Wilkins, Christopher
Top-down proteomics involves the analysis of intact proteins. This approach is very attractive as it allows for analyzing proteins in their endogenous form without proteolysis, preserving valuable information about post-translation modifications, isoforms, proteolytic processing or their combinations collectively called proteoforms. Moreover, the quality of the top-down LC-MS/MS datasets is rapidly increasing due to advances in the liquid chromatography and mass spectrometry instrumentation and sample processing protocols. However, the top-down mass spectra are substantially more complex compared to the more conventional bottom-up data. To take full advantage of the increasing quality of the top-down LC-MS/MS datasets there is an urgent need to develop algorithms and software tools for confident proteoform identification and quantification. In this study we present a new open source software suite for top-down proteomics analysis consisting of an LC-MS feature finding algorithm, a database search algorithm, and an interactive results viewer. The presented tool along with several other popular tools were evaluated using human-in-mouse xenograft luminal and basal breast tumor samples that are known to have significant differences in protein abundance based on bottom-up analysis.
The PROTICdb database for 2-DE proteomics.
Langella, Olivier; Zivy, Michel; Joets, Johann
2007-01-01
PROTICdb is a web-based database mainly designed to store and analyze plant proteome data obtained by 2D polyacrylamide gel electrophoresis (2D PAGE) and mass spectrometry (MS). The goals of PROTICdb are (1) to store, track, and query information related to proteomic experiments, i.e., from tissue sampling to protein identification and quantitative measurements; and (2) to integrate information from the user's own expertise and other sources into a knowledge base, used to support data interpretation (e.g., for the determination of allelic variants or products of posttranslational modifications). Data insertion into the relational database of PROTICdb is achieved either by uploading outputs from Mélanie, PDQuest, IM2d, ImageMaster™ 2D Platinum v5.0, Progenesis, Sequest, MS-Fit, and Mascot software, or by filling in web forms (experimental design and methods). 2D PAGE-annotated maps can be displayed, queried, and compared through the GelBrowser. Quantitative data can be easily exported in a tabulated format for statistical analyses with any third-party software. PROTICdb is based on the Oracle or the PostgreSQL database management system (DBMS) and is freely available upon request at http://cms.moulon.inra.fr/content/view/14/44/.
Proteome reference map and regulation network of neonatal rat cardiomyocyte
Li, Zi-jian; Liu, Ning; Han, Qi-de; Zhang, You-yi
2011-01-01
Aim: To study and establish a proteome reference map and regulation network of neonatal rat cardiomyocyte. Methods: Cultured cardiomyocytes of neonatal rats were used. All proteins expressed in the cardiomyocytes were separated and identified by two-dimensional polyacrylamide gel electrophoresis (2-DE) and matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF MS). Biological networks and pathways of the neonatal rat cardiomyocytes were analyzed using the Ingenuity Pathway Analysis (IPA) program (www.ingenuity.com). A 2-DE database was made accessible on-line using the Make2ddb package on a web server. Results: More than 1000 proteins were separated on 2D gels, and 148 proteins were identified. The identified proteins were used for the construction of an extensible markup language-based database. Biological networks and pathways were constructed to analyze the functions associated with cardiomyocyte proteins in the database. The 2-DE database of rat cardiomyocyte proteins can be accessed at http://2d.bjmu.edu.cn. Conclusion: A proteome reference map and regulation network of the neonatal rat cardiomyocytes have been established, which may serve as an international platform for storage, analysis and visualization of cardiomyocyte proteomic data. PMID:21841810
2010-01-01
Background Papaver somniferum (opium poppy) is the source for several pharmaceutical benzylisoquinoline alkaloids including morphine, codeine and sanguinarine. In response to treatment with a fungal elicitor, the biosynthesis and accumulation of sanguinarine is induced along with other plant defense responses in opium poppy cell cultures. The transcriptional induction of alkaloid metabolism in cultured cells provides an opportunity to identify components of this process via the integration of deep transcriptome and proteome databases generated using next-generation technologies. Results A cDNA library was prepared for opium poppy cell cultures treated with a fungal elicitor for 10 h. Using 454 GS-FLX Titanium pyrosequencing, 427,369 expressed sequence tags (ESTs) with an average length of 462 bp were generated. Assembly of these sequences yielded 93,723 unigenes, of which 23,753 were assigned Gene Ontology annotations. Transcripts encoding all known sanguinarine biosynthetic enzymes were identified in the EST database, 5 of which were represented among the 50 most abundant transcripts. Liquid chromatography-tandem mass spectrometry (LC-MS/MS) of total protein extracts from cell cultures treated with a fungal elicitor for 50 h facilitated the identification of 1,004 proteins. Proteins were fractionated by one-dimensional SDS-PAGE and digested with trypsin prior to LC-MS/MS analysis. Query of an opium poppy-specific EST database substantially enhanced peptide identification. Eight out of 10 known sanguinarine biosynthetic enzymes and many relevant primary metabolic enzymes were represented in the peptide database. Conclusions The integration of deep transcriptome and proteome analyses provides an effective platform to catalogue the components of secondary metabolism, and to identify genes encoding uncharacterized enzymes.
The establishment of corresponding transcript and protein databases generated by next-generation technologies in a system with a well-defined metabolite profile facilitates an improved linkage between genes, enzymes, and pathway components. The proteome database represents the most relevant alkaloid-producing enzymes, compared with the much deeper and more complete transcriptome library. The transcript database contained full-length mRNAs encoding most alkaloid biosynthetic enzymes, which is a key requirement for the functional characterization of novel gene candidates. PMID:21083930
An emerging cyberinfrastructure for biodefense pathogen and pathogen-host data.
Zhang, C; Crasta, O; Cammer, S; Will, R; Kenyon, R; Sullivan, D; Yu, Q; Sun, W; Jha, R; Liu, D; Xue, T; Zhang, Y; Moore, M; McGarvey, P; Huang, H; Chen, Y; Zhang, J; Mazumder, R; Wu, C; Sobral, B
2008-01-01
The NIAID-funded Biodefense Proteomics Resource Center (RC) provides storage, dissemination, visualization and analysis capabilities for the experimental data deposited by seven Proteomics Research Centers (PRCs). The data and their publication are intended to support researchers working to discover candidates for the next generation of vaccines, therapeutics and diagnostics against NIAID's Category A, B and C priority pathogens. The data include transcriptional profiles, protein profiles, protein structural data and host-pathogen protein interactions, in the context of the pathogen life cycle in vivo and in vitro. The database has stored and supported host or pathogen data derived from Bacillus, Brucella, Cryptosporidium, Salmonella, SARS, Toxoplasma, Vibrio and Yersinia, human tissue libraries, and mouse macrophages. These publicly available data cover diverse data types such as mass spectrometry, yeast two-hybrid (Y2H), gene expression profiles, X-ray and NMR determined protein structures and protein expression clones. The growing database covers over 23 000 unique genes/proteins from different experiments and organisms. All of the genes/proteins are annotated and integrated across experiments using UniProt Knowledgebase (UniProtKB) accession numbers. The web-interface for the database enables searching, querying and downloading at the level of experiment, group and individual gene(s)/protein(s) via UniProtKB accession numbers or protein function keywords. The system is accessible at http://www.proteomicsresource.org/.
The Mouse Heart Attack Research Tool (mHART) 1.0 Database.
DeLeon-Pennell, Kristine Y; Iyer, Rugmani Padmanabhan; Ma, Yonggang; Yabluchanskiy, Andriy; Zamilpa, Rogelio; Chiao, Ying Ann; Cannon, Presley; Cates, Courtney; Flynn, Elizabeth R; Halade, Ganesh V; de Castro Bras, Lisandra E; Lindsey, Merry L
2018-05-18
The generation of Big Data has enabled systems-level dissections into the mechanisms of cardiovascular pathology. Integration of genetic, proteomic, and pathophysiological variables across platforms and laboratories fosters discoveries through multidisciplinary investigations and minimizes unnecessary redundancy in research efforts. The Mouse Heart Attack Research Tool (mHART) consolidates a large dataset of over 10 years of experiments from a single laboratory for cardiovascular investigators to generate novel hypotheses and identify new predictive markers of progressive left ventricular remodeling following myocardial infarction (MI) in mice. We designed the mHART REDCap database using our own data to integrate cardiovascular community participation. We generated physiological, biochemical, cellular, and proteomic outputs from plasma and left ventricles obtained from post-MI and no MI (naïve) control groups. We included both male and female mice ranging in age from 3 to 36 months old. After variable collection, data underwent quality assessment for data curation (e.g. eliminate technical errors, check for completeness, remove duplicates, and define terms). Currently, mHART 1.0 contains >888,000 data points and includes results from >2,100 unique mice. Database performance was tested and an example provided to illustrate database utility. This report explains how the first version of the mHART database was established and provides researchers with a standard framework to aid in the integration of their data into our database or in the development of a similar database.
The secrets of Oriental panacea: Panax ginseng.
Colzani, Mara; Altomare, Alessandra; Caliendo, Matteo; Aldini, Giancarlo; Righetti, Pier Giorgio; Fasoli, Elisa
2016-01-01
The Panax ginseng root proteome has been investigated via capture with combinatorial peptide ligand libraries (CPLL) at three different pH values. Proteomic characterization by SDS-PAGE and nLC–MS/MS analysis, via LTQ-Orbitrap XL, led to the identification of a total of 207 expressed proteins. This quite large number of identifications was achieved by consulting two different plant databases: P. ginseng and Arabidopsis thaliana. The major groups of identified proteins were associated with structural species (19.2%), oxidoreductase (19.5%), dehydrogenases (7.6%) and synthases (9.0%). For the first time, an exploration of protein–protein interactions was performed by merging all recognized proteins and building an interactomic map, characterized by 196 nodes and 1554 interactions. Finally, a peptidomic analysis was developed combining different in-silico enzymatic digestions to simulate the human gastrointestinal process: from 661 generated peptides, 95 were identified as possible bioactives and in particular 6 of them were characterized by antimicrobial activity. The present report offers new insight for future investigations focused on elucidation of biological properties of the P. ginseng proteome and peptidome. Ginseng is a traditional oriental herbal remedy that is widely used throughout the world for its numerous pharmacological effects. However, the exact mechanism of action of ginseng components, both ginsenosides and proteins, is still unidentified. So the common use of ginseng requires strict investigations to assess both its efficiency and its safety. Although many reports have been published regarding the pharmacological effects of ginseng, little is known about the biochemical pathways of the root. Proteomics analysis could be useful to elucidate the physiological pathways.
In this manuscript, an integrated approach to proteomics and peptidomics will usher in exploration of Panax ginseng proteins and proteolytic peptides, obtained by in-silico gastrointestinal digestion, characterized by antimicrobial action. The present research would pave the way for better knowledge of metabolic functions connected with ginseng proteome and provide with new information necessary to understand better antimicrobial activity of P. ginseng.
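An in-silico enzymatic digestion of the kind mentioned above can be sketched in a few lines. The example below models trypsin only, using the commonly cited rule of cleavage after lysine (K) or arginine (R) unless the next residue is proline (P). The abstract does not state which enzymes, rules or software the authors combined, so the function name, the missed-cleavage parameter and the test sequence are illustrative assumptions.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return peptides from a simplified in-silico tryptic digestion.

    Cleavage rule assumed here: cut after K or R, except before P.
    """
    if not sequence:
        return []
    # Positions where the chain is cut (index of the residue after the cut).
    sites = [
        i + 1
        for i, residue in enumerate(sequence[:-1])
        if residue in "KR" and sequence[i + 1] != "P"
    ]
    bounds = [0] + sites + [len(sequence)]
    peptides = []
    for start in range(len(bounds) - 1):
        # Allow up to `missed_cleavages` skipped cut sites per peptide.
        last = min(start + 2 + missed_cleavages, len(bounds))
        for end in range(start + 1, last):
            peptides.append(sequence[bounds[start]:bounds[end]])
    return peptides


if __name__ == "__main__":
    print(tryptic_digest("MKRPAKGR"))
    print(tryptic_digest("MKRPAKGR", missed_cleavages=1))
```

Simulating gastrointestinal digestion would require additional enzymes (each with its own cleavage rule); the same boundary-based approach extends to them by changing how the cut sites are computed.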
Ferro, Myriam; Brugière, Sabine; Salvi, Daniel; Seigneurin-Berny, Daphné; Court, Magali; Moyet, Lucas; Ramus, Claire; Miras, Stéphane; Mellal, Mourad; Le Gall, Sophie; Kieffer-Jaquinod, Sylvie; Bruley, Christophe; Garin, Jérôme; Joyard, Jacques; Masselon, Christophe; Rolland, Norbert
2010-06-01
Recent advances in the proteomics field have allowed a series of high throughput experiments to be conducted on chloroplast samples, and the data are available in several public databases. However, the accurate localization of many chloroplast proteins often remains hypothetical. This is especially true for envelope proteins. We went a step further into the knowledge of the chloroplast proteome by focusing, in the same set of experiments, on the localization of proteins in the stroma, the thylakoids, and envelope membranes. LC-MS/MS-based analyses first allowed us to build the AT_CHLORO database (http://www.grenoble.prabi.fr/protehome/grenoble-plant-proteomics/), a comprehensive repertoire of the 1323 proteins, identified by 10,654 unique peptide sequences, present in highly purified chloroplasts and their subfractions prepared from Arabidopsis thaliana leaves. This database also provides extensive proteomics information (peptide sequences and molecular weight, chromatographic retention times, MS/MS spectra, and spectral count) for a unique chloroplast protein accurate mass and time tag database gathering identified peptides with their respective and precise analytical coordinates, molecular weight, and retention time. We assessed the partitioning of each protein in the three chloroplast compartments by using a semiquantitative proteomics approach (spectral count). These data together with an in-depth investigation of the literature were compiled to provide accurate subplastidial localization of previously known and newly identified proteins. A unique knowledge base containing extensive information on the proteins identified in envelope fractions was thus obtained, allowing new insights into this membrane system to be revealed. Altogether, the data we obtained provide unexpected information about plastidial or subplastidial localization of some proteins that were not suspected to be associated with this membrane system.
The spectral counting-based strategy was further validated as the compartmentation of well known pathways (for instance, photosynthesis and amino acid, fatty acid, or glycerolipid biosynthesis) within chloroplasts could be dissected. It also allowed revisiting the compartmentation of the chloroplast metabolism and functions.
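The idea of using spectral counts to assess how a protein partitions between compartments can be illustrated with a small calculation. The sketch below simply expresses each compartment's count as a fraction of the protein's total; the abstract does not give the authors' actual normalization or decision rules, so the function names and the example counts are hypothetical.

```python
def compartment_fractions(spectral_counts):
    """Convert per-compartment spectral counts into fractions of the total."""
    total = sum(spectral_counts.values())
    if total == 0:
        # No spectra observed for this protein in any compartment.
        return {compartment: 0.0 for compartment in spectral_counts}
    return {
        compartment: count / total
        for compartment, count in spectral_counts.items()
    }


def main_compartment(spectral_counts):
    """Return the compartment holding the largest share of spectra."""
    fractions = compartment_fractions(spectral_counts)
    return max(fractions, key=fractions.get)


if __name__ == "__main__":
    # Made-up counts for one protein across the three subfractions.
    counts = {"envelope": 30, "stroma": 5, "thylakoid": 15}
    print(compartment_fractions(counts))
    print(main_compartment(counts))
```

In practice raw counts would usually be corrected for differences between fractions (for example total spectra per sample) before comparison; that step is omitted here for brevity.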
Wegrzynowicz, Michal; Holt, Hunter K; Friedman, David B; Bowman, Aaron B
2012-02-03
Huntington's disease (HD) is a neurodegenerative disorder caused by expansion of a CAG repeat within the Huntingtin (HTT) gene, though the clinical presentation of disease and age-of-onset are strongly influenced by ill-defined environmental factors. We recently reported a gene-environment interaction wherein expression of mutant HTT is associated with neuroprotection against manganese (Mn) toxicity. Here, we are testing the hypothesis that this interaction may be manifested by altered protein expression patterns in striatum, a primary target of both neurodegeneration in HD and neurotoxicity of Mn. To this end, we compared striatal proteomes of wild-type and HD (YAC128Q) mice exposed to vehicle or Mn. Principal component analysis of proteomic data revealed that Mn exposure disrupted a segregation of WT versus mutant proteomes by the major principal component observed in vehicle-exposed mice. Identification of altered proteins revealed novel markers of Mn toxicity, particularly proteins involved in glycolysis, excitotoxicity, and cytoskeletal dynamics. In addition, YAC128Q-dependent changes suggest that axonal pathology may be an early feature in HD pathogenesis. Finally, for several proteins, genotype-specific responses to Mn were observed. These differences include increased sensitivity to exposure in YAC128Q mice (UBQLN1) and amelioration of some mutant HTT-induced alterations (SAE1, ENO1). We conclude that the interaction of Mn and mutant HTT may suppress proteomic phenotypes of YAC128Q mice, which could reveal potential targets in novel treatment strategies for HD.
Guidelines for reporting quantitative mass spectrometry based experiments in proteomics.
Martínez-Bartolomé, Salvador; Deutsch, Eric W; Binz, Pierre-Alain; Jones, Andrew R; Eisenacher, Martin; Mayer, Gerhard; Campos, Alex; Canals, Francesc; Bech-Serra, Joan-Josep; Carrascal, Montserrat; Gay, Marina; Paradela, Alberto; Navajas, Rosana; Marcilla, Miguel; Hernáez, María Luisa; Gutiérrez-Blázquez, María Dolores; Velarde, Luis Felipe Clemente; Aloria, Kerman; Beaskoetxea, Jabier; Medina-Aunon, J Alberto; Albar, Juan P
2013-12-16
Mass spectrometry is already a well-established protein identification tool and recent methodological and technological developments have also made possible the extraction of quantitative data of protein abundance in large-scale studies. Several strategies for absolute and relative quantitative proteomics and the statistical assessment of quantifications are possible, each having specific measurements and therefore, different data analysis workflows. The guidelines for Mass Spectrometry Quantification allow the description of a wide range of quantitative approaches, including labeled and label-free techniques and also targeted approaches such as Selected Reaction Monitoring (SRM). The HUPO Proteomics Standards Initiative (HUPO-PSI) has invested considerable efforts to improve the standardization of proteomics data handling, representation and sharing through the development of data standards, reporting guidelines, controlled vocabularies and tooling. In this manuscript, we describe a key output from the HUPO-PSI, namely the MIAPE Quant guidelines, which have been developed in parallel with the corresponding data exchange format mzQuantML [1]. The MIAPE Quant guidelines describe the HUPO-PSI proposal concerning the minimum information to be reported when a quantitative data set, derived from mass spectrometry (MS), is submitted to a database or as supplementary information to a journal. The guidelines have been developed with input from a broad spectrum of stakeholders in the proteomics field to represent a true consensus view of the most important data types and metadata, required for a quantitative experiment to be analyzed critically or a data analysis pipeline to be reproduced. It is anticipated that they will influence or be directly adopted as part of journal guidelines for publication and by public proteomics databases and thus may have an impact on proteomics laboratories across the world.
This article is part of a Special Issue entitled: Standardization and Quality Control. Copyright © 2013 Elsevier B.V. All rights reserved.
Jeong, Seul-Ki; Hancock, William S; Paik, Young-Ki
2015-09-04
Since the launch of the Chromosome-centric Human Proteome Project (C-HPP) in 2012, the number of "missing" proteins has fallen to 2932, down from ∼5932 when the number was first counted in 2011. We compared the characteristics of missing proteins with those of already annotated proteins with respect to transcriptional expression pattern and the time periods in which newly identified proteins were annotated. We learned that missing proteins commonly exhibit lower levels of transcriptional expression and less tissue-specific expression compared with already annotated proteins. This makes it more difficult to identify missing proteins as time goes on. One of the C-HPP goals is to identify alternative spliced products of proteins (ASPs), which are usually difficult to find by shot-gun proteomic methods due to their sequence similarities with the representative proteins. To resolve this problem, it may be necessary to use a targeted proteomics approach (e.g., selected and multiple reaction monitoring [S/MRM] assays) and an innovative bioinformatics platform that enables the selection of target peptides for rarely expressed missing proteins or ASPs. Given that the success of efforts to identify missing proteins may rely on more informative public databases, it was necessary to upgrade the available integrative databases. To this end, we attempted to improve the features and utility of GenomewidePDB by integrating transcriptomic information (e.g., alternatively spliced transcripts), annotated peptide information, and an advanced search interface that can find proteins of interest when applying a targeted proteomics strategy. This upgraded version of the database, GenomewidePDB 2.0, may not only expedite identification of the remaining missing proteins but also enhance the exchange of information among the proteome community. GenomewidePDB 2.0 is available publicly at http://genomewidepdb.proteomix.org/.
Piro, Amalia; Serra, Ilia Anna; Spadafora, Antonia; Cardilio, Monica; Bianco, Linda; Perrotta, Gaetano; Santos, Rui; Mazzuca, Silvia
2015-12-01
Posidonia oceanica is a marine angiosperm, or seagrass, adapted to underwater life from shallow waters down to 50 m depth. This raises questions of how its photosynthesis adapted to the attenuation of light through the water column and leads to the assumption that biochemistry and metabolism of the chloroplast are the basis of adaptive capacity. In the present study, we described a protocol that was adapted from those optimized for terrestrial plants, to extract chloroplasts from as little tissue as possible. We obtained the best balance between tissue amount/intact chloroplasts yield using one leaf from one plant. After isopycnic separations, chloroplast purity and integrity were evaluated by biochemical assay and using a proteomic approach. Chloroplast proteins were extracted from highly purified organelles and resolved by 1DE SDS-PAGE. Proteins were sequenced by nLC-ESI-IT-MS/MS of 1DE gel bands and identified against NCBInr green plant databases and the Dr. Zompo database for seagrasses in a local customized dataset. The curated localization of proteins in sub-plastidial compartments (i.e. envelope, stroma and thylakoids) was retrieved in the AT_CHLORO database. This purification protocol and the validation of compartment markers may serve as basis for sub-cellular proteomics in P. oceanica and other seagrasses. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
High throughput profile-profile based fold recognition for the entire human proteome.
McGuffin, Liam J; Smith, Richard T; Bryson, Kevin; Sørensen, Søren-Aksel; Jones, David T
2006-06-07
In order to maintain the most comprehensive structural annotation databases we must carry out regular updates for each proteome using the latest profile-profile fold recognition methods. The ability to carry out these updates on demand is necessary to keep pace with the regular updates of sequence and structure databases. Providing the highest quality structural models requires the most intensive profile-profile fold recognition methods running with the very latest available sequence databases and fold libraries. However, running these methods on such a regular basis for every sequenced proteome requires large amounts of processing power. In this paper we describe and benchmark the JYDE (Job Yield Distribution Environment) system, which is a meta-scheduler designed to work above cluster schedulers, such as Sun Grid Engine (SGE) or Condor. We demonstrate the ability of JYDE to distribute the load of genomic-scale fold recognition across multiple independent Grid domains. We use the most recent profile-profile version of our mGenTHREADER software in order to annotate the latest version of the Human proteome against the latest sequence and structure databases in as short a time as possible. We show that our JYDE system is able to scale to large numbers of intensive fold recognition jobs running across several independent computer clusters. Using our JYDE system we have been able to annotate 99.9% of the protein sequences within the Human proteome in less than 24 hours, by harnessing over 500 CPUs from 3 independent Grid domains. This study clearly demonstrates the feasibility of carrying out on demand high quality structural annotations for the proteomes of major eukaryotic organisms. Specifically, we have shown that it is now possible to provide complete regular updates of profile-profile based fold recognition models for entire eukaryotic proteomes, through the use of Grid middleware such as JYDE.
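The role of a meta-scheduler that sits above several cluster schedulers can be illustrated with a toy load-balancing routine. This is not the JYDE algorithm, which the abstract does not describe; it is a generic greedy sketch that hands each job to the cluster with the lowest estimated load per CPU. The cluster names, CPU counts and job costs are hypothetical.

```python
import heapq


def distribute_jobs(job_costs, cluster_cpus):
    """Greedily assign jobs to clusters, balancing estimated load per CPU.

    job_costs:    mapping of job name -> estimated cost (arbitrary units)
    cluster_cpus: mapping of cluster name -> number of available CPUs
    """
    # Heap entries are (estimated load per CPU, cluster name).
    heap = [(0.0, name) for name in sorted(cluster_cpus)]
    heapq.heapify(heap)
    assignment = {name: [] for name in cluster_cpus}
    # Place the most expensive jobs first; ties are broken by job name.
    ordered = sorted(job_costs.items(), key=lambda item: (-item[1], item[0]))
    for job, cost in ordered:
        load, name = heapq.heappop(heap)
        assignment[name].append(job)
        heapq.heappush(heap, (load + cost / cluster_cpus[name], name))
    return assignment


if __name__ == "__main__":
    clusters = {"cluster_a": 2, "cluster_b": 1}
    jobs = {"seq1": 4, "seq2": 2, "seq3": 2}
    print(distribute_jobs(jobs, clusters))
```

A real meta-scheduler would also have to submit the jobs through each domain's own scheduler (for example SGE or Condor), monitor them and resubmit failures; only the load-distribution decision is sketched here.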
Seeds in Chernobyl: the database on proteome response on radioactive environment
Klubicová, Katarína; Vesel, Martin; Rashydov, Namik M.; Hajduch, Martin
2012-01-01
Two serious nuclear accidents during the last quarter century (Chernobyl, 1986 and Fukushima, 2011) contaminated large agricultural areas with radioactivity. The database “Seeds in Chernobyl” (http://www.chernobylproteomics.sav.sk) contains information about the abundances of hundreds of proteins from an ongoing investigation of mature and developing seeds harvested from plants grown in the radioactive Chernobyl area. This database provides a useful source of information concerning the response of the seed proteome to a permanently increased level of ionizing radiation in a user-friendly format. PMID:23087698
Comparative Proteomic Insights into the Lactate Responses of Halophilic Salinicoccus roseus W12
Wang, Hongyan; Wang, Limin; Yang, Han; Cai, Yumeng; Sun, Lifan; Xue, Yanfen; Yu, Bo; Ma, Yanhe
2015-01-01
Extremophiles use adaptive mechanisms to survive in extreme environments, which is of great importance for several biotechnological applications. A halophilic strain, Salinicoccus roseus W12, was isolated from a salt lake in Inner Mongolia, China in this study. The ability of the strain to survive under high sodium conditions (including 20% sodium lactate or 25% sodium chloride, [w/v]) made it an ideal host to screen for key factors related to sodium lactate resistance. The proteomic responses to lactate were studied using W12 cells cultivated with or without lactate stress. A total of 1,656 protein spots in sodium lactate-treated culture and 1,843 spots in NaCl-treated culture were detected by 2-dimensional gel electrophoresis, and 32 of 120 significantly altered protein spots (fold change > 2, p < 0.05) were identified by matrix-assisted laser-desorption ionization time-of-flight mass spectrometry. Among 21 successfully identified spots, 19 proteins were upregulated and 2 were downregulated. The identified proteins are mainly involved in metabolism, cellular processes and signaling, and information storage and processing. Transcription studies confirmed that most of the encoding genes were upregulated after the cells were exposed to lactate for 10 min. Cross-protecting and energy metabolism-related proteins played an important role in lactate tolerance for S. roseus W12. PMID:26358621
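The selection criterion quoted above (fold change > 2, p < 0.05) can be expressed as a simple filter. The sketch below assumes a symmetric definition of fold change, so that a two-fold decrease counts the same as a two-fold increase, and strictly positive spot intensities; the abstract does not specify either point, and the spot records are made up.

```python
def significant_spots(spots, min_fold=2.0, max_p=0.05):
    """Select spots passing fold-change and p-value thresholds.

    Each spot is a dict with keys "id", "treated", "control" and "p".
    Returns (spot id, "up" or "down") for every spot that passes.
    """
    selected = []
    for spot in spots:
        ratio = spot["treated"] / spot["control"]
        # Symmetric fold change: 0.25 is treated as a 4-fold decrease.
        fold = ratio if ratio >= 1 else 1 / ratio
        if fold > min_fold and spot["p"] < max_p:
            selected.append((spot["id"], "up" if ratio > 1 else "down"))
    return selected


if __name__ == "__main__":
    demo = [
        {"id": "s1", "treated": 30.0, "control": 10.0, "p": 0.01},
        {"id": "s2", "treated": 5.0, "control": 20.0, "p": 0.02},
        {"id": "s3", "treated": 15.0, "control": 10.0, "p": 0.001},
        {"id": "s4", "treated": 50.0, "control": 10.0, "p": 0.2},
    ]
    print(significant_spots(demo))
```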
Proteomic analysis of blue light-induced twining response in Cuscuta australis.
Li, Dongxiao; Wang, Liangjiang; Yang, Xiaopo; Zhang, Guoguang; Chen, Liang
2010-01-01
The parasitic plant Cuscuta australis (dodder) invades a variety of species by entwining the stem and leaves of a host and developing haustoria. The twining response prior to haustoria formation is regarded as the first sign for dodders to parasitize host plants, and thus has been the focus of studies on the host-parasite interaction. However, the molecular mechanism is still poorly understood. In the present work, we have investigated the different effects of blue and white light on the twining response, and identified a set of proteins that were differentially expressed in dodder seedlings using a proteomic approach. Approximately 1,800 protein spots were detected on each 2-D gel, and 47 spots with increased or decreased protein levels were selected and analyzed with MALDI-TOF-MS. Peptide mass fingerprints (PMFs) obtained for these spots were used for protein identification through cross-species database searches. The results suggest that the blue light-induced twining response in dodder seedlings may be mediated by proteins involved in light signal transduction, cell wall degradation, cell structure, and metabolism.
Demircan, Turan; Keskin, Ilknur; Dumlu, Seda Nilgün; Aytürk, Nilüfer; Avşaroğlu, Mahmut Erhan; Akgün, Emel; Öztürk, Gürkan; Baykal, Ahmet Tarık
2017-01-01
The axolotl salamander has been emerging as an important model for stem cell research due to its powerful regenerative capacity. Several advantages, such as the high capability of advanced tissue, organ, and appendage regeneration, promote axolotl as an ideal model system to extend our current understanding of the mechanisms of regeneration. Acknowledging the common molecular pathways between amphibians and mammals, there is a great potential to translate the findings from axolotl research to mammalian studies. However, the utilization of axolotl is hindered due to the lack of reference databases of genomic, transcriptomic, and proteomic data. Here, we introduce the proteome analysis of the axolotl tail section searched against an mRNA-seq database. We translated axolotl mRNA sequences to protein sequences and annotated these to process the LC-MS/MS data and identified 1001 nonredundant proteins. Functional classification of identified proteins was performed by gene ontology searches. The presence of some of the identified proteins was validated by in situ antibody labeling. Furthermore, we have analyzed the proteome expressional changes postamputation at three time points to evaluate the underlying mechanisms of the regeneration process. Taken together, this work expands the proteomics data of axolotl to contribute to its establishment as a fully utilized model. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
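Translating mRNA sequences into protein sequences, the step used above to build a searchable protein database, can be sketched with the standard genetic code. The abstract does not say which tool or reading-frame strategy the authors used, so the single-frame function below, which stops at the first stop codon, is an illustrative assumption.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna, frame=0):
    """Translate one reading frame, stopping at the first stop codon.

    Accepts RNA (U) or cDNA (T) letters; unknown codons become "X".
    """
    sequence = mrna.upper().replace("U", "T")
    protein = []
    for i in range(frame, len(sequence) - 2, 3):
        amino_acid = CODON_TABLE.get(sequence[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("AUGGCCAAGUAA"))
```

For transcriptome-derived databases it is common to translate several reading frames and keep the longest open reading frames; that selection step is left out of this sketch.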
Pan, Lang; Zhang, Jian; Wang, Junzhi; Yu, Qin; Bai, Lianyang; Dong, Liyao
2017-05-08
American sloughgrass (Beckmannia syzigachne Steud.) is a weed widely distributed in wheat fields of China. In recent years, the evolution of herbicide (fenoxaprop-P-ethyl)-resistant populations has decreased the susceptibility of B. syzigachne. This study compared 4 B. syzigachne populations (3 resistant and 1 susceptible) using iTRAQ to characterize fenoxaprop-P-ethyl resistance in B. syzigachne at the proteomic level. Through searching the UniProt database, 3104 protein species were identified from 13,335 unique peptides. Approximately 2834 protein species were assigned to 23 functional classifications provided by the COG database. Among these, 2299 protein species were assigned to 125 predicted pathways. The resistant biotype contained 8 protein species that changed in abundance relative to the susceptible biotype; they were involved in photosynthesis, oxidative phosphorylation, and fatty acid biosynthesis pathways. In contrast to previous studies comparing only 1 resistant and 1 susceptible population, our use of 3 fenoxaprop-resistant B. syzigachne populations with different genetic backgrounds minimized irrelevant differential expression and eliminated false positives. Therefore, we could more confidently link the differentially expressed proteins to herbicide resistance. Proteomic analysis demonstrated that fenoxaprop-P-ethyl resistance is associated with photosynthetic capacity, a connection that might be related to the target-site mutations in resistant B. syzigachne. This is the first large-scale proteomics study examining herbicide stress responses in different B. syzigachne biotypes. This study has biological relevance because it is the first to employ proteomic analysis for understanding the mechanisms underlying Beckmannia syzigachne herbicide resistance. The plant is a major weed in China and negatively affects crop yield, but has developed considerable resistance to the most common herbicide, fenoxaprop-P-ethyl. 
Through comparisons of resistant and sensitive biotypes, our study identified multiple proteins (involved in photosynthesis, oxidative phosphorylation, and fatty acid biosynthesis) that are putatively linked to B. syzigachne herbicide response. This large-scale proteomics study, sorely lacking in weed science, contributes valuable data that can be applied to more fine-tuned analyses on the functions of specific proteins in herbicide resistance. Copyright © 2017 Elsevier B.V. All rights reserved.
Hernandez-Valladares, Maria; Vaudel, Marc; Selheim, Frode; Berven, Frode; Bruserud, Øystein
2017-08-01
Mass spectrometry (MS)-based proteomics has become an indispensable tool for the characterization of the proteome and its post-translational modifications (PTM). In addition to standard protein sequence databases, proteogenomics strategies search the spectral data against the theoretical spectra obtained from customized protein sequence databases. To date, there are no published proteogenomics studies on acute myeloid leukemia (AML) samples. Areas covered: Proteogenomics involves the understanding of genomic and proteomic data. The intersection of both datatypes requires advanced bioinformatics skills. A standard proteogenomics workflow that could be used for the study of AML samples is described. The generation of customized protein sequence databases as well as bioinformatics tools and pipelines commonly used in proteogenomics are discussed in detail. Expert commentary: Drawing on evidence from recent cancer proteogenomics studies and taking into account the public availability of AML genomic data, the interpretation of present and future MS-based AML proteomic data using AML-specific protein sequence databases could discover new biological mechanisms and targets in AML. However, proteogenomics workflows including bioinformatics guidelines can be challenging for the wide AML research community. It is expected that further automation and simplification of the bioinformatics procedures might attract AML investigators to adopt the proteogenomics strategy.
Morphinome Database - The database of proteins altered by morphine administration - An update.
Bodzon-Kulakowska, Anna; Padrtova, Tereza; Drabik, Anna; Ner-Kluza, Joanna; Antolak, Anna; Kulakowski, Konrad; Suder, Piotr
2018-04-13
Morphine is considered a gold standard in pain treatment. Nevertheless, its use can be associated with severe side effects, including drug addiction. Thus, it is very important to understand the molecular mechanism of morphine action in order to develop new methods of pain therapy, or at least to attenuate the side effects of opioid use. Proteomics can indicate proteins involved in certain biological processes, but the number of items identified in a single study is usually overwhelming. Thus, researchers face the difficult problem of choosing the proteins that are really important for the investigated processes and worth further study. Therefore, based on 29 published articles, we created a database of proteins regulated by morphine administration - The Morphinome Database (addiction-proteomics.org). This web tool indicates proteins that were identified in different proteomics studies. Moreover, the collection and organization of such a vast amount of data allows us to find the same proteins identified in various studies and to rank them by the frequency of their identification. The STRING and KEGG databases indicated the metabolic pathways in which those molecules are involved. This means that those molecular pathways seem to be strongly affected by morphine administration and could be important targets for further investigations. The data about proteins identified by different proteomics studies of molecular changes caused by morphine administration (29 published articles) were gathered in the Morphinome Database. Unification of those data allowed for the identification of proteins that were indicated several times by distinct proteomics studies, which means that they seem to be very well verified and important for the entire process. Those proteins might now be considered promising targets for more detailed studies of their role in the molecular mechanism of morphine action. Copyright © 2018. Published by Elsevier B.V.
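The frequency-based ranking described in this abstract can be illustrated with a short sketch. It is not the Morphinome Database implementation: the function name, the study labels, and the protein identifiers below are hypothetical placeholders, and the only assumption is that each study contributes a list of identified proteins and that a protein is counted once per study.

```python
from collections import Counter


def rank_proteins_by_frequency(studies):
    """Rank proteins by the number of distinct studies that report them.

    `studies` maps a study label to the list of proteins it identified.
    """
    counts = Counter()
    for proteins in studies.values():
        # Use a set so a protein listed twice in one study counts once.
        counts.update(set(proteins))
    # Sort by descending study count, then by name so ties are stable.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy input: labels and identifiers are invented placeholders.
toy_studies = {
    "study_A": ["P1", "P2", "P3"],
    "study_B": ["P2", "P3", "P3"],
    "study_C": ["P3", "P4"],
}
ranking = rank_proteins_by_frequency(toy_studies)
for protein, n_studies in ranking:
    print(protein, n_studies)
```

Counting per study rather than per occurrence keeps a single study with repeated entries from inflating a protein's rank.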
Choosing an Optimal Database for Protein Identification from Tandem Mass Spectrometry Data.
Kumar, Dhirendra; Yadav, Amit Kumar; Dash, Debasis
2017-01-01
Database searching is the preferred method for protein identification from digital spectra of mass to charge ratios (m/z) detected for protein samples through mass spectrometers. The search database is one of the major influencing factors in discovering proteins present in the sample and thus in deriving biological conclusions. In most cases the choice of search database is arbitrary. Here we describe common search databases used in proteomic studies and their impact on the final list of identified proteins. We also elaborate upon factors like composition and size of the search database that can influence the protein identification process. In conclusion, we suggest that the choice of database depends on the type of inferences to be derived from proteomics data. However, making additional efforts to build a compact and concise database for a targeted question should generally be rewarding in achieving confident protein identifications.
Proteomic profile of dormant Trichophyton Rubrum conidia
Leng, Wenchuan; Liu, Tao; Li, Rui; Yang, Jian; Wei, Candong; Zhang, Wenliang; Jin, Qi
2008-01-01
Background Trichophyton rubrum is the most common dermatophyte causing fungal skin infections in humans. Asexual sporulation is an important means of propagation for T. rubrum, and conidia produced in this way are thought to be the primary cause of human infections. Despite their importance in pathogenesis, the conidia of T. rubrum remain understudied. We intend to intensively investigate the proteome of dormant T. rubrum conidia to characterize its molecular and cellular features and to enhance the development of novel therapeutic strategies. Results The proteome of T. rubrum conidia was analyzed by combining shotgun proteomics with sample prefractionation and multiple enzyme digestion. In total, 1026 proteins were identified. All identified proteins were compared to those in the NCBI non-redundant protein database, the eukaryotic orthologous groups database, and the gene ontology database to obtain functional annotation information. Functional classification revealed that the identified proteins covered nearly all major biological processes. Some proteins were spore specific and related to the survival and dispersal of T. rubrum conidia, and many proteins were important to conidial germination and response to environmental conditions. Conclusion Our results suggest that the proteome of T. rubrum conidia is considerably complex, and that the maintenance of conidial dormancy is an intricate and elaborate process. This data set provides the first global framework for the dormant T. rubrum conidia proteome and is a stepping stone on the way to further study of the molecular mechanisms of T. rubrum conidial germination and the maintenance of conidial dormancy. PMID:18578874
Castellanos-Martínez, Sheila; Diz, Angel P; Álvarez-Chaver, Paula; Gestal, Camino
2014-06-13
The immune system of cephalopods is poorly known to date. The lack of genomic information makes it difficult to understand vital processes like immune defense mechanisms and their interaction with pathogens at the molecular level. The common octopus Octopus vulgaris has a high economic relevance and potential for aquaculture. However, disease outbreaks provoke serious reductions in production with potentially severe economic losses. In this study, a proteomic approach is used to analyze the immune response of O. vulgaris against the coccidian Aggregata octopiana, a gastrointestinal parasite which impairs the cephalopod nutritional status. The hemocyte and plasma proteomes were compared by 2-DE between sick and healthy octopus. The identities of 12 differentially expressed spots and 27 other spots without significant alteration from hemocytes, and 5 spots from plasma, were determined by mass spectrometry analysis aided by a six reading-frame translation of an octopus hemocyte RNA-seq database and also public databases. Principal component analysis pointed to 7 proteins from hemocytes as the major contributors to the overall difference between levels of infection, which could therefore be considered as potential biomarkers. Particularly, filamin, fascin and peroxiredoxin are highlighted because of their implication in octopus immune defense activity. From the octopus plasma, hemocyanin was identified. This work represents a first step toward characterizing the protein profile of O. vulgaris hemolymph, providing important information for subsequent studies of the octopus immune system at the molecular level and for understanding the basis of octopus tolerance-resistance to A. octopiana.
The study presented here focuses on understanding the octopus immune defense against a parasite infection. In particular, it centers on the host-parasite relationship between the octopus and the protozoan A. octopiana, which induces severe gastrointestinal injuries in octopus that produce a malabsorption syndrome. The common octopus is a commercially important species with a high potential for aquaculture in semi-open systems, and this pathology reduces the condition of octopus populations on-growing in open-water systems, resulting in significant economic losses. This is the first proteomic approach developed on this host-parasite relationship, and therefore the contribution of this work ranges from i) ecological, since this particular relationship is tending to be established as a model of host-parasite interaction in natural populations; to ii) evolutionary, due to the characterization of immune molecules that could contribute to understanding the functioning of the immune defense in these highly evolved mollusks; and iii) economic. The results of this study provide an overview of the octopus hemolymph proteome. Furthermore, proteins influenced by the level of infection and implicated in the octopus cellular response are also shown. Consequently, a set of biomarkers for disease resistance is suggested for further research that could be valuable for the improvement of octopus culture, taking into account their high economic value, the decline in landings and the need for the diversification of reared species in order to ensure the growth of the aquaculture activity. Although cephalopods are model species for biomedical studies and possess potential in aquaculture, their genomes have not been sequenced yet, which limits the application of genomic data to research on important biological processes. Similarly, the octopus proteome, like those of other non-model organisms, is poorly represented in public databases.
Most of the proteins were identified from an octopus hemocyte RNA-seq database that we generated, which will be the subject of another manuscript in preparation. Therefore, the need to increase molecular data from non-model organisms is highlighted here. In particular, we encourage expanding genomic knowledge of cephalopods in order to increase successful protein identifications. This article is part of a Special Issue entitled: Proteomics of non-model organisms. Copyright © 2013 Elsevier B.V. All rights reserved.
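The six reading-frame translation mentioned in this abstract can be sketched as follows. This is a generic illustration, not the authors' pipeline: it assumes the standard genetic code, writes unknown codons (for example those containing N) as X, and the function names and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Translate a nucleotide sequence in three forward and three reverse frames."""
    seq = seq.upper().replace("U", "T")
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


# Hypothetical nine-nucleotide example sequence.
frames = six_frame_translation("ATGGCCTAA")
for name, protein in frames.items():
    print(name, protein)
```

In a search workflow, the translated frames would typically be split at stop codons (`*`) and short fragments discarded before being written to a sequence database; that step is left out here to keep the sketch small.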
The scientific exploration of saliva in the post-proteomic era: from database back to basic function
Ruhl, Stefan
2012-01-01
The proteome of human saliva can be considered as being essentially completed. Diagnostic markers for a number of diseases have been identified among salivary proteins and peptides, taking advantage of saliva as an easy-to-obtain biological fluid. Yet, the majority of disease markers identified so far are serum components and not intrinsic proteins produced by the salivary glands. Furthermore, despite the fact that saliva is essential for protecting the oral integuments and dentition, little progress has been made in finding risk predictors in the salivary proteome for dental caries or periodontal disease. Since salivary proteins, and in particular the attached glycans, play an important role in interactions with the microbial world, the salivary glycoproteome and other post-translational modifications of salivary proteins need to be studied. Risk markers for microbial diseases, including dental caries, are likely to be discovered among the highly glycosylated major protein species in saliva. This review will attempt to raise new ideas and also point to under-researched areas that may hold promise for future applicability in oral diagnostics and prediction of oral disease. PMID:22292826
Khan, Gulafshana Hafeez; Galazis, Nicolas; Docheva, Nikolina; Layfield, Robert; Atiomo, William
2015-01-01
STUDY QUESTION Do any proteomic biomarkers previously identified for pre-eclampsia (PE) overlap with those identified in women with polycystic ovary syndrome (PCOS)? SUMMARY ANSWER Five previously identified proteomic biomarkers were found to be common in women with PE and PCOS when compared with controls. WHAT IS KNOWN ALREADY Various studies have indicated an association between PCOS and PE; however, the pathophysiological mechanisms supporting this association are not known. STUDY DESIGN, SIZE, DURATION A systematic review and update of our PCOS proteomic biomarker database was performed, along with a parallel review of PE biomarkers. The study included papers from 1980 to December 2013. PARTICIPANTS/MATERIALS, SETTING, METHODS In all the studies analysed, there were a total of 1423 patients and controls. The number of proteomic biomarkers that were catalogued for PE was 192. MAIN RESULTS AND THE ROLE OF CHANCE Five proteomic biomarkers were shown to be differentially expressed in women with PE and PCOS when compared with controls: transferrin, fibrinogen α, β and γ chain variants, kininogen-1, annexin 2 and peroxiredoxin 2. In PE, the biomarkers were identified in serum, plasma and placenta and in PCOS, the biomarkers were identified in serum, follicular fluid, and ovarian and omental biopsies. LIMITATIONS, REASONS FOR CAUTION The proteomic techniques employed have limited ability to identify proteins that are of low abundance, some of which may have diagnostic potential. The sample sizes and number of biomarkers identified from these studies do not exclude the risk of false positives, a limitation of all biomarker studies. The biomarkers common to PE and PCOS were identified from proteomic analyses of different tissues.
WIDER IMPLICATIONS OF THE FINDINGS This data amalgamation of the proteomic studies in PE and in PCOS, for the first time, discovered a panel of five biomarkers for PE which are common to women with PCOS, including transferrin, fibrinogen α, β and γ chain variants, kininogen-1, annexin 2 and peroxiredoxin 2. If validated, these biomarkers could provide a useful framework for the knowledge infrastructure in this area. To accomplish this goal, a well co-ordinated multidisciplinary collaboration of clinicians, basic scientists and mathematicians is vital. STUDY FUNDING/COMPETING INTEREST(S) No financial support was obtained for this project. There are no conflicts of interest. PMID:25351721
Ma, Lu; Hatlen, Andrea; Kelly, Laura J.; Becher, Hannes; Wang, Wencai; Kovarik, Ales; Leitch, Ilia J.; Leitch, Andrew R.
2015-01-01
The RNA-directed DNA methylation (RdDM) pathway can be divided into three phases: 1) small interfering RNA biogenesis, 2) de novo methylation, and 3) chromatin modification. To determine the degree of conservation of this pathway we searched for key genes among land plants. We used OrthoMCL and the OrthoMCL Viridiplantae database to analyze proteomes of species in bryophytes, lycophytes, monilophytes, gymnosperms, and angiosperms. We also analyzed small RNA size categories and, in two gymnosperms, cytosine methylation in ribosomal DNA. Six proteins were restricted to angiosperms, these being NRPD4/NRPE4, RDM1, DMS3 (defective in meristem silencing 3), SHH1 (SAWADEE homeodomain homolog 1), KTF1, and SUVR2, although we failed to find the latter three proteins in Fritillaria persica, a species with a giant genome. Small RNAs of 24 nt in length were abundant only in angiosperms. Phylogenetic analyses of Dicer-like (DCL) proteins showed that DCL2 was restricted to seed plants, although it was absent in Gnetum gnemon and Welwitschia mirabilis. The data suggest that phases (1) and (2) of the RdDM pathway, described for model angiosperms, evolved with angiosperms. The absence of some features of RdDM in F. persica may be associated with its large genome. Phase (3) is probably the most conserved part of the pathway across land plants. DCL2, involved in virus defense and interaction with the canonical RdDM pathway to facilitate methylation of CHH, is absent outside seed plants. Its absence in G. gnemon and W. mirabilis, coupled with distinctive patterns of CHH methylation, suggests a secondary loss of DCL2 following the divergence of Gnetales. PMID:26338185
Mapping the Small Molecule Interactome by Mass Spectrometry.
Flaxman, Hope A; Woo, Christina M
2018-01-16
Mapping small molecule interactions throughout the proteome provides the critical structural basis for functional analysis of their impact on biochemistry. However, translation of mass spectrometry-based proteomics methods to directly profile the interaction between a small molecule and the whole proteome is challenging because of the substoichiometric nature of many interactions, the diversity of covalent and noncovalent interactions involved, and the subsequent computational complexity associated with their spectral assignment. Recent advances in chemical proteomics have begun to fill this gap to provide a structural basis for the breadth of small molecule-protein interactions in the whole proteome. Innovations enabling direct characterization of the small molecule interactome include faster, more sensitive instrumentation coupled to chemical conjugation, enrichment, and labeling methods that facilitate detection and assignment. These methods have started to measure molecular interaction hotspots due to inherent differences in local amino acid reactivity and binding affinity throughout the proteome. Measurement of the small molecule interactome is producing structural insights and methods for probing and engineering protein biochemistry. Direct structural characterization of the small molecule interactome is a rapidly emerging area pushing new frontiers in biochemistry at the interface of small molecules and the proteome.
Changes in the Proteome of Xylem Sap in Brassica oleracea in Response to Fusarium oxysporum Stress
Pu, Zijing; Ino, Yoko; Kimura, Yayoi; Tago, Asumi; Shimizu, Motoki; Natsume, Satoshi; Sano, Yoshitaka; Fujimoto, Ryo; Kaneko, Kentaro; Shea, Daniel J.; Fukai, Eigo; Fuji, Shin-Ichi; Hirano, Hisashi; Okazaki, Keiichi
2016-01-01
Fusarium oxysporum f.sp. conlutinans (Foc) is a serious root-invading and xylem-colonizing fungus that causes yellowing in Brassica oleracea. To comprehensively understand the interaction between F. oxysporum and B. oleracea, the composition of the xylem sap proteome of non-infected and Foc-infected plants was investigated in both resistant and susceptible cultivars using liquid chromatography-tandem mass spectrometry (LC-MS/MS) after in-solution digestion of xylem sap proteins. Whole genome sequencing of Foc was carried out and generated a predicted Foc protein database. The predicted Foc protein database was then combined with the public B. oleracea and B. rapa protein databases downloaded from UniProt and used for protein identification. About 200 plant proteins were identified in the xylem sap of susceptible and resistant plants. Comparison between the non-infected and Foc-infected samples revealed that Foc infection causes changes to the protein composition of B. oleracea xylem sap, with repressed proteins accounting for a greater proportion than induced proteins in both the susceptible and resistant reactions. Analysis of the proteins with a concentration change of ≥2-fold indicated that a large portion of the up- and down-regulated proteins were those acting on carbohydrates. Proteins with leucine-rich repeats and legume lectin domains were mainly induced in both the resistant and susceptible systems, as were thaumatins. Twenty-five Foc proteins were identified in the infected xylem sap and 10 of them were cysteine-containing secreted small proteins that are good candidates for virulence and/or avirulence effectors. The differential response of protein contents in the xylem sap between the non-infected and Foc-infected samples, as well as the Foc candidate effectors secreted in xylem, provide valuable insights into B. oleracea-Foc interactions. PMID:26870056
Galazis, Nicolas; Olaleye, Olalekan; Haoula, Zeina; Layfield, Robert; Atiomo, William
2012-12-01
To review and identify possible biomarkers for ovarian cancer (OC) in women with polycystic ovary syndrome (PCOS). Systematic literature searches of MEDLINE, EMBASE, and Cochrane using the search terms "proteomics," "proteomic," and "ovarian cancer" or "ovarian carcinoma." Proteomic biomarkers for OC were then integrated with an updated previously published database of all proteomic biomarkers identified to date in patients with PCOS. Academic department of obstetrics and gynecology in the United Kingdom. A total of 180 women identified in the six studies. Tissue samples from women with OC vs. tissue samples from women without OC. Proteomic biomarkers, proteomic technique used, and methodologic quality score. A panel of six biomarkers was overexpressed both in women with OC and in women with PCOS. These biomarkers include calreticulin, fibrinogen-γ, superoxide dismutase, vimentin, malate dehydrogenase, and lamin B2. These biomarkers could help improve our understanding of the links between PCOS and OC and could potentially be used to identify subgroups of women with PCOS at increased risk of OC. More studies are required to further evaluate the role these biomarkers play in women with PCOS and OC. Copyright © 2012 American Society for Reproductive Medicine. Published by Elsevier Inc. All rights reserved.
A Global Map of Lipid-Binding Proteins and Their Ligandability in Cells.
Niphakis, Micah J; Lum, Kenneth M; Cognetta, Armand B; Correia, Bruno E; Ichu, Taka-Aki; Olucha, Jose; Brown, Steven J; Kundu, Soumajit; Piscitelli, Fabiana; Rosen, Hugh; Cravatt, Benjamin F
2015-06-18
Lipids play central roles in physiology and disease, where their structural, metabolic, and signaling functions often arise from interactions with proteins. Here, we describe a set of lipid-based chemical proteomic probes and their global interaction map in mammalian cells. These interactions involve hundreds of proteins from diverse functional classes and frequently occur at sites of drug action. We determine the target profiles for several drugs across the lipid-interaction proteome, revealing that its ligandable content extends far beyond traditionally defined categories of druggable proteins. In further support of this finding, we describe a selective ligand for the lipid-binding protein nucleobindin-1 (NUCB1) and show that this compound perturbs the hydrolytic and oxidative metabolism of endocannabinoids in cells. The described chemical proteomic platform thus provides an integrated path to both discover and pharmacologically characterize a wide range of proteins that participate in lipid pathways in cells. Copyright © 2015 Elsevier Inc. All rights reserved.
Piehowski, Paul D; Petyuk, Vladislav A; Sandoval, John D; Burnum, Kristin E; Kiebel, Gary R; Monroe, Matthew E; Anderson, Gordon A; Camp, David G; Smith, Richard D
2013-03-01
For bottom-up proteomics, there is a wide variety of database-searching algorithms in use for matching peptide sequences to tandem MS spectra. Likewise, there are numerous strategies being employed to produce a confident list of peptide identifications from the different search algorithm outputs. Here we introduce a grid-search approach for determining optimal database filtering criteria in shotgun proteomics data analyses that is easily adaptable to any search. Systematic Trial and Error Parameter Selection--referred to as STEPS--utilizes user-defined parameter ranges to test a wide array of parameter combinations to arrive at an optimal "parameter set" for data filtering, thus maximizing confident identifications. The benefits of this approach in terms of numbers of true-positive identifications are demonstrated using datasets derived from immunoaffinity-depleted blood serum and a bacterial cell lysate, two common proteomics sample types. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
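A grid search over filtering criteria, in the spirit of the approach this abstract describes, might look like the sketch below. It is not the STEPS implementation. The abstract does not say how confidence is estimated, so the sketch assumes a target-decoy search and estimates the false discovery rate as decoys divided by targets; the field names (`score`, `ppm`, `decoy`), the two filter parameters, and the toy data are all hypothetical.

```python
from itertools import product


def grid_search_filters(psms, score_grid, ppm_grid, max_fdr=0.01):
    """Try every (min_score, max_ppm) pair and keep the one that passes
    the most target matches while the estimated FDR stays within max_fdr.

    Assumption: FDR is estimated as decoy hits / target hits.
    """
    best = None
    for min_score, max_ppm in product(score_grid, ppm_grid):
        passing = [
            p for p in psms
            if p["score"] >= min_score and abs(p["ppm"]) <= max_ppm
        ]
        decoys = sum(1 for p in passing if p["decoy"])
        targets = len(passing) - decoys
        if targets == 0:
            continue  # nothing identified with this parameter set
        fdr = decoys / targets
        if fdr <= max_fdr and (best is None or targets > best["targets"]):
            best = {
                "min_score": min_score,
                "max_ppm": max_ppm,
                "targets": targets,
                "fdr": fdr,
            }
    return best


# Hypothetical peptide-spectrum matches: score, mass error (ppm), decoy flag.
toy_psms = [
    {"score": 50, "ppm": 1.0, "decoy": False},
    {"score": 40, "ppm": -2.0, "decoy": False},
    {"score": 30, "ppm": 4.0, "decoy": False},
    {"score": 25, "ppm": 8.0, "decoy": True},
    {"score": 20, "ppm": 3.0, "decoy": False},
    {"score": 15, "ppm": 9.0, "decoy": True},
]
best = grid_search_filters(toy_psms, score_grid=[10, 20, 30], ppm_grid=[5, 10], max_fdr=0.1)
print(best)
```

The exhaustive loop is deliberate: with user-defined ranges for only a few parameters the number of combinations stays small, and ties are resolved in favor of the first combination encountered.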
A glimpse into the proteome of phototrophic bacterium Rhodobacter capsulatus.
Onder, Ozlem; Aygun-Sunar, Semra; Selamoglu, Nur; Daldal, Fevzi
2010-01-01
A first glimpse into the proteome of Rhodobacter capsulatus revealed more than 450 proteins (over 210 known cytoplasmic, 185 known extracytoplasmic, and 55 unknown) that were identified with a high degree of confidence using nLC-MS/MS analyses. The accumulated data provide a solid platform for ongoing efforts to establish the proteome of this species and the cellular locations of its constituents. They also indicate that at least 40 of the identified proteins, which were annotated in genome databases as unknown hypothetical proteins, correspond to predicted translation products that are indeed present in cells under the growth conditions used in this work. In addition, matching the identification labels of the proteins reported between the two available R. capsulatus genome databases (ERGO-light with RRCxxxxx and NT05 with NT05RCxxxx numbers) indicated that 11 such proteins are listed only in the latter database.
MASPECTRAS: a platform for management and analysis of proteomics LC-MS/MS data
Hartler, Jürgen; Thallinger, Gerhard G; Stocker, Gernot; Sturn, Alexander; Burkard, Thomas R; Körner, Erik; Rader, Robert; Schmidt, Andreas; Mechtler, Karl; Trajanoski, Zlatko
2007-01-01
Background The advancements of proteomics technologies have led to a rapid increase in the number, size and rate at which datasets are generated. Managing and extracting valuable information from such datasets requires the use of data management platforms and computational approaches. Results We have developed the MAss SPECTRometry Analysis System (MASPECTRAS), a platform for management and analysis of proteomics LC-MS/MS data. MASPECTRAS is based on the Proteome Experimental Data Repository (PEDRo) relational database schema and follows the guidelines of the Proteomics Standards Initiative (PSI). Analysis modules include: 1) import and parsing of the results from the search engines SEQUEST, Mascot, Spectrum Mill, X! Tandem, and OMSSA; 2) peptide validation; 3) clustering of proteins based on Markov Clustering and multiple alignments; and 4) quantification using the Automated Statistical Analysis of Protein Abundance Ratios algorithm (ASAPRatio). The system provides customizable data retrieval and visualization tools, as well as export to the PRoteomics IDEntifications public repository (PRIDE). MASPECTRAS is freely available at Conclusion Given the unique features and the flexibility due to the use of standard software technology, our platform represents a significant advance and could be of great interest to the proteomics community. PMID:17567892
Nylund, Reetta; Kuster, Niels; Leszczynski, Dariusz
2010-10-18
Use of mobile phones has increased widely over the past decade. However, in spite of extensive research, the question of potential health effects of mobile phone radiation remains unanswered. We have earlier proposed, and applied, proteomics as a tool to study biological effects of mobile phone radiation, using the human endothelial cell line EA.hy926 as a model. Exposure of EA.hy926 cells to 900 MHz GSM radiation caused statistically significant changes in the expression of numerous proteins. However, exposure of EA.hy926 cells to an 1800 MHz GSM signal had only a very small effect on the cell proteome, as compared with 900 MHz GSM exposure. In the present study, using human primary endothelial cells as a model, we have examined whether exposure to 1800 MHz GSM mobile phone radiation can affect the cell proteome. Primary human umbilical vein endothelial cells and primary human brain microvascular endothelial cells were exposed for 1 hour to 1800 MHz GSM mobile phone radiation at an average specific absorption rate of 2.0 W/kg. The cells were harvested immediately after the exposure and the protein expression patterns of the sham-exposed and radiation-exposed cells were examined using two dimensional difference gel electrophoresis-based proteomics (2DE-DIGE). Numerous differences were observed between the proteomes of human umbilical vein endothelial cells and human brain microvascular endothelial cells (both sham-exposed). These differences most likely represent physiological differences between endothelia in different vascular beds. However, the exposure of both types of primary endothelial cells to mobile phone radiation did not cause any statistically significant changes in protein expression.
Exposure of primary human endothelial cells to mobile phone radiation, an 1800 MHz GSM signal for 1 hour at an average specific absorption rate of 2.0 W/kg, does not affect protein expression when the proteomes are examined immediately after the end of the exposure and when the false discovery rate correction is applied to the analysis. This observation agrees with our earlier study showing that the 1800 MHz GSM radiation exposure had only a very limited effect on the proteome of the human endothelial cell line EA.hy926, as compared with the effect of 900 MHz GSM radiation.
Mohien, Ceereena Ubaida; Colquhoun, David R.; Mathias, Derrick K.; Gibbons, John G.; Armistead, Jennifer S.; Rodriguez, Maria C.; Rodriguez, Mario Henry; Edwards, Nathan J.; Hartler, Jürgen; Thallinger, Gerhard G.; Graham, David R.; Martinez-Barnetche, Jesus; Rokas, Antonis; Dinglasan, Rhoel R.
2013-01-01
Malaria morbidity and mortality caused by both Plasmodium falciparum and Plasmodium vivax extend well beyond the African continent, and although P. vivax causes between 80 and 300 million severe cases each year, vivax transmission remains poorly understood. Plasmodium parasites are transmitted by Anopheles mosquitoes, and the critical site of interaction between parasite and host is at the mosquito's luminal midgut brush border. Although the genome of the “model” African P. falciparum vector, Anopheles gambiae, has been sequenced, evolutionary divergence limits its utility as a reference across anophelines, especially non-sequenced P. vivax vectors such as Anopheles albimanus. Clearly, technologies and platforms that bridge this substantial scientific gap are required in order to provide public health scientists with key transcriptomic and proteomic information that could spur the development of novel interventions to combat this disease. To our knowledge, no approaches have been published that address this issue. To bolster our understanding of P. vivax–An. albimanus midgut interactions, we developed an integrated bioinformatic-hybrid RNA-Seq-LC-MS/MS approach involving An. albimanus transcriptome (15,764 contigs) and luminal midgut subproteome (9,445 proteins) assembly, which, when used with our custom Diptera protein database (685,078 sequences), facilitated a comparative proteomic analysis of the midgut brush borders of two important malaria vectors, An. gambiae and An. albimanus. PMID:23082028
An emerging cyberinfrastructure for biodefense pathogen and pathogen–host data
Zhang, C.; Crasta, O.; Cammer, S.; Will, R.; Kenyon, R.; Sullivan, D.; Yu, Q.; Sun, W.; Jha, R.; Liu, D.; Xue, T.; Zhang, Y.; Moore, M.; McGarvey, P.; Huang, H.; Chen, Y.; Zhang, J.; Mazumder, R.; Wu, C.; Sobral, B.
2008-01-01
The NIAID-funded Biodefense Proteomics Resource Center (RC) provides storage, dissemination, visualization and analysis capabilities for the experimental data deposited by seven Proteomics Research Centers (PRCs). The data and their publication are intended to support researchers working to discover candidates for the next generation of vaccines, therapeutics and diagnostics against NIAID's Category A, B and C priority pathogens. The data include transcriptional profiles, protein profiles, protein structural data and host–pathogen protein interactions, in the context of the pathogen life cycle in vivo and in vitro. The database has stored and supported host or pathogen data derived from Bacillus, Brucella, Cryptosporidium, Salmonella, SARS, Toxoplasma, Vibrio and Yersinia, human tissue libraries, and mouse macrophages. These publicly available data cover diverse data types such as mass spectrometry, yeast two-hybrid (Y2H), gene expression profiles, X-ray and NMR determined protein structures and protein expression clones. The growing database covers over 23 000 unique genes/proteins from different experiments and organisms. All of the genes/proteins are annotated and integrated across experiments using UniProt Knowledgebase (UniProtKB) accession numbers. The web interface for the database enables searching, querying and downloading at the level of experiment, group and individual gene(s)/protein(s) via UniProtKB accession numbers or protein function keywords. The system is accessible at http://www.proteomicsresource.org/. PMID:17984082
Lundquist, Peter K.; Poliakov, Anton; Bhuiyan, Nazmul H.; Zybailov, Boris; Sun, Qi; van Wijk, Klaas J.
2012-01-01
Plastoglobules (PGs) in chloroplasts are thylakoid-associated monolayer lipoprotein particles containing prenyl and neutral lipids and several dozen proteins mostly with unknown functions. An integrated view of the role of the PG is lacking. Here, we better define the PG proteome and provide a conceptual framework for further studies. The PG proteome from Arabidopsis (Arabidopsis thaliana) leaf chloroplasts was determined by mass spectrometry of isolated PGs and quantitative comparison with the proteomes of unfractionated leaves, thylakoids, and stroma. Scanning electron microscopy showed the purity and size distribution of the isolated PGs. Compared with previous PG proteome analyses, we excluded several proteins and identified six new PG proteins, including an M48 metallopeptidase and two Absence of bc1 complex (ABC1) atypical kinases, confirmed by immunoblotting. This refined PG proteome consisted of 30 proteins, including six ABC1 kinases and seven fibrillins together comprising more than 70% of the PG protein mass. Other fibrillins were located predominantly in the stroma or thylakoid and not in PGs; we discovered that this partitioning can be predicted by their isoelectric point and hydrophobicity. A genome-wide coexpression network for the PG genes was then constructed from mRNA expression data. This revealed a modular network with four distinct modules that each contained at least one ABC1K and/or fibrillin gene. Each module showed clear enrichment in specific functions, including chlorophyll degradation/senescence, isoprenoid biosynthesis, plastid proteolysis, and redox regulators and phosphoregulators of electron flow. We propose a new testable model for the PGs, in which sets of genes are associated with specific PG functions. PMID:22274653
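The observation that fibrillin partitioning can be predicted from isoelectric point and hydrophobicity suggests a simple sequence-derived descriptor. The sketch below computes only a hydrophobicity score, the grand average of hydropathy (GRAVY) on the Kyte-Doolittle scale. The abstract does not state which hydrophobicity measure the authors used, so this choice is an assumption, and the peptide strings are made-up examples rather than fibrillin sequences.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Mean Kyte-Doolittle hydropathy of a protein sequence (GRAVY).

    Assumption: GRAVY stands in for the unspecified hydrophobicity
    measure of the abstract. Non-standard residues raise KeyError.
    """
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence) / len(sequence)


# Hypothetical toy peptides: one hydrophobic, one charged.
print(round(gravy("LLVVAAII"), 2))
print(round(gravy("KDEKRDEK"), 2))
```

A higher (more positive) score indicates a more hydrophobic sequence. A predictor of the kind described would combine such a score with the isoelectric point, which is not computed here.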
Proteomic analysis of bovine nucleolus.
Patel, Amrutlal K; Olson, Doug; Tikoo, Suresh K
2010-09-01
The nucleolus is the most prominent subnuclear structure and performs a wide variety of functions in eukaryotic cellular processes. In order to understand the structural and functional role of the nucleoli in bovine cells, we analyzed the proteomic composition of the bovine nucleoli. The nucleoli were isolated from Madin Darby bovine kidney cells and subjected to proteomic analysis by LC-MS/MS after fractionation by SDS-PAGE and strong cation exchange chromatography. Analysis of the data using the Mascot database search and the GPM database search identified 311 proteins in the bovine nucleoli, including 22 proteins not previously identified in the proteomic analysis of human nucleoli. Analysis of the identified proteins using the GoMiner software suggested that the bovine nucleoli contained proteins involved in ribosomal biogenesis, cell cycle control, transcriptional, translational and post-translational regulation, transport, and structural organization. Copyright © 2010 Beijing Genomics Institute. Published by Elsevier Ltd. All rights reserved.
A HUPO test sample study reveals common problems in mass spectrometry-based proteomics
Bell, Alexander W.; Deutsch, Eric W.; Au, Catherine E.; Kearney, Robert E.; Beavis, Ron; Sechi, Salvatore; Nilsson, Tommy; Bergeron, John J.M.
2009-01-01
We carried out a test sample study to try to identify errors leading to irreproducibility, including incompleteness of peptide sampling, in LC-MS-based proteomics. We distributed a test sample consisting of an equimolar mix of 20 highly purified recombinant human proteins, to 27 laboratories for identification. Each protein contained one or more unique tryptic peptides of 1250 Da to also test for ion selection and sampling in the mass spectrometer. Of the 27 labs, initially only 7 labs reported all 20 proteins correctly, and only 1 lab reported all the tryptic peptides of 1250 Da. Nevertheless, a subsequent centralized analysis of the raw data revealed that all 20 proteins and most of the 1250 Da peptides had in fact been detected by all 27 labs. The centralized analysis allowed us to determine sources of problems encountered in the study, which include missed identifications (false negatives), environmental contamination, database matching, and curation of protein identifications. Improved search engines and databases are likely to increase the fidelity of mass spectrometry-based proteomics. PMID:19448641
jqcML: an open-source java API for mass spectrometry quality control data in the qcML format.
Bittremieux, Wout; Kelchtermans, Pieter; Valkenborg, Dirk; Martens, Lennart; Laukens, Kris
2014-07-03
The awareness that systematic quality control is an essential factor to enable the growth of proteomics into a mature analytical discipline has increased over the past few years. To this aim, a controlled vocabulary and document structure have recently been proposed by Walzer et al. to store and disseminate quality-control metrics for mass-spectrometry-based proteomics experiments, called qcML. To facilitate the adoption of this standardized quality control routine, we introduce jqcML, a Java application programming interface (API) for the qcML data format. First, jqcML provides a complete object model to represent qcML data. Second, jqcML provides the ability to read, write, and work in a uniform manner with qcML data from different sources, including the XML-based qcML file format and the relational database qcDB. Interaction with the XML-based file format is obtained through the Java Architecture for XML Binding (JAXB), while generic database functionality is obtained by the Java Persistence API (JPA). jqcML is released as open-source software under the permissive Apache 2.0 license and can be downloaded from https://bitbucket.org/proteinspector/jqcml .
An object model and database for functional genomics.
Jones, Andrew; Hunt, Ela; Wastling, Jonathan M; Pizarro, Angel; Stoeckert, Christian J
2004-07-10
Large-scale functional genomics analysis is now feasible and presents significant challenges in data analysis, storage and querying. Data standards are required to enable the development of public data repositories and to improve data sharing. There is an established data format for microarrays (microarray gene expression markup language, MAGE-ML) and a draft standard for proteomics (PEDRo). We believe that all types of functional genomics experiments should be annotated in a consistent manner, and we hope to open up new ways of comparing multiple datasets used in functional genomics. We have created a functional genomics experiment object model (FGE-OM), developed from the microarray model, MAGE-OM and two models for proteomics, PEDRo and our own model (Gla-PSI-Glasgow Proposal for the Proteomics Standards Initiative). FGE-OM comprises three namespaces representing (i) the parts of the model common to all functional genomics experiments; (ii) microarray-specific components; and (iii) proteomics-specific components. We believe that FGE-OM should initiate discussion about the contents and structure of the next version of MAGE and the future of proteomics standards. A prototype database called RNA And Protein Abundance Database (RAPAD), based on FGE-OM, has been implemented and populated with data from microbial pathogenesis. FGE-OM and the RAPAD schema are available from http://www.gusdb.org/fge.html, along with a set of more detailed diagrams. RAPAD can be accessed by registration at the site.
Wang, Hualin; Sit, Wat-Hung; Tipoe, George Lim; Liu, Zhiguo; Wan, Jennifer Man-Fan
2017-02-01
The influence of dietary fatty acids on the progression of chronic liver diseases has attracted considerable attention, but the mechanisms by which lipids rich in saturated fatty acids or PUFAs affect hepatic fibrogenesis remain unclear. Female Fischer 344 rats were fed normal chow or chow plus 20% (w/w) of corn oil or lard, respectively, and injected with CCl4 twice a week for 4 weeks to induce liver fibrosis. Masson's staining was used to assess the level of fibrosis. The mRNA expression level of α-SMA and the DNA methylation level of its promoter region were analyzed. A 2-DE gel-based proteomic approach was used to investigate differential expression of the hepatic proteome among the three diet groups. Histological evaluation and α-SMA expression analysis showed that high corn oil intake had no effect on hepatic fibrogenesis, whereas lard intake aggravated liver fibrosis, which was partly attributed to DNA demethylation of the α-SMA promoter region. The 2-DE gel-based proteomic study demonstrated that excessive lard consumption elevated the expression of the fibrosis-related alpha-1-antitrypsin precursor and of endoplasmic reticulum stress-related proteins such as heat shock cognate 71 kDa, eukaryotic translation initiation factor 4A1 and protein disulfide isomerase associated 3. Moreover, unlike corn oil rich in PUFAs, lard did not elevate the expression of glutathione S-transferases, but decreased the expression of the iron storage-related proteins heme binding protein 1 and ferritin. Lard intake aggravates CCl4-induced liver fibrosis by enhancing the expression of fibrogenesis- and ER stress-related proteins and by disturbing the hepatic transmethylation reaction. Copyright © 2015 Elsevier Ltd and European Society for Clinical Nutrition and Metabolism. All rights reserved.
Kumar, Amit; Thotakura, Pragna Lakshmi; Tiwary, Basant Kumar; Krishna, Ramadas
2016-05-12
Fusobacterium nucleatum, a well studied bacterium in periodontal diseases, appendicitis, gingivitis, osteomyelitis and pregnancy complications has recently gained attention due to its association with colorectal cancer (CRC) progression. Treatment with berberine was shown to reverse F. nucleatum-induced CRC progression in mice by balancing the growth of opportunistic pathogens in tumor microenvironment. Intestinal microbiota imbalance and the infections caused by F. nucleatum might be regulated by therapeutic intervention. Hence, we aimed to predict drug target proteins in F. nucleatum, through subtractive genomics approach and host-pathogen protein-protein interactions (HP-PPIs). We also carried out enrichment analysis of host interacting partners to hypothesize the possible mechanisms involved in CRC progression due to F. nucleatum. In subtractive genomics approach, the essential, virulence and resistance related proteins were retrieved from RefSeq proteome of F. nucleatum by searching against Database of Essential Genes (DEG), Virulence Factor Database (VFDB) and Antibiotic Resistance Gene-ANNOTation (ARG-ANNOT) tool respectively. A subsequent hierarchical screening to identify non-human homologous, metabolic pathway-independent/pathway-specific and druggable proteins resulted in eight pathway-independent and 27 pathway-specific druggable targets. Co-aggregation of F. nucleatum with host induces proinflammatory gene expression thereby potentiates tumorigenesis. Hence, proteins from IBDsite, a database for inflammatory bowel disease (IBD) research and those involved in colorectal adenocarcinoma as interpreted from The Cancer Genome Atlas (TCGA) were retrieved to predict drug targets based on HP-PPIs with F. nucleatum proteome. Prediction of HP-PPIs exhibited 186 interactions contributed by 103 host and 76 bacterial proteins. Bacterial interacting partners were accounted as putative targets. 
Enrichment analysis of host interacting partners showed statistically enriched terms that were positively correlated with CRC, atherosclerosis, cardiovascular disease, osteoporosis, Alzheimer's disease and other diseases. Subtractive genomics analysis provided a set of target proteins suggested to be indispensable for survival and pathogenicity of F. nucleatum. These target proteins might be considered for designing potent inhibitors to abrogate F. nucleatum infections. From the enrichment analysis, it was hypothesized that F. nucleatum infection might enhance CRC progression by simultaneously regulating multiple signaling cascades, which could lead to up-regulation of proinflammatory responses and oncogenes, modulation of the host immune defense mechanism and suppression of the DNA repair system.
Shah, Anup D; Inder, Kerry L; Shah, Alok K; Cristino, Alexandre S; McKie, Arthur B; Gabra, Hani; Davis, Melissa J; Hill, Michelle M
2016-10-07
Lipid rafts are dynamic membrane microdomains that orchestrate molecular interactions and are implicated in cancer development. To understand the functions of lipid rafts in cancer, we performed an integrated analysis of quantitative lipid raft proteomics data sets modeling progression in breast cancer, melanoma, and renal cell carcinoma. This analysis revealed that cancer development is associated with increased membrane raft-cytoskeleton interactions, with ∼40% of elevated lipid raft proteins being cytoskeletal components. Previous studies suggest a potential functional role for the raft-cytoskeleton in the action of the putative tumor suppressors PTRF/Cavin-1 and Merlin. To extend the observation, we examined lipid raft proteome modulation by an unrelated tumor suppressor opioid binding protein cell-adhesion molecule (OPCML) in ovarian cancer SKOV3 cells. In agreement with the other model systems, quantitative proteomics revealed that 39% of OPCML-depleted lipid raft proteins are cytoskeletal components, with microfilaments and intermediate filaments specifically down-regulated. Furthermore, protein-protein interaction network and simulation analysis showed significantly higher interactions among cancer raft proteins compared with general human raft proteins. Collectively, these results suggest increased cytoskeleton-mediated stabilization of lipid raft domains with greater molecular interactions as a common, functional, and reversible feature of cancer cells.
The Proteome Folding Project: Proteome-scale prediction of structure and function
Drew, Kevin; Winters, Patrick; Butterfoss, Glenn L.; Berstis, Viktors; Uplinger, Keith; Armstrong, Jonathan; Riffle, Michael; Schweighofer, Erik; Bovermann, Bill; Goodlett, David R.; Davis, Trisha N.; Shasha, Dennis; Malmström, Lars; Bonneau, Richard
2011-01-01
The incompleteness of proteome structure and function annotation is a critical problem for biologists and, in particular, severely limits interpretation of high-throughput and next-generation experiments. We have developed a proteome annotation pipeline based on structure prediction, where function and structure annotations are generated using an integration of sequence comparison, fold recognition, and grid-computing-enabled de novo structure prediction. We predict protein domain boundaries and three-dimensional (3D) structures for protein domains from 94 genomes (including human, Arabidopsis, rice, mouse, fly, yeast, Escherichia coli, and worm). De novo structure predictions were distributed on a grid of more than 1.5 million CPUs worldwide (World Community Grid). We generated significant numbers of new confident fold annotations (9% of domains that are otherwise unannotated in these genomes). We demonstrate that predicted structures can be combined with annotations from the Gene Ontology database to predict new and more specific molecular functions. PMID:21824995
Bioinformatics for spermatogenesis: annotation of male reproduction based on proteomics
Zhou, Tao; Zhou, Zuo-Min; Guo, Xue-Jiang
2013-01-01
Proteomics strategies have been widely used in the field of male reproduction, both in basic and clinical research. Bioinformatics methods are indispensable in proteomics-based studies and are used for data presentation, database construction and functional annotation. In the present review, we focus on the functional annotation of gene lists obtained through qualitative or quantitative methods, summarizing the common and male reproduction specialized proteomics databases. We introduce several integrated tools used to find the hidden biological significance from the data obtained. We further describe in detail the information on male reproduction derived from Gene Ontology analyses, pathway analyses and biomedical analyses. We provide an overview of bioinformatics annotations in spermatogenesis, from gene function to biological function and from biological function to clinical application. On the basis of recently published proteomics studies and associated data, we show that bioinformatics methods help us to discover drug targets for sperm motility and to scan for cancer-testis genes. In addition, we summarize the online resources relevant to male reproduction research for the exploration of the regulation of spermatogenesis. PMID:23852026
Geetha, Thangiah; Langlais, Paul; Luo, Moulun; Mapes, Rebekka; Lefort, Natalie; Chen, Shu-Chuan; Mandarino, Lawrence J.; Yi, Zhengping
2011-03-01
Protein-protein interactions are key to most cellular processes. Tandem mass spectrometry (MS/MS)-based proteomics combined with co-immunoprecipitation (CO-IP) has emerged as a powerful approach for studying protein complexes. However, a majority of systematic proteomics studies on protein-protein interactions involve the use of protein overexpression and/or epitope-tagged bait proteins, which might affect binding stoichiometry and lead to more false positives. Here, we report an application of a straightforward, label-free CO-IP-MS/MS method, without the use of protein overexpression or protein tags, to the investigation of changes in the abundance of endogenous proteins associated with a bait protein, in this case insulin receptor substrate-1 (IRS-1), under basal and insulin-stimulated conditions. IRS-1 plays a central role in the insulin signaling cascade. Defects in the protein-protein interactions involving IRS-1 may lead to the development of insulin resistance and type 2 diabetes. HPLC-ESI-MS/MS analyses reproducibly identified eleven novel endogenous insulin-stimulated IRS-1 interaction partners in L6 myotubes, including proteins that play an important role in protein dephosphorylation [protein phosphatase 1 regulatory subunit 12A (PPP1R12A)], muscle contraction and actin cytoskeleton rearrangement, endoplasmic reticulum stress, and protein folding, as well as protein synthesis. This novel application of label-free CO-IP-MS/MS quantification to assess endogenous interaction partners of a specific protein will prove useful for understanding how various cell stimuli regulate insulin signal transduction.
Dellaire, G.; Farrall, R.; Bickmore, W.A.
2003-01-01
The Nuclear Protein Database (NPD) is a curated database that contains information on more than 1300 vertebrate proteins that are thought, or are known, to localise to the cell nucleus. Each entry is annotated with information on predicted protein size and isoelectric point, as well as any repeats, motifs or domains within the protein sequence. In addition, information on the sub-nuclear localisation of each protein is provided and the biological and molecular functions are described using Gene Ontology (GO) terms. The database is searchable by keyword, protein name, sub-nuclear compartment and protein domain/motif. Links to other databases are provided (e.g. Entrez, SWISS-PROT, OMIM, PubMed, PubMed Central). Thus, NPD provides a gateway through which the nuclear proteome may be explored. The database can be accessed at http://npd.hgu.mrc.ac.uk and is updated monthly. PMID:12520015
Gacesa, Ranko; Zucko, Jurica; Petursdottir, Solveig K; Gudmundsdottir, Elisabet Eik; Fridjonsson, Olafur H; Diminic, Janko; Long, Paul F; Cullum, John; Hranueli, Daslav; Hreggvidsson, Gudmundur O; Starcevic, Antonio
2017-06-01
The MEGGASENSE platform constructs relational databases of DNA or protein sequences. The default functional analysis uses 14 106 hidden Markov model (HMM) profiles based on sequences in the KEGG database. The Solr search engine allows sophisticated queries and a BLAST search function is also incorporated. These standard capabilities were used to generate the SCATT database from the predicted proteome of Streptomyces cattleya. The implementation of a specialised metagenome database (AMYLOMICS) for bioprospecting of carbohydrate-modifying enzymes is described. In addition to standard assembly of reads, a novel 'functional' assembly was developed, in which screening of reads with the HMM profiles occurs before the assembly. The AMYLOMICS database incorporates additional HMM profiles for carbohydrate-modifying enzymes and it is illustrated how the combination of HMM and BLAST analyses helps identify interesting genes. A variety of different proteome and metagenome databases have been generated by MEGGASENSE.
Quantitative proteomics reveals the central changes of wheat in response to powdery mildew.
Fu, Ying; Zhang, Hong; Mandal, Siddikun Nabi; Wang, Changyou; Chen, Chunhuan; Ji, Wanquan
2016-01-01
Powdery mildew (Pm), caused by Blumeria graminis f. sp. tritici (Bgt), is one of the most important crop diseases, causing severe economic losses to wheat production worldwide. However, there are few reports about the proteomic response to Bgt infection in resistant wheat. Hence, quantitative proteomic analysis of N9134, a resistant wheat line, was performed to explore the molecular mechanism of wheat in defense against Bgt. Comparing the leaf proteins of Bgt-inoculated N9134 with that of mock-inoculated controls, a total of 2182 protein-species were quantified by iTRAQ at 24, 48 and 72h postinoculation (hpi) with Bgt, of which 394 showed differential accumulation. These differentially accumulated protein-species (DAPs) mainly included pathogenesis-related (PR) polypeptides, oxidative stress responsive proteins and components involved in primary metabolic pathways. KEGG enrichment analysis showed that phenylpropanoid biosynthesis, phenylalanine metabolism and photosynthesis-antenna proteins were the key pathways in response to Bgt infection. InterProScan 5 and the Gibbs Motif Sampler cluster 394 DAPs into eight conserved motifs, which shared leucine repeats and histidine sites in the sequence motifs. Moreover, eight separate protein-protein interaction (PPI) networks were predicted from STRING database. This study provides a powerful platform for further exploration of the molecular mechanism underlying resistant wheat responding to Bgt. Powdery mildew, caused by Blumeria graminis f. sp. tritici (Bgt), is a destructive pathogenic disease in wheat-producing regions worldwide, resulting in severe yield reductions. Although many resistant wheat varieties have been cultivated, there are few reports about the proteomic response to Bgt infection in resistant wheat. Therefore, an iTRAQ-based quantitative proteomic analysis of a resistant wheat line (N9134) in response to Bgt infection has been performed. 
This paper provides new insights into the underlying molecular mechanism of wheat in response to Bgt. The proteomic analysis can significantly narrow the field of potential defense-related protein-species, and is conducive to recognize the critical or effector protein under Bgt infection more precisely. Taken together, large amounts of high-throughput data provide a powerful platform for further exploration of the molecular mechanism on wheat-Bgt interactions. Copyright © 2015 Elsevier B.V. All rights reserved.
A Reference Proteomic Database of Lactobacillus plantarum CMCC-P0002
Tian, Wanhong; Yu, Gang; Liu, Xiankai; Wang, Jie; Feng, Erling; Zhang, Xuemin; Chen, Bei; Zeng, Ming; Wang, Hengliang
2011-01-01
Lactobacillus plantarum is a widespread probiotic bacteria found in many fermented food products. In this study, the whole-cell proteins and secretory proteins of L. plantarum were separated by two-dimensional electrophoresis method. A total of 434 proteins were identified by tandem mass spectrometry, including a plasmid-encoded hypothetical protein pLP9000_05. The information of first 20 highest abundance proteins was listed for the further genetic manipulation of L. plantarum, such as construction of high-level expressions system. Furthermore, the first interaction map of L. plantarum was established by Blue-Native/SDS-PAGE technique. A heterodimeric complex composed of maltose phosphorylase Map3 and Map2, and two homodimeric complexes composed of Map3 and Map2 respectively, were identified at the same time, indicating the important roles of these proteins. These findings provided valuable information for the further proteomic researches of L. plantarum. PMID:21998671
Evidence for a vast peptide overlap between West Nile virus and human proteomes.
Capone, Giovanni; Pagoni, Maria; Delfino, Antonella Pesce; Kanduc, Darja
2013-10-01
The primary amino acid sequence of West Nile virus (WNV) polyprotein, GenBank accession number M12294, was analyzed by computational biology. WNV is a mosquito-borne neurotropic flavivirus that has emerged globally as a significant cause of viral encephalitis in humans. Using pentapeptides as scanning units and the perfect peptide match program from PIR International Protein Sequence Database, we compared the WNV polyprotein and the human proteome. WNV polyprotein showed significant sequence similarities to a number of human proteins. Several of these proteins are involved in embryogenesis, neurite outgrowth, cortical neuron branching, formation of mature synapses, semaphorin interactions, and voltage dependent L-type calcium channel subunits. The biocomputational study suggests that common amino acid segments might represent a potential platform for further studies on the neurological pathophysiology of WNV infections. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
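The pentapeptide scan described above can be sketched as a sliding-window exact match. This is a minimal illustration of the idea, not the PIR perfect peptide match program used by the authors. The function names are hypothetical and the sequences are short made-up strings, not the WNV polyprotein or real human proteins.

```python
def kmers(sequence, k=5):
    """Return the set of all overlapping k-residue windows in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def shared_kmers(viral_sequence, host_proteins, k=5):
    """Map each viral k-mer that also occurs in a host protein to the
    names of the host proteins containing it (exact matches only).

    host_proteins: dict of {protein_name: amino_acid_sequence}.
    """
    viral_set = kmers(viral_sequence, k)
    hits = {}
    for name, sequence in host_proteins.items():
        # Set intersection keeps only peptides present in both sequences.
        for peptide in kmers(sequence, k) & viral_set:
            hits.setdefault(peptide, set()).add(name)
    return hits


# Toy input (hypothetical sequences, for illustration only).
toy_viral = "MKTLLVAGSA"
toy_host = {"hostA": "QQTLLVAPP", "hostB": "GGGGGGG"}
print(shared_kmers(toy_viral, toy_host))
```

Because a pentapeptide is short, exact matches between a long polyprotein and an entire proteome are expected to some degree by chance, so a count of shared pentapeptides would normally be compared against a background expectation before drawing biological conclusions.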
Covalent inhibitors: an opportunity for rational target selectivity.
Lagoutte, Roman; Patouret, Remi; Winssinger, Nicolas
2017-08-01
There is a resurging interest in compounds that engage their target through covalent interactions. Cysteine's thiol is endowed with enhanced reactivity, making it the nucleophile of choice for covalent engagement with a ligand aligning an electrophilic trap with a cysteine residue in a target of interest. The paucity of cysteine in the proteome coupled to the fact that closely related proteins do not necessarily share a given cysteine residue enable a level of unprecedented rational target selectivity. The recent demonstration that a lysine's amine can also be engaged covalently with a mild electrophile extends the potential of covalent inhibitors. The growing database of protein structures facilitates the discovery of covalent inhibitors while the advent of proteomic technologies enables a finer resolution in the selectivity of covalently engaged proteins. Here, we discuss recent examples of discovery and design of covalent inhibitors. Copyright © 2017 Elsevier Ltd. All rights reserved.
A reference proteomic database of Lactobacillus plantarum CMCC-P0002.
Zhu, Li; Hu, Wei; Liu, Datao; Tian, Wanhong; Yu, Gang; Liu, Xiankai; Wang, Jie; Feng, Erling; Zhang, Xuemin; Chen, Bei; Zeng, Ming; Wang, Hengliang
2011-01-01
Lactobacillus plantarum is a widespread probiotic bacteria found in many fermented food products. In this study, the whole-cell proteins and secretory proteins of L. plantarum were separated by two-dimensional electrophoresis method. A total of 434 proteins were identified by tandem mass spectrometry, including a plasmid-encoded hypothetical protein pLP9000_05. The information of first 20 highest abundance proteins was listed for the further genetic manipulation of L. plantarum, such as construction of high-level expressions system. Furthermore, the first interaction map of L. plantarum was established by Blue-Native/SDS-PAGE technique. A heterodimeric complex composed of maltose phosphorylase Map3 and Map2, and two homodimeric complexes composed of Map3 and Map2 respectively, were identified at the same time, indicating the important roles of these proteins. These findings provided valuable information for the further proteomic researches of L. plantarum.
Saucereau, Yoann; Valiente Moro, Claire; Dieryckx, Cindy; Dupuy, Jean-William; Tran, Florence-Hélène; Girard, Vincent; Potier, Patrick; Mavingui, Patrick
2017-08-18
Aedes albopictus is a vector of arboviruses that cause severe diseases in humans such as Chikungunya, Dengue and Zika fevers. The vector competence of Ae. albopictus varies depending on the mosquito population involved and the virus transmitted. Wolbachia infection status is believed to be among the key elements that determine viral transmission efficiency. Little is known about the cellular functions mobilized in Ae. albopictus during co-infection by Wolbachia and a given arbovirus. To decipher this tripartite interaction at the molecular level, we performed a proteome analysis in Ae. albopictus C6/36 cells mono-infected by Wolbachia wAlbB strain or Chikungunya virus (CHIKV), and bi-infected. We first confirmed significant inhibition of CHIKV by Wolbachia. Using two-dimensional gel electrophoresis followed by nano liquid chromatography coupled with tandem mass spectrometry, we identified 600 unique differentially expressed proteins mostly related to glycolysis, translation and protein metabolism. Wolbachia infection had a greater impact on cellular functions than CHIKV infection, inducing either up- or down-regulation of proteins associated with metabolic processes such as glycolysis and ATP metabolism, or structural glycoproteins and capsid proteins in the case of bi-infection with CHIKV. CHIKV infection inhibited expression of proteins linked with the processes of transcription, translation, lipid storage and miRNA pathways. The results of our proteome profiling have provided new insights into the molecular pathways involved in the tripartite Ae. albopictus-Wolbachia-CHIKV interaction and may help define targets for the better implementation of Wolbachia-based strategies for disease transmission control.
Broadening the horizon – level 2.5 of the HUPO-PSI format for molecular interactions
Kerrien, Samuel; Orchard, Sandra; Montecchi-Palazzi, Luisa; Aranda, Bruno; Quinn, Antony F; Vinod, Nisha; Bader, Gary D; Xenarios, Ioannis; Wojcik, Jérôme; Sherman, David; Tyers, Mike; Salama, John J; Moore, Susan; Ceol, Arnaud; Chatr-aryamontri, Andrew; Oesterheld, Matthias; Stümpflen, Volker; Salwinski, Lukasz; Nerothin, Jason; Cerami, Ethan; Cusick, Michael E; Vidal, Marc; Gilson, Michael; Armstrong, John; Woollard, Peter; Hogue, Christopher; Eisenberg, David; Cesareni, Gianni; Apweiler, Rolf; Hermjakob, Henning
2007-01-01
Background Molecular interaction information is a key resource in modern biomedical research. Publicly available data have previously been provided in a broad array of diverse formats, making access to these data very difficult. The publication and wide implementation of the Human Proteome Organisation Proteomics Standards Initiative Molecular Interactions (HUPO PSI-MI) format in 2004 was a major step towards the establishment of a single, unified format by which molecular interactions should be presented, but it focused purely on protein-protein interactions. Results The HUPO-PSI has further developed the PSI-MI XML schema to enable the description of interactions between a wider range of molecular types, for example nucleic acids, chemical entities, and molecular complexes. Extensive details about each supported molecular interaction can now be captured, including the biological role of each molecule within that interaction, detailed description of interacting domains, and the kinetic parameters of the interaction. The format is supported by data management and analysis tools and has been adopted by major interaction data providers. Additionally, a simpler, tab-delimited format, MITAB2.5, has been developed for the benefit of users who require only minimal information in an easy-to-access configuration. Conclusion The PSI-MI XML2.5 and MITAB2.5 formats have been jointly developed by interaction data producers and providers from both the academic and commercial sectors, and are already widely implemented and well supported by an active development community. PSI-MI XML2.5 enables the description of highly detailed molecular interaction data and facilitates data exchange between databases and users without loss of information. MITAB2.5 is a simpler format appropriate for fast Perl parsing or loading into Microsoft Excel. PMID:17925023
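Because MITAB2.5 is plain tab-delimited text, it can be read with a few lines of code. The following is a minimal sketch in Python (the abstract itself mentions Perl), assuming the 15-column MITAB 2.5 layout, "|" as the separator for multiple values in a field, and "-" for an empty field. The column labels, the function name parse_mitab25 and the sample row are illustrative assumptions, not part of the abstract.

```python
import csv
import io

# Labels for the 15 tab-separated MITAB 2.5 columns, in file order.
# The Python names are our own shorthand, not part of the format.
MITAB25_COLUMNS = [
    "id_a", "id_b", "alt_ids_a", "alt_ids_b", "aliases_a", "aliases_b",
    "detection_methods", "first_authors", "publications",
    "taxid_a", "taxid_b", "interaction_types", "source_dbs",
    "interaction_ids", "confidence",
]


def parse_mitab25(handle):
    """Yield one dict per interaction row; every field becomes a list of values."""
    # QUOTE_NONE keeps the literal double quotes used inside MITAB fields.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and the optional header line
        if len(row) < len(MITAB25_COLUMNS):
            raise ValueError(f"expected at least 15 columns, got {len(row)}")
        # Naive split on "|": a sketch-level simplification that would
        # mis-split a value containing a pipe inside quotes.
        yield {
            name: ([] if cell == "-" else cell.split("|"))
            for name, cell in zip(MITAB25_COLUMNS, row)
        }


# Synthetic row for illustration only; the identifiers are placeholders.
SAMPLE = "\t".join([
    "uniprotkb:P00001", "uniprotkb:P00002", "-", "-", "-", "-",
    'psi-mi:"MI:0018"(two hybrid)', "Doe et al. (2007)", "pubmed:0000000",
    "taxid:9606(human)", "taxid:9606(human)",
    'psi-mi:"MI:0915"(physical association)', 'psi-mi:"MI:0469"(IntAct)',
    "intact:EBI-0000000", "-",
]) + "\n"

for record in parse_mitab25(io.StringIO(SAMPLE)):
    print(record["id_a"], record["id_b"], record["detection_methods"])
```

Returning every field as a list keeps single-valued and multi-valued columns uniform, at the cost of an extra index when only one value is expected.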
Rudnick, Paul A.; Markey, Sanford P.; Roth, Jeri; Mirokhin, Yuri; Yan, Xinjian; Tchekhovskoi, Dmitrii V.; Edwards, Nathan J.; Thangudu, Ratna R.; Ketchum, Karen A.; Kinsinger, Christopher R.; Mesri, Mehdi; Rodriguez, Henry; Stein, Stephen E.
2016-01-01
The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has produced large proteomics datasets from the mass spectrometric interrogation of tumor samples previously analyzed by The Cancer Genome Atlas (TCGA) program. The availability of the genomic and proteomic data is enabling proteogenomic study for both reference (i.e., contained in major sequence databases) and non-reference markers of cancer. The CPTAC labs have focused on colon, breast, and ovarian tissues in the first round of analyses; spectra from these datasets were produced from 2D LC-MS/MS analyses and represent deep coverage. To reduce the variability introduced by disparate data analysis platforms (e.g., software packages, versions, parameters, sequence databases, etc.), the CPTAC Common Data Analysis Platform (CDAP) was created. The CDAP produces both peptide-spectrum-match (PSM) reports and gene-level reports. The pipeline processes raw mass spectrometry data according to the following: (1) Peak-picking and quantitative data extraction, (2) database searching, (3) gene-based protein parsimony, and (4) false discovery rate (FDR)-based filtering. The pipeline also produces localization scores for the phosphopeptide enrichment studies using the PhosphoRS program. Quantitative information for each of the datasets is specific to the sample processing, with PSM and protein reports containing the spectrum-level or gene-level (“rolled-up”) precursor peak areas and spectral counts for label-free or reporter ion log-ratios for 4plex iTRAQ™. The reports are available in simple tab-delimited formats and, for the PSM-reports, in mzIdentML. The goal of the CDAP is to provide standard, uniform reports for all of the CPTAC data, enabling comparisons between different samples and cancer types as well as across the major ‘omics fields. PMID:26860878
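Step (4) of the pipeline, FDR-based filtering, can be illustrated with a small sketch. The abstract does not say how the CDAP estimates the false discovery rate, so the code below assumes the widely used target-decoy approach (FDR estimated as decoy hits divided by target hits above a score cutoff). The function name, the tuple layout and the thresholds are hypothetical and are not taken from the CDAP.

```python
def filter_psms_by_fdr(psms, max_fdr=0.01):
    """Return the target PSMs accepted at an estimated FDR <= max_fdr.

    psms: iterable of (score, is_decoy) tuples, where a higher score is better.
    Assumption: FDR is estimated as (#decoys / #targets) among all PSMs
    scoring at or above the cutoff (target-decoy strategy).
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    cutoff = 0  # number of top-ranked PSMs to keep
    for i, (_score, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest rank at which the estimate is still acceptable.
        if targets and decoys / targets <= max_fdr:
            cutoff = i
    return [p for p in ranked[:cutoff] if not p[1]]


# Toy input: three target hits and two decoy hits.
example = [(10, False), (9, False), (8, True), (7, False), (6, True)]
print(filter_psms_by_fdr(example, max_fdr=0.5))
```

Taking the deepest acceptable rank, rather than stopping at the first violation, lets the list extend past an early decoy when later target hits bring the estimate back under the threshold.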
Rudnick, Paul A; Markey, Sanford P; Roth, Jeri; Mirokhin, Yuri; Yan, Xinjian; Tchekhovskoi, Dmitrii V; Edwards, Nathan J; Thangudu, Ratna R; Ketchum, Karen A; Kinsinger, Christopher R; Mesri, Mehdi; Rodriguez, Henry; Stein, Stephen E
2016-03-04
The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has produced large proteomics data sets from the mass spectrometric interrogation of tumor samples previously analyzed by The Cancer Genome Atlas (TCGA) program. The availability of the genomic and proteomic data is enabling proteogenomic study for both reference (i.e., contained in major sequence databases) and nonreference markers of cancer. The CPTAC laboratories have focused on colon, breast, and ovarian tissues in the first round of analyses; spectra from these data sets were produced from 2D liquid chromatography-tandem mass spectrometry analyses and represent deep coverage. To reduce the variability introduced by disparate data analysis platforms (e.g., software packages, versions, parameters, sequence databases, etc.), the CPTAC Common Data Analysis Platform (CDAP) was created. The CDAP produces both peptide-spectrum-match (PSM) reports and gene-level reports. The pipeline processes raw mass spectrometry data according to the following: (1) peak-picking and quantitative data extraction, (2) database searching, (3) gene-based protein parsimony, and (4) false-discovery rate-based filtering. The pipeline also produces localization scores for the phosphopeptide enrichment studies using the PhosphoRS program. Quantitative information for each of the data sets is specific to the sample processing, with PSM and protein reports containing the spectrum-level or gene-level ("rolled-up") precursor peak areas and spectral counts for label-free or reporter ion log-ratios for 4plex iTRAQ. The reports are available in simple tab-delimited formats and, for the PSM-reports, in mzIdentML. The goal of the CDAP is to provide standard, uniform reports for all of the CPTAC data to enable comparisons between different samples and cancer types as well as across the major omics fields.
Hierarchical Control of Semi-Autonomous Teams Under Uncertainty (HICST)
2004-05-01
... 2.4 Module 4: Database ... Figure 3: Integration of modules 1-5. The modules make provision for human intervention, not indicated in the figure. SoW is ‘state of the world’. ... 3. Task execution; 4. Database for state estimation; 5. Java interface to OEP; 6. Robust dynamic programming for
Suk, Hyung; Knipe, David M
2015-06-01
The herpes simplex virus 1 virion protein 16 (VP16) tegument protein forms a transactivation complex with the cellular proteins host cell factor 1 (HCF-1) and octamer-binding transcription factor 1 (Oct-1) upon entry into the host cell. VP16 has also been shown to interact with a number of virion tegument proteins and viral glycoprotein H to promote viral assembly, but no comprehensive study of the VP16 proteome has been performed at early times postinfection. We therefore performed a proteomic analysis of VP16-interacting proteins at 3 h postinfection. We confirmed the interaction of VP16 with HCF-1 and a large number of cellular Mediator complex proteins, but most surprisingly, we found that the major viral protein associating with VP16 is the infected cell protein 4 (ICP4) immediate-early (IE) transactivator protein. These results raise the potential for a new function for VP16 in associating with the IE ICP4 and playing a role in transactivation of early and late gene expression, in addition to its well-documented function in transactivation of IE gene expression. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Proteomic analysis of oil bodies in mature Jatropha curcas seeds with different lipid content.
Liu, Hui; Wang, Cuiping; Chen, Fan; Shen, Shihua
2015-01-15
To reveal the differences among three mature Jatropha curcas seed types (JcVH, variant with high lipid content; JcW, wild type and JcVL, variant with low lipid content) with different lipid content, comparative proteomics was employed to profile the changes in oil body (OB) associated protein species using a gel-based proteomic technique. Eighty-three protein species were successfully identified through LTQ-ES-MS/MS from OBs purified from mature JcW seeds. Two-dimensional electrophoresis analysis of J. curcas OB associated protein species revealed they had essential interactions with other organelles and demonstrated that oleosin and caleosin were the most abundant OB structural protein species. Twenty-eight OB associated protein species showed significant differences among JcVH, JcW and JcVL according to statistical analysis. Complementary transient expression analysis revealed that calcium ion binding protein (CalBP) and glycine-rich RNA binding protein (GRP) were well targeted in OBs apart from the oleosins. This study demonstrated that the ratio of lipid content to caleosin abundance was involved in the regulation of OB size, and the mutant induced by ethylmethylsulfone treatment might be related to the caleosin-like protein species. These findings are important for biotechnological improvement with the aim to alter the lipid content in J. curcas seeds. The economic value of Jatropha curcas largely depends on the lipid content in seeds, which is mainly stored in the special organelles called oil bodies (OBs). In consideration of the biological importance and applications of J. curcas OBs in seeds, it is necessary to further explore the components and functions of J. curcas OBs. Although a previous study concerning the J. curcas OB proteome revealed oleosins were the major OB protein component and additional protein species were similar to those in other oil seed plants, these identified OB associated protein species corresponded to protein bands instead of protein spots in the electrophoresis gels. Furthermore, the interactions of OB associated protein species and their contribution to OB formation and stabilization remain unexplored. In this study, with the overall objective of profiling OB protein species from mature J. curcas seeds with different lipid content, we provided a setting of comparative OB proteomics with biochemical data and transient expression to explore the core of OB associated protein species involved in the regulation of OB size and lipid accumulation. The results were important for biotechnological improvement with the aim of a global modification of lipid storage in J. curcas seeds. Meanwhile, this study gave insight into possible associations between OBs and other organelles in mature J. curcas seeds. It may represent new aspects of the biological functions of the OBs during oil mobilization. Combined with the technique of transient transformation, a newly reported protein species, glycine-rich RNA binding protein (GRP), was successfully targeted in OBs. Therefore, further molecular analysis of these protein species is warranted to verify this association and what role they have in OBs. Copyright © 2014 Elsevier B.V. All rights reserved.
Morisawa, Hiraku; Hirota, Mikako; Toda, Tosifusa
2006-01-01
Background In the post-genome era, most research scientists working in the field of proteomics are confronted with difficulties in management of large volumes of data, which they are required to keep in formats suitable for subsequent data mining. Therefore, a well-developed open source laboratory information management system (LIMS) should be available for their proteomics research studies. Results We developed an open source LIMS appropriately customized for 2-D gel electrophoresis-based proteomics workflow. The main features of its design are compactness, flexibility and connectivity to public databases. It supports the handling of data imported from mass spectrometry software and 2-D gel image analysis software. The LIMS is equipped with the same input interface for 2-D gel information as a clickable map on public 2DPAGE databases. The LIMS allows researchers to follow their own experimental procedures by reviewing the illustrations of 2-D gel maps and well layouts on the digestion plates and MS sample plates. Conclusion Our new open source LIMS is now available as a basic model for proteome informatics, and is accessible for further improvement. We hope that many research scientists working in the field of proteomics will evaluate our LIMS and suggest ways in which it can be improved. PMID:17018156
Systematic Errors in Peptide and Protein Identification and Quantification by Modified Peptides
Bogdanow, Boris; Zauber, Henrik; Selbach, Matthias
2016-01-01
The principle of shotgun proteomics is to use peptide mass spectra in order to identify corresponding sequences in a protein database. The quality of peptide and protein identification and quantification critically depends on the sensitivity and specificity of this assignment process. Many peptides in proteomic samples carry biochemical modifications, and a large fraction of unassigned spectra arise from modified peptides. Spectra derived from modified peptides can erroneously be assigned to wrong amino acid sequences. However, the impact of this problem on proteomic data has not yet been investigated systematically. Here we use combinations of different database searches to show that modified peptides can be responsible for 20–50% of false positive identifications in deep proteomic data sets. These false positive hits are particularly problematic as they have significantly higher scores and higher intensities than other false positive matches. Furthermore, these wrong peptide assignments lead to hundreds of false protein identifications and systematic biases in protein quantification. We devise a “cleaned search” strategy to address this problem and show that this considerably improves the sensitivity and specificity of proteomic data. In summary, we show that modified peptides cause systematic errors in peptide and protein identification and quantification and should therefore be considered to further improve the quality of proteomic data annotation. PMID:27215553
Lindsey, Merry L; Mayr, Manuel; Gomes, Aldrin V; Delles, Christian; Arrell, D Kent; Murphy, Anne M; Lange, Richard A; Costello, Catherine E; Jin, Yu-Fang; Laskowitz, Daniel T; Sam, Flora; Terzic, Andre; Van Eyk, Jennifer; Srinivas, Pothur R
2015-09-01
The year 2014 marked the 20th anniversary of the coining of the term proteomics. The purpose of this scientific statement is to summarize advances over this period that have catalyzed our capacity to address the experimental, translational, and clinical implications of proteomics as applied to cardiovascular health and disease and to evaluate the current status of the field. Key successes that have energized the field are delineated; opportunities for proteomics to drive basic science research, facilitate clinical translation, and establish diagnostic and therapeutic healthcare algorithms are discussed; and challenges that remain to be solved before proteomic technologies can be readily translated from scientific discoveries to meaningful advances in cardiovascular care are addressed. Proteomics is the result of disruptive technologies, namely, mass spectrometry and database searching, which drove protein analysis from 1 protein at a time to protein mixture analyses that enable large-scale analysis of proteins and facilitate paradigm shifts in biological concepts that address important clinical questions. Over the past 20 years, the field of proteomics has matured, yet it is still developing rapidly. The scope of this statement will extend beyond the reaches of a typical review article and offer guidance on the use of next-generation proteomics for future scientific discovery in the basic research laboratory and clinical settings. © 2015 American Heart Association, Inc.
Differential proteome profiling in the hippocampus of amnesic mice.
Baghel, Meghraj Singh; Thakur, Mahendra Kumar
2017-08-01
Amnesia or memory loss is associated with brain aging and several neurodegenerative pathologies including Alzheimer's disease (AD). It can be induced by the cholinergic antagonist scopolamine, but the underlying molecular mechanism is poorly understood. Proteome profiling in the hippocampus could provide conceptual insights into the molecular mechanisms involved in amnesia. To this end, mice were administered scopolamine to induce amnesia, and memory impairment was validated by the novel object recognition test. Using two-dimensional gel electrophoresis coupled with MALDI-MS/MS, we analyzed the hippocampal proteome and identified 18 proteins which were differentially expressed. Out of these proteins, 11 were downregulated and 7 were upregulated in scopolamine-treated mice as compared to control. In silico analysis showed that the majority of identified proteins are involved in metabolism, catalytic activity, and cytoskeleton architectural functions. STRING interaction network analysis revealed that the majority of identified proteins exhibit a common association with the Actg1 cytoskeleton protein and the Vdac1 energy transporter protein. Furthermore, interaction map analysis showed that Fascin1 and Coronin 1b individually interact with Actg1 and regulate the actin filament dynamics. Vdac1 was significantly downregulated in amnesic mice and showed interaction with other proteins in the interaction network. Therefore, we silenced Vdac1 in the hippocampus of normal young mice and found similar impairment in recognition memory in Vdac1-silenced and scopolamine-treated mice. Thus, these findings suggest that Vdac1-mediated disruption of energy metabolism and cytoskeleton architecture might be involved in scopolamine-induced amnesia. © 2017 Wiley Periodicals, Inc.
CyanOmics: an integrated database of omics for the model cyanobacterium Synechococcus sp. PCC 7002.
Yang, Yaohua; Feng, Jie; Li, Tao; Ge, Feng; Zhao, Jindong
2015-01-01
Cyanobacteria are an important group of organisms that carry out oxygenic photosynthesis and play vital roles in both the carbon and nitrogen cycles of the Earth. The annotated genome of Synechococcus sp. PCC 7002, an ideal model cyanobacterium, is available. A series of transcriptomic and proteomic studies of Synechococcus sp. PCC 7002 cells grown under different conditions have been reported. However, no database of such integrated omics studies has been constructed. Here we present CyanOmics, a database based on the results of Synechococcus sp. PCC 7002 omics studies. CyanOmics comprises one genomic dataset, 29 transcriptomic datasets and one proteomic dataset and should prove useful for systematic and comprehensive analysis of all those data. Powerful browsing and searching tools are integrated to help users directly access information of interest with enhanced visualization of the analytical results. Furthermore, BLAST is included for sequence-based similarity searching, and Cluster 3.0 and the R hclust function are provided for cluster analyses, to increase CyanOmics's usefulness. To the best of our knowledge, it is the first integrated omics analysis database for cyanobacteria. This database should further understanding of the transcriptional patterns and proteomic profiling of Synechococcus sp. PCC 7002 and other cyanobacteria. Additionally, the entire database framework is applicable to any sequenced prokaryotic genome and could be applied to other integrated omics analysis projects. Database URL: http://lag.ihb.ac.cn/cyanomics. © The Author(s) 2015. Published by Oxford University Press.
Gioutlakis, Aris; Klapa, Maria I.
2017-01-01
It has been acknowledged that source databases recording experimentally supported human protein-protein interactions (PPIs) exhibit limited overlap. Thus, the reconstruction of a comprehensive PPI network requires appropriate integration of multiple heterogeneous primary datasets, presenting the PPIs at various genetic reference levels. Existing PPI meta-databases perform integration via normalization; namely, PPIs are merged after being converted to a certain target level. Hence, the node set of the integrated network depends each time on the number and type of the combined datasets. Moreover, the irreversible a priori normalization process hinders the identification of normalization artifacts in the integrated network, which originate from the nonlinearity characterizing the genetic information flow. PICKLE (Protein InteraCtion KnowLedgebasE) 2.0 implements a new architecture for this recently introduced human PPI meta-database. Its main novel feature over the existing meta-databases is its approach to primary PPI dataset integration via genetic information ontology. Building upon the PICKLE principles of using the reviewed human complete proteome (RHCP) of UniProtKB/Swiss-Prot as the reference protein interactor set, and filtering out protein interactions with low probability of being direct based on the available evidence, PICKLE 2.0 first assembles the RHCP genetic information ontology network by connecting the corresponding genes, nucleotide sequences (mRNAs) and proteins (UniProt entries) and then integrates PPI datasets by superimposing them on the ontology network without any a priori transformations. Importantly, this process allows the resulting heterogeneous integrated network to be reversibly normalized to any level of genetic reference without loss of the original information, the latter being used for identification of normalization biases, and enables the appraisal of potential false positive interactions through PPI source database cross-checking.
The PICKLE web-based interface (www.pickle.gr) allows for the simultaneous query of multiple entities and provides integrated human PPI networks at either the protein (UniProt) or the gene level, at three PPI filtering modes. PMID:29023571
mpMoRFsDB: a database of molecular recognition features in membrane proteins.
Gypas, Foivos; Tsaousis, Georgios N; Hamodrakas, Stavros J
2013-10-01
Molecular recognition features (MoRFs) are small, intrinsically disordered regions in proteins that undergo a disorder-to-order transition on binding to their partners. MoRFs are involved in protein-protein interactions and may function as the initial step in molecular recognition. The aim of this work was to collect, organize and store all membrane proteins that contain MoRFs. Membrane proteins constitute ∼30% of fully sequenced proteomes and are responsible for a wide variety of cellular functions. MoRFs were classified according to their secondary structure after interacting with their partners. We identified MoRFs in transmembrane and peripheral membrane proteins. The position of transmembrane protein MoRFs was determined in relation to a protein's topology. All information was stored in a publicly available MySQL database with a user-friendly web interface. A Jmol applet is integrated for visualization of the structures. mpMoRFsDB provides valuable information related to disorder-based protein-protein interactions in membrane proteins. http://bioinformatics.biol.uoa.gr/mpMoRFsDB
Educational websites--Bioinformatics Tools II.
Lomberk, Gwen
2009-01-01
In this issue, the highlighted websites are a continuation of a series of educational websites; this one in particular from a couple of years ago, Bioinformatics Tools [Pancreatology 2005;5:314-315]. These include sites that are valuable resources for many research needs in genomics and proteomics. Bioinformatics has become a laboratory tool to map sequences to databases, develop models of molecular interactions, evaluate structural compatibilities, describe differences between normal and disease-associated DNA, identify conserved motifs within proteins, and chart extensive signaling networks, all in silico. Copyright 2008 S. Karger AG, Basel and IAP.
Wang, Da-Zhi; Gao, Yue; Lin, Lin; Hong, Hua-Sheng
2013-01-01
Alexandrium is a neurotoxin-producing dinoflagellate genus resulting in paralytic shellfish poisonings around the world. However, little is known about the toxin biosynthesis mechanism in Alexandrium. This study compared protein profiles of A. catenella collected at different toxin biosynthesis stages (non-toxin synthesis, initial toxin synthesis and toxin synthesizing) coupled with the cell cycle, and identified differentially expressed proteins using 2-DE and MALDI-TOF-TOF mass spectrometry. The results showed that toxin biosynthesis of A. catenella occurred within a defined time frame in the G1 phase of the cell cycle. Proteomic analysis indicated that 102 protein spots altered significantly in abundance (P < 0.05), and 53 proteins were identified using database searching. These proteins were involved in a variety of biological processes, i.e., protein modification and biosynthesis, metabolism, cell division, oxidative stress, transport, signal transduction, and translation. Among them, nine proteins with known functions in paralytic shellfish toxin-producing cyanobacteria, i.e., methionine S-adenosyltransferase, chloroplast ferredoxin-NADP+ reductase, S-adenosylhomocysteinase, adenosylhomocysteinase, ornithine carbamoyltransferase, inorganic pyrophosphatase, sulfotransferase (similar to), alcohol dehydrogenase and arginine deiminase, varied significantly at different toxin biosynthesis stages and formed an interaction network, indicating that they might be involved in toxin biosynthesis in A. catenella. This study is the first step in the dissection of the behavior of the A. catenella proteome during different toxin biosynthesis stages and provides new insights into toxin biosynthesis in dinoflagellates. PMID:23340676
Bini, Andressa Peres; Regiani, Thais; Franceschini, Lívia Maria; Budzinski, Ilara Gabriela Frasson; Marques, Felipe Garbelini; Labate, Mônica Teresa Veneziano; Guidetti-Gonzalez, Simone; Moon, David Henry; Labate, Carlos Alberto
2016-01-01
Puccinia psidii sensu lato (s.l.) is the causal agent of eucalyptus and guava rust, but it also attacks a wide range of plant species from the myrtle family, resulting in a significant genetic and physiological variability among populations accessed from different hosts. The uredospores are crucial to P. psidii dissemination in the field. Although they are important for the fungal pathogenesis, their molecular characterization has been poorly studied. In this work, we report the first in-depth proteomic analysis of P. psidii s.l. uredospores from two contrasting populations: guava fruits (PpGuava) and eucalyptus leaves (PpEucalyptus). NanoUPLC-MSE was used to generate peptide spectra that were matched to the UniProt Puccinia genera sequences (UniProt database), resulting in the first proteomic analysis of the phytopathogenic fungus P. psidii. Three hundred and forty proteins were detected and quantified using label-free proteomics. A significant number of unique proteins were found for each sample; others were significantly more or less abundant, according to the fungal populations. In the PpGuava population, many proteins correlated with fungal virulence, such as malate dehydrogenase, proteasome subunits, enolases and others, were increased. On the other hand, PpEucalyptus proteins involved in biogenesis, protein folding and translocation were increased, supporting the physiological variability of the fungal populations according to their protein reservoirs and specific host interaction strategies. PMID:26731728
Quecine, Maria Carolina; Leite, Thiago Falda; Bini, Andressa Peres; Regiani, Thais; Franceschini, Lívia Maria; Budzinski, Ilara Gabriela Frasson; Marques, Felipe Garbelini; Labate, Mônica Teresa Veneziano; Guidetti-Gonzalez, Simone; Moon, David Henry; Labate, Carlos Alberto
2016-01-01
Puccinia psidii sensu lato (s.l.) is the causal agent of eucalyptus and guava rust, but it also attacks a wide range of plant species from the myrtle family, resulting in a significant genetic and physiological variability among populations accessed from different hosts. The uredospores are crucial to P. psidii dissemination in the field. Although they are important for the fungal pathogenesis, their molecular characterization has been poorly studied. In this work, we report the first in-depth proteomic analysis of P. psidii s.l. uredospores from two contrasting populations: guava fruits (PpGuava) and eucalyptus leaves (PpEucalyptus). NanoUPLC-MSE was used to generate peptide spectra that were matched to the UniProt Puccinia genera sequences (UniProt database), resulting in the first proteomic analysis of the phytopathogenic fungus P. psidii. Three hundred and forty proteins were detected and quantified using label-free proteomics. A significant number of unique proteins were found for each sample; others were significantly more or less abundant, according to the fungal populations. In the PpGuava population, many proteins correlated with fungal virulence, such as malate dehydrogenase, proteasome subunits, enolases and others, were increased. On the other hand, PpEucalyptus proteins involved in biogenesis, protein folding and translocation were increased, supporting the physiological variability of the fungal populations according to their protein reservoirs and specific host interaction strategies.
Investigation of the stallion sperm proteome by mass spectrometry.
Swegen, Aleona; Curry, Benjamin J; Gibb, Zamira; Lambourne, Sarah R; Smith, Nathan D; Aitken, R John
2015-03-01
Stallion spermatozoa continue to present scientific and clinical challenges with regard to the biological mechanisms responsible for their survival and function. In particular, deeper understanding of sperm energy metabolism, defence against oxidative damage and cell-cell interactions should improve fertility assessment and the application of advanced reproductive technologies in the equine species. In this study, we used highly sensitive LC-MS/MS technology and sequence database analysis to identify and characterise the proteome of Percoll-isolated ejaculated equine spermatozoa, with the aim of furthering our understanding of this cell's complex biological machinery. We were able to identify 9883 peptides comprising 1030 proteins, which were subsequently attributed to 975 gene products. Gene ontology analysis for molecular and cellular processes revealed new information about the metabolism, antioxidant defences and receptors of stallion spermatozoa. Mitochondrial proteins and those involved in catabolic processes constituted dominant categories. Several enzymes specific to β-oxidation of fatty acids were identified, and further experiments were carried out to ascertain their functional significance. Inhibition of carnitine palmitoyl transferase 1, a rate-limiting enzyme of β-oxidation, reduced motility parameters, indicating that β-oxidation contributes to maintenance of motility in stallion spermatozoa. © 2015 Society for Reproduction and Fertility.
Wang, Hualin; Sit, Wat-Hung; Tipoe, George Lim; Wan, Jennifer Man-Fan
2014-12-01
Extra virgin olive oil (EVOO) presents benefits against chronic liver injury induced by hepatotoxins such as carbon tetrachloride (CCl4); however, the protective mechanisms remain unclear. In the present study, a two-dimensional gel-based proteomic approach was used to explore the mechanisms. Rats were injected with CCl4 twice a week for 4 weeks to induce liver fibrosis, and were fed laboratory chow plus 20% (w/w) of either corn oil or EVOO over the entire experimental period. Histological staining, MDA assay and fibrogenesis marker gene analysis illustrated that the CCl4-treated animals fed EVOO had lower fibrosis and lipid peroxidation levels in the liver than the corn oil-fed group. The proteomic study indicated that the protein expression of thioredoxin domain-containing protein 12, peroxiredoxin-1, thiosulphate sulphurtransferase, calcium-binding protein 1, Annexin A2 and heat shock cognate 71 kDa protein was higher in livers from EVOO-fed rats with the CCl4 treatment compared with those from rats fed with corn oil, whereas the expression of COQ9, cAMP-dependent protein kinase type I-alpha regulatory subunit, phenylalanine hydroxylase and glycerate kinase was lower. Our findings confirmed the benefits of EVOO against chronic liver injury, which may be attributable to the antioxidant effects, hepatocellular function regulation and hepatic metabolism modification effects of EVOO. Copyright © 2014 Elsevier Ltd. All rights reserved.
Zhang, Yixiang; Gao, Peng; Xing, Zhuo; Jin, Shumei; Chen, Zhide; Liu, Lantao; Constantino, Nasie; Wang, Xinwang; Shi, Weibing; Yuan, Joshua S.; Dai, Susie Y.
2013-01-01
High-abundance proteins such as ribulose-1,5-bisphosphate carboxylase oxygenase (Rubisco) pose a persistent challenge for whole-proteome characterization using shotgun proteomics. To address this challenge, we developed and evaluated Polyethyleneimine Assisted Rubisco Cleanup (PARC), a new method that combines abundant protein removal with fractionation. The new approach was applied to a plant-insect interaction study to validate the platform and investigate mechanisms of plant defense against herbivorous insects. Our results indicated that PARC can effectively remove Rubisco, improve protein identification, and discover almost three times more differentially regulated proteins. The significantly enhanced shotgun proteomics performance translated into in-depth proteomic and molecular insight into plant-insect interaction, in which carbon redistribution appeared to play an essential role. Moreover, transcriptomic validation confirmed the reliability of the PARC analysis. Finally, functional studies were carried out for two differentially regulated genes revealed by the PARC analysis. Insect resistance was induced by overexpressing either jacalin-like or cupin-like genes in rice. The results further highlighted that PARC can serve as an effective strategy for proteomics analysis and gene discovery. PMID:23943779
Time Series Proteome Profiling
Formolo, Catherine A.; Mintz, Michelle; Takanohashi, Asako; Brown, Kristy J.; Vanderver, Adeline; Halligan, Brian; Hathout, Yetrib
2014-01-01
This chapter provides a detailed description of a method used to study temporal changes in the endoplasmic reticulum (ER) proteome of fibroblast cells exposed to ER stress agents (tunicamycin and thapsigargin). Differential stable isotope labeling by amino acids in cell culture (SILAC) is used in combination with crude ER fractionation, SDS–PAGE and LC-MS/MS to define altered protein expression in tunicamycin- or thapsigargin-treated cells versus untreated cells. Treated and untreated cells are harvested at different time points, mixed at a 1:1 ratio and processed for ER fractionation. Samples containing labeled and unlabeled proteins are separated by SDS–PAGE, bands are digested with trypsin and the resulting peptides are analyzed by LC-MS/MS. Proteins are identified using Bioworks software and the Swiss-Prot database, whereas ratios of protein expression between treated and untreated cells are quantified using ZoomQuant software. Data visualization is facilitated by GeneSpring software. PMID:21082445
A knowledge-based T2-statistic to perform pathway analysis for quantitative proteomic data
Chen, Yi-Hau
2017-01-01
Approaches to identify significant pathways from high-throughput quantitative data have been developed in recent years. Still, the analysis of proteomic data remains difficult because of limited sample size. This limitation also leads to the common practice of using a competitive null, which fundamentally treats genes or proteins as independent units. The independence assumption ignores the associations among biomolecules with similar functions or cellular localization, as well as the interactions among them manifested as changes in expression ratios. Consequently, these methods often underestimate the associations among biomolecules and cause false positives in practice. Some studies incorporate the sample covariance matrix into the calculation to address this issue. However, the sample covariance may not be a precise estimate if the sample size is very limited, which is usually the case for data produced by mass spectrometry. In this study, we introduce a multivariate test under a self-contained null to perform pathway analysis for quantitative proteomic data. The covariance matrix used in the test statistic is constructed from the confidence scores retrieved from the STRING database or the HitPredict database. We also design an integrating procedure to retain pathways of sufficient evidence as a pathway group. The performance of the proposed T2-statistic is demonstrated using five published experimental datasets: the T-cell activation, the cAMP/PKA signaling, the myoblast differentiation, and the effect of dasatinib on the BCR-ABL pathway are proteomic datasets produced by mass spectrometry; and the protective effect of myocilin via the MAPK signaling pathway is a gene expression dataset of limited sample size. Compared with other popular statistics, the proposed T2-statistic yields more accurate descriptions in agreement with the discussion of the original publication. We implemented the T2-statistic in an R package, T2GA, which is available at https://github.com/roqe/T2GA. PMID:28622336
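The idea of a multivariate test whose covariance comes from prior knowledge rather than from the samples can be illustrated with a minimal Python sketch. It is not the T2GA implementation: it assumes a Hotelling-style quadratic form n · x̄ᵀ Σ⁻¹ x̄ tested against a self-contained null of zero mean log-ratio, unit variances, and interaction confidence scores in [0, 1] used directly as correlations. The names `solve_linear` and `knowledge_based_t2` are hypothetical.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - known) / m[row][row]
    return x


def knowledge_based_t2(log_ratios, confidence):
    """Quadratic-form statistic for one pathway.

    log_ratios: one list of protein log-ratios per sample.
    confidence: symmetric matrix of pairwise confidence scores in [0, 1],
                used here as correlations (an assumption of this sketch).
    """
    n = len(log_ratios)
    p = len(log_ratios[0])
    mean = [sum(sample[j] for sample in log_ratios) / n for j in range(p)]
    # Unit variances on the diagonal, knowledge-based scores off the diagonal.
    cov = [[1.0 if i == j else float(confidence[i][j]) for j in range(p)]
           for i in range(p)]
    weighted = solve_linear(cov, mean)  # Sigma^{-1} @ mean
    return n * sum(mean[j] * weighted[j] for j in range(p))
```

Because the covariance does not depend on the sample covariance, the statistic stays defined even with a single replicate, which is the situation the abstract describes for mass spectrometry data. Positive scores between co-regulated proteins lower the statistic relative to the independence case, since correlated proteins moving together count as less evidence than independent ones.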
A knowledge-based T2-statistic to perform pathway analysis for quantitative proteomic data.
Lai, En-Yu; Chen, Yi-Hau; Wu, Kun-Pin
2017-06-01
Approaches to identify significant pathways from high-throughput quantitative data have been developed in recent years. Still, the analysis of proteomic data remains difficult because of limited sample size. This limitation also leads to the common practice of using a competitive null, which fundamentally treats genes or proteins as independent units. The independence assumption ignores the associations among biomolecules with similar functions or cellular localization, as well as the interactions among them manifested as changes in expression ratios. Consequently, these methods often underestimate the associations among biomolecules and cause false positives in practice. Some studies incorporate the sample covariance matrix into the calculation to address this issue. However, the sample covariance may not be a precise estimate if the sample size is very limited, which is usually the case for data produced by mass spectrometry. In this study, we introduce a multivariate test under a self-contained null to perform pathway analysis for quantitative proteomic data. The covariance matrix used in the test statistic is constructed from the confidence scores retrieved from the STRING database or the HitPredict database. We also design an integrating procedure to retain pathways of sufficient evidence as a pathway group. The performance of the proposed T2-statistic is demonstrated using five published experimental datasets: the T-cell activation, the cAMP/PKA signaling, the myoblast differentiation, and the effect of dasatinib on the BCR-ABL pathway are proteomic datasets produced by mass spectrometry; and the protective effect of myocilin via the MAPK signaling pathway is a gene expression dataset of limited sample size. Compared with other popular statistics, the proposed T2-statistic yields more accurate descriptions in agreement with the discussion of the original publication. We implemented the T2-statistic in an R package, T2GA, which is available at https://github.com/roqe/T2GA.
Mao, Song; Chai, Xiaoqiang; Hu, Yuling; Hou, Xugang; Tang, Yiheng; Bi, Cheng; Li, Xiao
2014-01-01
The mitochondrion plays a central role in diverse biological processes in most eukaryotes, and its dysfunctions are critically involved in a large number of diseases and the aging process. A systematic identification of mitochondrial proteomes and characterization of functional linkages among mitochondrial proteins are fundamental to understanding the mechanisms underlying biological functions and human diseases associated with mitochondria. Here we present MitProNet, a database that provides a comprehensive knowledgebase for the mitochondrial proteome, interactome and human diseases. First, an inventory of mammalian mitochondrial proteins was compiled by widely collecting proteomic datasets, and the proteins were classified by machine learning to obtain a high-confidence list of mitochondrial proteins. The current version of MitProNet covers 1124 high-confidence proteins, and the remainder were further classified as middle- or low-confidence. An organelle-specific network of functional linkages among mitochondrial proteins was then generated by integrating genomic features encoded by a wide range of datasets including genomic context, gene expression profiles, protein-protein interactions, functional similarity and metabolic pathways. The functional-linkage network should be a valuable resource for the study of biological functions of mitochondrial proteins and human mitochondrial diseases. Furthermore, we utilized the network to predict candidate genes for mitochondrial diseases using prioritization algorithms. All proteins, functional linkages and disease candidate genes in MitProNet were annotated according to the information collected from their original sources including GO, GEO, OMIM, KEGG, MIPS, HPRD and so on. MitProNet features a user-friendly graphic visualization interface to present functional analysis of linkage networks. As an up-to-date database and analysis platform, MitProNet should be particularly helpful in comprehensive studies of complicated biological mechanisms underlying mitochondrial functions and human mitochondrial diseases. MitProNet is freely accessible at http://bio.scu.edu.cn:8085/MitProNet. PMID:25347823
Yang, Xia; Zhang, Zichang; Gu, Tao; Dong, Mingchao; Peng, Qiong; Bai, Lianyang; Li, Yongfeng
2017-01-06
Barnyardgrass (Echinochloa crus-galli) is one of the top 15 herbicide-resistant weeds around the world and interferes with rice growth, resulting in major losses of rice yield. Multi-herbicide resistance in barnyardgrass thus presents a major threat, and the underlying mechanisms that contribute to resistance require elucidation. In an attempt to characterize this multi-herbicide resistance at the proteomic level, comparative analysis of resistant and susceptible barnyardgrasses was performed using iTRAQ, both with and without quinclorac, bispyribac-sodium and penoxsulam herbicidal treatment. A total of 1342 protein species were identified from 2248 unique peptides by searching the UniProt database and conducting data analysis. Approximately 904 protein species with 4774 Gene Ontology (GO) terms were grouped into the categories of biological process, cellular component and molecular function. Among these, 688 protein species were annotated into 1583 KEGG pathways, with 980 protein species relating to metabolism and 93 relating to environmental information processing. A total of 292 protein species showed more than a 1.2-fold change in abundance in the resistant biotype relative to the susceptible biotype. Furthermore, herbicide treatment resulted in 157 protein species that showed more than a 1.2-fold change in the resistant biotype. Moreover, physiological analyses demonstrated an ecological fitness cost in the resistant biotype. While some studies have shown a fitness cost to be associated with an altered ecological interaction, our understanding of the fitness costs associated with herbicide resistance is limited. Herein, physiological and proteomic analyses demonstrate an ecological fitness cost associated with herbicide resistance and potential mechanisms of herbicide resistance in resistant biotypes of E. crus-galli. The results presented herein reveal differences in ecological adaptation between resistant and susceptible biotypes of E. crus-galli and provide a fundamental basis for the development of new strategies for weed control. Lastly, this is the first large-scale proteomics study to examine herbicide stress responses in different barnyardgrass biotypes. Copyright © 2016 Elsevier B.V. All rights reserved.
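The 1.2-fold change criterion mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: abundances are given as plain dictionaries of reporter-ion intensities, the threshold is applied symmetrically (more than 1.2-fold up or down), and the function name `changed_proteins` is hypothetical rather than taken from the study.

```python
def changed_proteins(resistant, susceptible, threshold=1.2):
    """Return {protein: ratio} for proteins exceeding the fold-change cutoff.

    resistant, susceptible: dicts mapping protein id -> abundance.
    A protein counts as changed when resistant/susceptible is above
    `threshold` or below 1/`threshold` (symmetric cutoff, an assumption).
    """
    changed = {}
    for protein, r_value in resistant.items():
        s_value = susceptible.get(protein)
        # Skip proteins that were not quantified in both biotypes.
        if s_value is None or s_value <= 0 or r_value <= 0:
            continue
        ratio = r_value / s_value
        if ratio > threshold or ratio < 1.0 / threshold:
            changed[protein] = ratio
    return changed


# Toy input: A is up, C is down, B is unchanged.
example = changed_proteins(
    {"A": 1.3, "B": 1.0, "C": 0.5},
    {"A": 1.0, "B": 1.0, "C": 1.0},
)
```

A real iTRAQ analysis would normally combine such a cutoff with replicate-based significance testing; the sketch shows only the ratio filter itself.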
Verheggen, Kenneth; Raeder, Helge; Berven, Frode S; Martens, Lennart; Barsnes, Harald; Vaudel, Marc
2017-09-13
Sequence database search engines are bioinformatics algorithms that identify peptides from tandem mass spectra using a reference protein sequence database. Two decades of development, notably driven by advances in mass spectrometry, have provided scientists with more than 30 published search engines, each with its own properties. In this review, we present the common paradigm behind the different implementations, and its limitations for modern mass spectrometry datasets. We also detail how the search engines attempt to alleviate these limitations, and provide an overview of the different software frameworks available to the researcher. Finally, we highlight alternative approaches for the identification of proteomic mass spectrometry datasets, either as a replacement for, or as a complement to, sequence database search engines. © 2017 Wiley Periodicals, Inc.
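The common paradigm referred to above (digest the reference proteins in silico, then retain peptides whose mass matches the measured precursor) can be sketched as follows. This is a simplified illustration, not any published search engine: it covers only candidate selection by precursor mass, omits fragment-spectrum scoring, modifications and missed cleavages, and the function names are hypothetical. The residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_digest(protein):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and nxt != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(MONO[aa] for aa in peptide) + WATER


def candidates(precursor_mass, database, tol_ppm=10.0):
    """List (protein_id, peptide) pairs within tol_ppm of the precursor mass."""
    hits = []
    for protein_id, sequence in database.items():
        for peptide in tryptic_digest(sequence):
            error_ppm = abs(peptide_mass(peptide) - precursor_mass) / precursor_mass * 1e6
            if error_ppm <= tol_ppm:
                hits.append((protein_id, peptide))
    return hits
```

In an actual engine, each candidate would then be scored against the observed fragment ions, and the best-scoring peptide-spectrum matches would be filtered at a chosen false discovery rate.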
Kanaujiya, Jitendra Kumar; Lochab, Savita; Kapoor, Isha; Pal, Pooja; Datta, Dipak; Bhatt, Madan L B; Sanyal, Sabyasachi; Behre, Gerhard; Trivedi, Arun Kumar
2013-07-01
Nuclear receptor coregulators play an important role in the transcriptional regulation of nuclear receptors. In the present study, we aimed to identify estrogen receptor α (ERα) interacting proteins in Tamoxifen-treated MCF7 cells. Using an in vitro GST-pull down assay with the ERα ligand-binding domain (ERα-LBD) and an MS-based proteomics approach, we identified Profilin1 as a novel ERα interacting protein. Profilin1 contains the I/LXX/L/H/I amino acid signature motif required for corepressor interaction with ERα. We show that these two proteins physically interact with each other both in vitro and in vivo, by GST-pull down and coimmunoprecipitation, respectively. We further show that these two proteins colocalize in the nucleus. Previous studies have reported reduced expression of Profilin1 in breast cancer; here we found that Tamoxifen increases Profilin1 expression in MCF7 cells. Our data demonstrate that overexpression of Profilin1 inhibits ERα-mediated transcriptional activation, as well as expression of its downstream target genes, in ERα-positive MCF7 breast cancer cells. In addition, Profilin1 overexpression in MCF7 cells leads to inhibition of cell proliferation that is apparently due to enhanced apoptosis. In a nutshell, these data indicate that an MS-based proteomics approach identifies a novel ERα interacting protein, Profilin1, that serves as a putative corepressor of ERα functions. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Schilmiller, Anthony L.; Miner, Dennis P.; Larson, Matthew; McDowell, Eric; Gang, David R.; Wilkerson, Curtis; Last, Robert L.
2010-01-01
Shotgun proteomics analysis allows hundreds of proteins to be identified and quantified from a single sample at relatively low cost. Extensive DNA sequence information is a prerequisite for shotgun proteomics, and it is ideal to have sequence for the organism being studied rather than from related species or accessions. While this requirement has limited the set of organisms that are candidates for this approach, next generation sequencing technologies make it feasible to obtain deep DNA sequence coverage from any organism. As part of our studies of specialized (secondary) metabolism in tomato (Solanum lycopersicum) trichomes, 454 sequencing of cDNA was combined with shotgun proteomics analyses to obtain in-depth profiles of genes and proteins expressed in leaf and stem glandular trichomes of 3-week-old plants. The expressed sequence tag and proteomics data sets combined with metabolite analysis led to the discovery and characterization of a sesquiterpene synthase that produces β-caryophyllene and α-humulene from E,E-farnesyl diphosphate in trichomes of leaf but not of stem. This analysis demonstrates the utility of combining high-throughput cDNA sequencing with proteomics experiments in a target tissue. These data can be used for dissection of other biochemical processes in these specialized epidermal cells. PMID:20431087
Rigbolt, Kristoffer T G; Vanselow, Jens T; Blagoev, Blagoy
2011-08-01
Recent technological advances have made it possible to identify and quantify thousands of proteins in a single proteomics experiment. As a result of these developments, the analysis of data has become the bottleneck of proteomics experiments. To provide the proteomics community with a user-friendly platform for comprehensive analysis, inspection and visualization of quantitative proteomics data, we developed the Graphical Proteomics Data Explorer (GProX). The program requires no special bioinformatics training, as all functions of GProX are accessible within its user-friendly graphical interface, which will be intuitive to most users. Basic features facilitate the uncomplicated management and organization of large data sets and complex experimental setups as well as the inspection and graphical plotting of quantitative data. These are complemented by readily available high-level analysis options such as database querying, clustering based on abundance ratios, feature enrichment tests for e.g. GO terms and pathway analysis tools. A number of plotting options for visualization of quantitative proteomics data are available, and most analysis functions in GProX create customizable high-quality graphical displays in both vector and bitmap formats. The generic import requirements allow data originating from essentially all mass spectrometry platforms, quantitation strategies and software to be analyzed in the program. GProX represents a powerful approach to proteomics data analysis, providing proteomics experimenters with a toolbox for bioinformatics analysis of quantitative proteomics data. The program is released as open-source and can be freely downloaded from the project webpage at http://gprox.sourceforge.net.
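A feature enrichment test of the kind listed above (for example, for a GO term within a cluster of regulated proteins) is commonly formulated as a one-sided hypergeometric test. The following Python sketch shows that formulation under the assumption that this is the test of interest; it is not taken from the GProX source, and the function name `enrichment_p_value` is hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, hits):
    """One-sided hypergeometric p-value P(X >= hits).

    population: number of quantified proteins in the data set.
    annotated:  number of those carrying the term (e.g. a GO term).
    selected:   number of proteins in the cluster being tested.
    hits:       number of selected proteins that carry the term.
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    # Sum the probabilities of observing `hits` or more annotated proteins.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(hits, upper + 1)
    )
    return tail / total
```

When many terms are tested against the same cluster, the resulting p-values would usually be corrected for multiple testing, for instance with the Benjamini-Hochberg procedure.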
Rigbolt, Kristoffer T. G.; Vanselow, Jens T.; Blagoev, Blagoy
2011-01-01
Recent technological advances have made it possible to identify and quantify thousands of proteins in a single proteomics experiment. As a result of these developments, the analysis of data has become the bottleneck of proteomics experiments. To provide the proteomics community with a user-friendly platform for comprehensive analysis, inspection and visualization of quantitative proteomics data, we developed the Graphical Proteomics Data Explorer (GProX). The program requires no special bioinformatics training, as all functions of GProX are accessible within its user-friendly graphical interface, which will be intuitive to most users. Basic features facilitate the uncomplicated management and organization of large data sets and complex experimental setups as well as the inspection and graphical plotting of quantitative data. These are complemented by readily available high-level analysis options such as database querying, clustering based on abundance ratios, feature enrichment tests for e.g. GO terms and pathway analysis tools. A number of plotting options for visualization of quantitative proteomics data are available, and most analysis functions in GProX create customizable high-quality graphical displays in both vector and bitmap formats. The generic import requirements allow data originating from essentially all mass spectrometry platforms, quantitation strategies and software to be analyzed in the program. GProX represents a powerful approach to proteomics data analysis, providing proteomics experimenters with a toolbox for bioinformatics analysis of quantitative proteomics data. The program is released as open-source and can be freely downloaded from the project webpage at http://gprox.sourceforge.net. PMID:21602510
In-depth proteomic analysis of banana (Musa spp.) fruit with combinatorial peptide ligand libraries.
Esteve, Clara; D'Amato, Alfonsina; Marina, María Luisa; García, María Concepción; Righetti, Pier Giorgio
2013-01-01
Musa spp. is among the world's leading fruit crops. Although there is strong interest in banana biochemistry in the scientific community, it has focused on metabolite composition; proteins have scarcely been investigated, even though they play an important role in food allergy and stability, are a source of biologically active peptides, and can provide information about nutritional aspects of this fruit. In this work we employed combinatorial peptide ligand libraries after different types of protein extraction to search for very low-abundance proteins in banana. The use of advanced MS techniques and a Musa spp. mRNA database in combination with the Uniprot_viridiplantae database allowed us to identify 1131 proteins. Among this large number of proteins we found several already known allergens, such as Mus a 1, pectinesterase and superoxide dismutase, as well as potentially new allergens. Additionally, several enzymes involved in the degradation of starch granules and strictly correlated with ripening stage were identified. This is the first in-depth exploration of the banana fruit proteome and one of the largest descriptions of the proteome of any vegetable system. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Proteomics and Metabolomics: Two Emerging Areas for Legume Improvement
Ramalingam, Abirami; Kudapa, Himabindu; Pazhamala, Lekha T.; Weckwerth, Wolfram; Varshney, Rajeev K.
2015-01-01
Crop legumes such as chickpea, common bean, cowpea, peanut, pigeonpea and soybean are important sources of nutrition and contribute a significant amount of biological nitrogen fixation (>20 million tons of fixed nitrogen) in agriculture. However, the production of legumes is constrained by abiotic and biotic stresses. It is therefore imperative to understand the molecular mechanisms of plant response to different stresses and to identify key candidate genes regulating tolerance that can be deployed in breeding programs. The information obtained from transcriptomics has facilitated the identification of candidate genes for a given trait of interest and their use in crop breeding programs to improve stress tolerance. However, the mechanisms of stress tolerance are complex owing to the influence of multiple genes and post-transcriptional regulation. Furthermore, stress conditions greatly affect gene expression, which in turn modifies the composition of plant proteomes and metabolomes. Therefore, functional genomics involving various proteomics and metabolomics approaches has become essential for understanding plant stress tolerance. These approaches have also been found useful for unravelling different pathways related to plant and seed development as well as symbiosis. Proteome and metabolome profiling using high-throughput systems has been extensively applied in the model legume species Medicago truncatula and Lotus japonicus, as well as in the model crop legume soybean, to examine stress signaling pathways, cellular and developmental processes and nodule symbiosis. Moreover, the availability of protein reference maps as well as proteomics and metabolomics databases greatly supports research on and understanding of various biological processes in legumes. Protein-protein interaction techniques, particularly the yeast two-hybrid system, have been advantageous for studying symbiosis and stress signaling in legumes. In this review, several studies on proteomics and metabolomics in model and crop legumes are discussed. Additionally, applications of advanced proteomics and metabolomics approaches are included for future use in legume research. The integration of these “omics” approaches will greatly support the identification of accurate biomarkers in legume smart breeding programs. PMID:26734026
The UniProtKB guide to the human proteome
Breuza, Lionel; Poux, Sylvain; Estreicher, Anne; Famiglietti, Maria Livia; Magrane, Michele; Tognolli, Michael; Bridge, Alan; Baratin, Delphine; Redaschi, Nicole
2016-01-01
Advances in high-throughput technologies allow researchers to routinely perform whole genome and proteome analysis. For this purpose, they need high-quality resources providing comprehensive gene and protein sets for their organisms of interest. Using the example of the human proteome, we describe the content of a complete proteome in the UniProt Knowledgebase (UniProtKB). We show how manual expert curation of UniProtKB/Swiss-Prot is complemented by expert-driven automatic annotation to build a comprehensive, high-quality and traceable resource. We also illustrate how the complexity of the human proteome is captured and structured in UniProtKB. Database URL: www.uniprot.org PMID:26896845
The November 1, 2017 issue of Cancer Research is dedicated to a collection of computational resource papers in genomics, proteomics, animal models, imaging, and clinical subjects for non-bioinformaticists looking to incorporate computing tools into their work. Scientists at Pacific Northwest National Laboratory have developed P-MartCancer, an open, web-based interactive software tool that enables statistical analyses of peptide or protein data generated from mass-spectrometry (MS)-based global proteomics experiments.
Ichibangase, Tomoko; Sugawara, Yasuhiro; Yamabe, Akio; Koshiyama, Akiyo; Yoshimura, Akari; Enomoto, Takemi; Imai, Kazuhiro
2012-01-01
Systems biology aims to understand biological phenomena in terms of complex biological and molecular interactions, and thus proteomics plays an important role in elucidating protein networks. However, many proteomic methods suffer from high variability and, as a result, yield little more than lists of altered proteins. Here, we propose a strategy for elucidating cellular protein networks based on an FD-LC-MS/MS proteomic method. The strategy permits reproducible relative quantitation of differences in protein levels between different cell populations and allows for integration of the data with those obtained through other methods. We demonstrate the validity of the approach through a comparison of differential protein expression in normal and conditional superoxide dismutase 1 gene knockout cells, and believe that beginning with an FD-LC-MS/MS proteomic approach will enable researchers to elucidate protein networks more easily and comprehensively. PMID:23029042
Identifying active methane-oxidizers in thawed Arctic permafrost by proteomics
NASA Astrophysics Data System (ADS)
Lau, C. M.; Stackhouse, B. T.; Chourey, K.; Hettich, R. L.; Vishnivetskaya, T. A.; Pfiffner, S. M.; Layton, A. C.; Mykytczuk, N. C.; Whyte, L.; Onstott, T. C.
2012-12-01
The rate of CH4 release from thawing permafrost in the Arctic has been regarded as one of the determining factors of future global climate. It is uncertain how indigenous microorganisms will interact with such changing environmental conditions and hence what their impact will be on the fate of carbon compounds that are sequestered in the cryosol. Numerous studies of pristine surface cryosol (top 5 cm) and microcosm experiments have provided growing evidence of effective methanotrophy. Cryosol samples corresponding to the active layer were collected from a sparsely vegetated, ice-wedge polygon at the McGill Arctic Research Station at Axel Heiberg Island, Nunavut, Canada (N79°24, W90°45) before the onset of annual thaw. Pyrosequencing of the 16S rRNA gene indicated the occurrence of methanotroph-containing bacterial families as minor components (~5%) in pristine cryosol, including Bradyrhizobiaceae, Methylobacteriaceae and Methylocystaceae within alpha-Proteobacteria, and Methylacidiphilaceae within Verrucomicrobia. The potential for methanotrophy is supported by preliminary analysis of metagenome data, which indicated that putative methane monooxygenase gene sequences related to Bradyrhizobium sp. and Pseudonocardia sp. are present. Proteome profiling in general yielded minute traces of proteins, which likely hints at the dormant nature of the soil microbial consortia. The lack of a specific protein database for permafrost posed an additional challenge to protein identification. Only 35 proteins could be identified in the pristine cryosol, of which 60% belonged to Shewanella sp. Most of the identified proteins are known to be involved in energy metabolism or post-translational modification of proteins. Microcosms amended with sodium acetate exhibited a net methane consumption of ~65 ngC-CH4 per gram (fresh weight) of soil over 16 days of aerobic incubation at room temperature. The pH in microcosm materials remained acidic (decreasing from an initial 4.7 to 4.5). Protein extraction and characterization identified ~350 proteins and confirmed enhanced microbial activity and a significant shift in community structure within the microcosms. Although the activity of Shewanella sp. was suppressed by the incubation conditions, other bacteria were activated. This was shown by an at least 3-fold increase in the number of identified proteins, which were primarily players in cellular energy metabolism. Among them, Geobacter sp. and the methane-oxidizers Bradyrhizobium sp., Methylosinus sp. and Methylocystis sp. appear dominant. In order to advance the protein database for better biodiversity and functional identification, we are currently using duo extraction protocols and consolidating metagenome data obtained from the same soil samples. A depth profile (from active to permafrost layer) for methanotrophs is being determined by examining pristine cores, thawed cryosols as well as enrichment cultures. The proteome information from these samples will be presented, which will be complemented by molecular studies.
Rasool, Khawaja Ghulam; Khan, Muhammad Altaf; Aldawood, Abdulrahman Saad; Tufail, Muhammad; Mukhtar, Muhammad; Takeda, Makio
2015-01-01
A state-of-the-art proteomic methodology using Matrix Assisted Laser Desorption/Ionization-Time of Flight (MALDI-TOF) has been employed to characterize peptides modulated in the date palm stem subsequent to infestation with red palm weevil (RPW). Our analyses revealed 32 differentially expressed peptides associated with RPW infestation in date palm stem. To identify RPW infestation associated peptides (I), artificially wounded plants (W) were used as an additional control besides uninfested plants, a conventional control (C). A constant unique pattern of differential expression in infested (I) and wounded (W) stem samples compared to control (C) was observed. The upregulated proteins showed relative fold intensity in the order I > W, and the downregulated spots followed the trend W > I, a quite interesting pattern. This study also reveals that artificial wounding of date palm stem affects almost the same proteins as infestation; however, relative intensity is considerably lower than in infested samples in both up- and downregulated spots. All 32 differentially expressed spots were subjected to MALDI-TOF analysis for their identification, and we were able to match 21 proteins in the existing databases. The relatively significant modulation of expression of a number of peptides in infested plants suggests the possibility of developing a quick and reliable molecular methodology for detecting date palm plants infested with RPW. PMID:26287180
Rasool, Khawaja Ghulam; Khan, Muhammad Altaf; Aldawood, Abdulrahman Saad; Tufail, Muhammad; Mukhtar, Muhammad; Takeda, Makio
2015-08-17
A state-of-the-art proteomic methodology using Matrix Assisted Laser Desorption/Ionization-Time of Flight (MALDI-TOF) has been employed to characterize peptides modulated in the date palm stem subsequent to infestation with red palm weevil (RPW). Our analyses revealed 32 differentially expressed peptides associated with RPW infestation in date palm stem. To identify RPW infestation associated peptides (I), artificially wounded plants (W) were used as an additional control besides uninfested plants, a conventional control (C). A constant unique pattern of differential expression in infested (I) and wounded (W) stem samples compared to control (C) was observed. The upregulated proteins showed relative fold intensity in the order I > W, and the downregulated spots followed the trend W > I, a quite interesting pattern. This study also reveals that artificial wounding of date palm stem affects almost the same proteins as infestation; however, relative intensity is considerably lower than in infested samples in both up- and downregulated spots. All 32 differentially expressed spots were subjected to MALDI-TOF analysis for their identification, and we were able to match 21 proteins in the existing databases. The relatively significant modulation of expression of a number of peptides in infested plants suggests the possibility of developing a quick and reliable molecular methodology for detecting date palm plants infested with RPW.
Proteomic Analysis of the Cell Cycle of Procyclic Form Trypanosoma brucei.
Crozier, Thomas W M; Tinti, Michele; Wheeler, Richard J; Ly, Tony; Ferguson, Michael A J; Lamond, Angus I
2018-06-01
We describe a single-step centrifugal elutriation method to produce synchronous Gap1 (G1)-phase procyclic trypanosomes at a scale amenable for proteomic analysis of the cell cycle. Using ten-plex tandem mass tag (TMT) labeling and mass spectrometry (MS)-based proteomics technology, the expression levels of 5325 proteins were quantified across the cell cycle in this parasite. Of these, 384 proteins were classified as cell-cycle regulated and subdivided into nine clusters with distinct temporal regulation. These groups included many known cell cycle regulators in trypanosomes, which validates the approach. In addition, we identify 40 novel cell cycle regulated proteins that are essential for trypanosome survival and thus represent potential future drug targets for the prevention of trypanosomiasis. Through cross-comparison to the TrypTag endogenous tagging microscopy database, we were able to validate the cell-cycle regulated patterns of expression for many of the proteins of unknown function detected in our proteomic analysis. A convenient interface to access and interrogate these data is also presented, providing a useful resource for the scientific community. Data are available via ProteomeXchange with identifier PXD008741 (https://www.ebi.ac.uk/pride/archive/). © 2018 by The American Society for Biochemistry and Molecular Biology, Inc.
Stangeland, Biljana; Mughal, Awais A; Grieg, Zanina; Sandberg, Cecilie Jonsgar; Joel, Mrinal; Nygård, Ståle; Meling, Torstein; Murrell, Wayne; Vik Mo, Einar O; Langmoen, Iver A
2015-09-22
Glioblastoma (GBM) is both the most common and the most lethal primary brain tumor. It is thought that GBM stem cells (GSCs) are critically important in resistance to therapy. Therefore, there is a strong rationale to target these cells in order to develop new molecular therapies. To identify molecular targets in GSCs, we compared gene expression in GSCs to that in neural stem cells (NSCs) from the adult human brain, using microarrays. Bioinformatic filtering identified 20 genes (PBK/TOPK, CENPA, KIF15, DEPDC1, CDC6, DLG7/DLGAP5/HURP, KIF18A, EZH2, HMMR/RHAMM/CD168, NOL4, MPP6, MDM1, RAPGEF4, RHBDD1, FNDC3B, FILIP1L, MCC, ATXN7L4/ATXN7L1, P2RY5/LPAR6 and FAM118A) that were consistently expressed in GSC cultures and consistently not expressed in NSC cultures. The expression of these genes was confirmed in clinical samples (TCGA and REMBRANDT). The first nine genes were highly co-expressed in all GBM subtypes and were part of the same protein-protein interaction network. Furthermore, their combined up-regulation correlated negatively with patient survival in the mesenchymal GBM subtype. Using targeted proteomics and the COGNOSCENTE database we linked these genes to GBM signalling pathways. Nine genes (PBK, CENPA, KIF15, DEPDC1, CDC6, DLG7, KIF18A, EZH2 and HMMR) should be further explored as targets for treatment of GBM.
Griss, Johannes; Perez-Riverol, Yasset; Lewis, Steve; Tabb, David L.; Dianes, José A.; del-Toro, Noemi; Rurik, Marc; Walzer, Mathias W.; Kohlbacher, Oliver; Hermjakob, Henning; Wang, Rui; Vizcaíno, Juan Antonio
2016-01-01
Mass spectrometry (MS) is the main technology used in proteomics approaches. However, on average 75% of spectra analysed in an MS experiment remain unidentified. We propose to use large-scale spectrum clustering to shed light on these unidentified spectra. PRoteomics IDEntifications database (PRIDE) Archive is one of the largest MS proteomics public data repositories worldwide. By clustering all tandem MS spectra publicly available in PRIDE Archive, coming from hundreds of datasets, we were able to consistently characterize three distinct groups of spectra: 1) incorrectly identified spectra, 2) spectra correctly identified but below the set scoring threshold, and 3) truly unidentified spectra. Using a multitude of complementary analysis approaches, we were able to identify less than 20% of the consistently unidentified spectra. The complete spectrum clustering results are available through the new version of the PRIDE Cluster resource (http://www.ebi.ac.uk/pride/cluster). This resource is intended, among other aims, to encourage and simplify further investigation into these unidentified spectra. PMID:27493588
Griss, Johannes; Perez-Riverol, Yasset; Lewis, Steve; Tabb, David L; Dianes, José A; Del-Toro, Noemi; Rurik, Marc; Walzer, Mathias W; Kohlbacher, Oliver; Hermjakob, Henning; Wang, Rui; Vizcaíno, Juan Antonio
2016-08-01
Mass spectrometry (MS) is the main technology used in proteomics approaches. However, on average 75% of spectra analysed in an MS experiment remain unidentified. We propose to use large-scale spectrum clustering to shed light on these unidentified spectra. PRoteomics IDEntifications database (PRIDE) Archive is one of the largest MS proteomics public data repositories worldwide. By clustering all tandem MS spectra publicly available in PRIDE Archive, coming from hundreds of datasets, we were able to consistently characterize three distinct groups of spectra: 1) incorrectly identified spectra, 2) spectra correctly identified but below the set scoring threshold, and 3) truly unidentified spectra. Using a multitude of complementary analysis approaches, we were able to identify less than 20% of the consistently unidentified spectra. The complete spectrum clustering results are available through the new version of the PRIDE Cluster resource (http://www.ebi.ac.uk/pride/cluster). This resource is intended, among other aims, to encourage and simplify further investigation into these unidentified spectra.
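Spectrum clustering of the kind described above depends on some measure of how similar two tandem MS spectra are. The abstract does not state which measure PRIDE Cluster uses, so the following is only a generic sketch of one common choice, cosine similarity on binned peak lists. The function names and the 1.0 m/z bin width are assumptions made for illustration.

```python
import math
from collections import defaultdict

def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (bin width is an assumed value)."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return binned

def cosine_similarity(peaks_a, peaks_b, bin_width=1.0):
    """Cosine similarity of two spectra given as lists of (m/z, intensity) pairs."""
    va = bin_spectrum(peaks_a, bin_width)
    vb = bin_spectrum(peaks_b, bin_width)
    # Only bins present in both spectra contribute to the dot product.
    dot = sum(va[k] * vb[k] for k in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Two spectra with the same peaks score close to 1, and spectra with no shared bins score 0. A clustering step would then group spectra whose pairwise similarity exceeds a chosen cutoff.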
Vrana, Julie A.; Theis, Jason D.; Dasari, Surendra; Mereuta, Oana M.; Dispenzieri, Angela; Zeldenrust, Steven R.; Gertz, Morie A.; Kurtin, Paul J.; Grogg, Karen L.; Dogan, Ahmet
2014-01-01
Examination of abdominal subcutaneous fat aspirates is a practical, sensitive and specific method for the diagnosis of systemic amyloidosis. Here we describe the development and implementation of a clinical assay using mass spectrometry-based proteomics to type amyloidosis in subcutaneous fat aspirates. First, we validated the assay comparing amyloid-positive (n=43) and -negative (n=26) subcutaneous fat aspirates. The assay classified amyloidosis with 88% sensitivity and 96% specificity. We then implemented the assay as a clinical test, and analyzed 366 amyloid-positive subcutaneous fat aspirates in a 4-year period as part of routine clinical care. The assay had a sensitivity of 90%, and diverse amyloid types, including immunoglobulin light chain (74%), transthyretin (13%), serum amyloid A (1%), gelsolin (1%), and lysozyme (1%), were identified. Using bioinformatics, we identified a universal amyloid proteome signature, which has high sensitivity and specificity for amyloidosis similar to that of Congo red staining. We curated proteome databases which included variant proteins associated with systemic amyloidosis, and identified clonotypic immunoglobulin variable gene usage in immunoglobulin light chain amyloidosis, and the variant peptides in hereditary transthyretin amyloidosis. In conclusion, mass spectrometry-based proteomic analysis of subcutaneous fat aspirates offers a powerful tool for the diagnosis and typing of systemic amyloidosis. The assay reveals the underlying pathogenesis by identifying variable gene usage in immunoglobulin light chains and the variant peptides in hereditary amyloidosis. PMID:24747948
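The validation figures above follow the standard definitions of sensitivity and specificity. The abstract reports only the group sizes (43 amyloid-positive, 26 amyloid-negative) and the resulting percentages, not the underlying confusion matrix. The counts in this sketch are therefore hypothetical values chosen to be consistent with those percentages, not data from the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of truly positive samples that the assay calls positive."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of truly negative samples that the assay calls negative."""
    return true_neg / (true_neg + false_pos)

# Hypothetical counts (assumption): 38 of 43 positives and 25 of 26 negatives
# classified correctly. The abstract gives only the totals and percentages.
sens = sensitivity(true_pos=38, false_neg=5)
spec = specificity(true_neg=25, false_pos=1)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

With these assumed counts the two values round to 88% and 96%, which matches the reported validation performance.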
Adaptation of Decoy Fusion Strategy for Existing Multi-Stage Search Workflows
NASA Astrophysics Data System (ADS)
Ivanov, Mark V.; Levitsky, Lev I.; Gorshkov, Mikhail V.
2016-09-01
A number of proteomic database search engines implement multi-stage strategies aiming at increasing the sensitivity of proteome analysis. These approaches often employ a subset of the original database for the secondary stage of analysis. However, if target-decoy approach (TDA) is used for false discovery rate (FDR) estimation, the multi-stage strategies may violate the underlying assumption of TDA that false matches are distributed uniformly across the target and decoy databases. This violation occurs if the numbers of target and decoy proteins selected for the second search are not equal. Here, we propose a method of decoy database generation based on the previously reported decoy fusion strategy. This method allows unbiased TDA-based FDR estimation in multi-stage searches and can be easily integrated into existing workflows utilizing popular search engines and post-search algorithms.
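The bias described above can be made concrete with the basic target-decoy estimate, in which the number of decoy matches above a score threshold stands in for the number of false target matches. The sketch below is not the decoy fusion method itself. It only shows the plain estimate and how an unequal decoy-to-target database size ratio changes it. The function name and the `decoy_to_target_ratio` parameter are assumptions introduced for illustration.

```python
from typing import List, Tuple

def tda_fdr(psms: List[Tuple[float, bool]],
            threshold: float,
            decoy_to_target_ratio: float = 1.0) -> float:
    """Estimate FDR among target matches scoring at or above threshold.

    psms: (score, is_decoy) pairs for peptide-spectrum matches.
    decoy_to_target_ratio: assumed size of the decoy database relative to the
    target database; 1.0 corresponds to the usual equal-size assumption.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0
    # Rescale the decoy count so it estimates false hits in the target database.
    return min(1.0, (decoys / decoy_to_target_ratio) / targets)

psms = [(10.0, False), (9.0, False), (8.0, True),
        (7.0, False), (6.0, False), (5.0, True)]
print(tda_fdr(psms, threshold=6.0))                             # equal-size databases
print(tda_fdr(psms, threshold=6.0, decoy_to_target_ratio=0.5))  # decoy half as large
```

In this toy list, four target matches and one decoy match pass the threshold. If the second-stage decoy database were only half the size of the target subset, the same decoy count would imply twice as many false target matches, which is the kind of distortion an unbalanced second search introduces.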
Friso, Giulia; Giacomelli, Lisa; Ytterberg, A Jimmy; Peltier, Jean-Benoit; Rudella, Andrea; Sun, Qi; Wijk, Klaas J van
2004-02-01
An extensive analysis of the Arabidopsis thaliana peripheral and integral thylakoid membrane proteome was performed by sequential extractions with salt, detergent, and organic solvents, followed by multidimensional protein separation steps (reverse-phase HPLC and one- and two-dimensional electrophoresis gels), different enzymatic and nonenzymatic protein cleavage techniques, mass spectrometry, and bioinformatics. Altogether, 154 proteins were identified, of which 76 (49%) were alpha-helical integral membrane proteins. Twenty-seven new proteins without known function but with predicted chloroplast transit peptides were identified, of which 17 (63%) are integral membrane proteins. These new proteins, likely important in thylakoid biogenesis, include two rubredoxins, a potential metallochaperone, and a new DnaJ-like protein. The data were integrated with our analysis of the lumenal-enriched proteome. We identified 83 out of 100 known proteins of the thylakoid localized photosynthetic apparatus, including several new paralogues and some 20 proteins involved in protein insertion, assembly, folding, or proteolysis. An additional 16 proteins are involved in translation, demonstrating that the thylakoid membrane surface is an important site for protein synthesis. The high coverage of the photosynthetic apparatus and the identification of known hydrophobic proteins with low expression levels, such as cpSecE, Ohp1, and Ohp2, indicate an excellent dynamic resolution of the analysis. The sequential extraction process proved very helpful to validate transmembrane prediction. Our data also were cross-correlated to chloroplast subproteome analyses by other laboratories. All data are deposited in a new curated plastid proteome database (PPDB) with multiple search functions (http://cbsusrv01.tc.cornell.edu/users/ppdb/). This PPDB will serve as an expandable resource for the plant community.
Shteynberg, David; Deutsch, Eric W.; Lam, Henry; Eng, Jimmy K.; Sun, Zhi; Tasman, Natalie; Mendoza, Luis; Moritz, Robert L.; Aebersold, Ruedi; Nesvizhskii, Alexey I.
2011-01-01
The combination of tandem mass spectrometry and sequence database searching is the method of choice for the identification of peptides and the mapping of proteomes. Over the last several years, the volume of data generated in proteomic studies has increased dramatically, which challenges the computational approaches previously developed for these data. Furthermore, a multitude of search engines have been developed that identify different, overlapping subsets of the sample peptides from a particular set of tandem mass spectrometry spectra. We present iProphet, the new addition to the widely used open-source suite of proteomic data analysis tools Trans-Proteomics Pipeline. Applied in tandem with PeptideProphet, it provides more accurate representation of the multilevel nature of shotgun proteomic data. iProphet combines the evidence from multiple identifications of the same peptide sequences across different spectra, experiments, precursor ion charge states, and modified states. It also allows accurate and effective integration of the results from multiple database search engines applied to the same data. The use of iProphet in the Trans-Proteomics Pipeline increases the number of correctly identified peptides at a constant false discovery rate as compared with both PeptideProphet and another state-of-the-art tool Percolator. As the main outcome, iProphet permits the calculation of accurate posterior probabilities and false discovery rate estimates at the level of sequence identical peptide identifications, which in turn leads to more accurate probability estimates at the protein level. Fully integrated with the Trans-Proteomics Pipeline, it supports all commonly used MS instruments, search engines, and computer platforms. 
The performance of iProphet is demonstrated on two publicly available data sets: data from a human whole cell lysate proteome profiling experiment representative of typical proteomic data sets, and from a set of Streptococcus pyogenes experiments more representative of organism-specific composite data sets. PMID:21876204
Analysis of the pumpkin phloem proteome provides insights into angiosperm sieve tube function.
Lin, Ming-Kuem; Lee, Young-Jin; Lough, Tony J; Phinney, Brett S; Lucas, William J
2009-02-01
Increasing evidence suggests that proteins present in the angiosperm sieve tube system play an important role in the long distance signaling system of plants. To identify the nature of these putatively non-cell-autonomous proteins, we adopted a large scale proteomics approach to analyze pumpkin phloem exudates. Phloem proteins were fractionated by fast protein liquid chromatography using both anion and cation exchange columns and then either in-solution or in-gel digested following further separation by SDS-PAGE. A total of 345 LC-MS/MS data sets were analyzed using a combination of Mascot and X!Tandem against the NCBI non-redundant green plant database and an extensive Cucurbita maxima expressed sequence tag database. In this analysis, 1,209 different consensi were obtained of which 1,121 could be annotated from GenBank and BLAST search analyses against three plant species, Arabidopsis thaliana, rice (Oryza sativa), and poplar (Populus trichocarpa). Gene ontology (GO) enrichment analyses identified sets of phloem proteins that function in RNA binding, mRNA translation, ubiquitin-mediated proteolysis, and macromolecular and vesicle trafficking. Our findings indicate that protein synthesis and turnover, processes that were thought to be absent in enucleate sieve elements, likely occur within the angiosperm phloem translocation stream. In addition, our GO analysis identified a set of phloem proteins that are associated with the GO term "embryonic development ending in seed dormancy"; this finding raises the intriguing question as to whether the phloem may exert some level of control over seed development. The universal significance of the phloem proteome was highlighted by conservation of the phloem proteome in species as diverse as monocots (rice), eudicots (Arabidopsis and pumpkin), and trees (poplar). These results are discussed from the perspective of the role played by the phloem proteome as an integral component of the whole plant communication system.
Putim, Chanyanuch; Phaonakrop, Narumon; Jaresitthikunchai, Janthima; Gamngoen, Ratikorn; Tragoolpua, Khajornsak; Intorasoot, Sorasak; Anukool, Usanee; Tharincharoen, Chayada Sitthidet; Phunpae, Ponrut; Tayapiwatana, Chatchai; Kasinrerk, Watchara; Roytrakul, Sittiruk; Butr-Indr, Bordin
2018-03-01
The emergence of drug-resistant tuberculosis has generated great concern in the control of tuberculosis, and HIV/TB patients have developed severe complications that are difficult to treat. Although the gold standard of drug-susceptibility testing is highly accurate and efficient, it is time-consuming. Diagnostic biomarkers are, therefore, necessary for discriminating between infection with drug-resistant and drug-susceptible strains. One strategy that helps to control tuberculosis effectively is understanding the function of the secreted proteins that mycobacteria use to manipulate the host cellular defenses. In this study, culture filtrate proteins from Mycobacterium tuberculosis H37Rv, isoniazid-resistant, rifampicin-resistant and multidrug-resistant strains were gathered and profiled by a shotgun proteomics technique. Mass spectrometric analysis of the secreted proteome identified several proteins, of which 837, 892, 838 and 850 were found in M. tuberculosis H37Rv, isoniazid-resistant, rifampicin-resistant and multidrug-resistant strains, respectively. These proteins have been implicated in various cellular processes, including biological adhesion, biological regulation, developmental process, immune system process, localization, cellular process, cellular component organization or biogenesis, metabolic process, and response to stimulus. Analysis based on the STITCH database predicted the interaction of DNA topoisomerase I, 3-oxoacyl-(acyl-carrier protein) reductase, ESAT-6-like protein, putative prophage phiRv2 integrase, and 3-phosphoshikimate 1-carboxyvinyltransferase with isoniazid, rifampicin, pyrazinamide, ethambutol and streptomycin, suggesting putative roles in controlling the anti-tuberculosis ability. However, several proteins with no predicted interaction with any of the first-line anti-tuberculosis drugs might be used as markers for mycobacterial identification.
Woo, Sunghee; Cha, Seong Won; Na, Seungjin; ...
2014-11-17
Cancer is driven by the acquisition of somatic DNA lesions. Distinguishing the early driver mutations from subsequent passenger mutations is key to molecular sub-typing of cancers, and the discovery of novel biomarkers. The availability of genomics technologies (mainly whole-genome and exome sequencing, and transcript sampling via RNA-seq, collectively referred to as NGS) has fueled recent studies on somatic mutation discovery. However, the vision is challenged by the complexity, redundancy, and errors in genomic data, and the difficulty of investigating the proteome using only genomic approaches. Recently, combinations of proteomic and genomic technologies have been increasingly employed. However, the complexity and redundancy of NGS data remain a challenge for proteogenomics, and various trade-offs must be made to allow for the searches to take place. This paper provides a discussion of two such trade-offs, relating to large database search and FDR calculations, and their implications for cancer proteogenomics. Moreover, it extends and develops the idea of a unified genomic variant database that can be searched by any mass spectrometry sample. A total of 879 BAM files downloaded from the TCGA repository were used to create a 4.34 GB unified FASTA database which contained 2,787,062 novel splice junctions, 38,464 deletions, 1105 insertions, and 182,302 substitutions. Proteomic data from a single ovarian carcinoma sample (439,858 spectra) was searched against the database. By applying the most conservative FDR measure, we have identified 524 novel peptides and 65,578 known peptides at 1% FDR threshold. The novel peptides include interesting examples of doubly mutated peptides, frame-shifts, and non-sample-recruited mutations, which emphasize the strength of our approach.
RaftProt: mammalian lipid raft proteome database.
Shah, Anup; Chen, David; Boda, Akash R; Foster, Leonard J; Davis, Melissa J; Hill, Michelle M
2015-01-01
RaftProt (http://lipid-raft-database.di.uq.edu.au/) is a database of mammalian lipid raft-associated proteins as reported in high-throughput mass spectrometry studies. Lipid rafts are specialized membrane microdomains enriched in cholesterol and sphingolipids thought to act as dynamic signalling and sorting platforms. Given their fundamental roles in cellular regulation, there is a plethora of information on the size, composition and regulation of these membrane microdomains, including a large number of proteomics studies. To facilitate the mining and analysis of published lipid raft proteomics studies, we have developed a searchable database RaftProt. In addition to browsing the studies, performing basic queries by protein and gene names, searching experiments by cell, tissue and organisms; we have implemented several advanced features to facilitate data mining. To address the issue of potential bias due to biochemical preparation procedures used, we have captured the lipid raft preparation methods and implemented advanced search option for methodology and sample treatment conditions, such as cholesterol depletion. Furthermore, we have identified a list of high confidence proteins, and enabled searching only from this list of likely bona fide lipid raft proteins. Given the apparent biological importance of lipid raft and their associated proteins, this database would constitute a key resource for the scientific community. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Ivarsson, Ylva; Arnold, Roland; McLaughlin, Megan; Nim, Satra; Joshi, Rakesh; Ray, Debashish; Liu, Bernard; Teyra, Joan; Pawson, Tony; Moffat, Jason; Li, Shawn Shun-Cheng; Sidhu, Sachdev S; Kim, Philip M
2014-02-18
The human proteome contains a plethora of short linear motifs (SLiMs) that serve as binding interfaces for modular protein domains. Such interactions are crucial for signaling and other cellular processes, but are difficult to detect because of their low to moderate affinities. Here we developed a dedicated approach, proteomic peptide-phage display (ProP-PD), to identify domain-SLiM interactions. Specifically, we generated phage libraries containing all human and viral C-terminal peptides using custom oligonucleotide microarrays. With these libraries we screened the nine PSD-95/Dlg/ZO-1 (PDZ) domains of human Densin-180, Erbin, Scribble, and Disks large homolog 1 for peptide ligands. We identified several known and putative interactions potentially relevant to cellular signaling pathways and confirmed interactions between full-length Scribble and the target proteins β-PIX, plakophilin-4, and guanylate cyclase soluble subunit α-2 using colocalization and coimmunoprecipitation experiments. The affinities of recombinant Scribble PDZ domains and the synthetic peptides representing the C termini of these proteins were in the 1- to 40-μM range. Furthermore, we identified several well-established host-virus protein-protein interactions, and confirmed that PDZ domains of Scribble interact with the C terminus of Tax-1 of human T-cell leukemia virus with micromolar affinity. Previously unknown putative viral protein ligands for the PDZ domains of Scribble and Erbin were also identified. Thus, we demonstrate that our ProP-PD libraries are useful tools for probing PDZ domain interactions. The method can be extended to interrogate all potential eukaryotic, bacterial, and viral SLiMs and we suggest it will be a highly valuable approach for studying cellular and pathogen-host protein-protein interactions.
Chabi, Malika; Goulas, Estelle; Leclercq, Celine C; de Waele, Isabelle; Rihouey, Christophe; Cenci, Ugo; Day, Arnaud; Blervacq, Anne-Sophie; Neutelings, Godfrey; Duponchel, Ludovic; Lerouge, Patrice; Hausman, Jean-François; Renaut, Jenny; Hawkins, Simon
2017-09-01
Experimentally-generated (nanoLC-MS/MS) proteomic analyses of four different flax organs/tissues (inner-stem, outer-stem, leaves and roots) enriched in proteins from 3 different sub-compartments (soluble-, membrane-, and cell wall-proteins) were combined with publicly available data on flax seed and whole-stem proteins to generate a flax protein database containing 2996 nonredundant total proteins. Subsequent multiple analyses (MapMan, CAZy, WallProtDB and expert curation) of this database were then used to identify a flax cell wall proteome consisting of 456 nonredundant proteins localized in the cell wall and/or associated with cell wall biosynthesis, remodeling and other cell wall related processes. Examination of the proteins present in different flax organs/tissues provided a detailed overview of cell wall metabolism and highlighted the importance of hemicellulose and pectin remodeling in stem tissues. Phylogenetic analyses of proteins in the cell wall proteome revealed an important paralogy in the class IIIA xyloglucan endo-transglycosylase/hydrolase (XTH) family associated with xyloglucan endo-hydrolase activity. Immunolocalisation, FT-IR microspectroscopy, and enzymatic fingerprinting indicated that flax fiber primary/S1 cell walls contained xyloglucans with typical substituted side chains as well as glucuronoxylans in much lower quantities. These results suggest a likely central role of xyloglucans and endotransglucosylase/hydrolase activity in flax fiber formation and cell wall remodeling processes. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
NASA Astrophysics Data System (ADS)
Jabbour, Rabih E.; Wade, Mary; Deshpande, Samir V.; McCubbin, Patrick; Snyder, A. Peter; Bevilacqua, Vicky
2012-06-01
Mass spectrometry based proteomic approaches are showing promising capabilities in addressing various biological and biochemical issues. Outer membrane proteins (OMPs) are often associated with virulence in gram-negative pathogens and could prove to be excellent model biomarkers for strain level differentiation among bacteria. Whole cells and OMP extracts were isolated from pathogenic and non-pathogenic strains of Francisella tularensis, Burkholderia thailandensis, and Burkholderia mallei. OMP extracts were compared for their ability to differentiate and delineate the correct database organism to an experimental sample and for the degree of dissimilarity to the nearest-neighbor database strains. This study addresses the comparative experimental proteome analyses of OMPs vs. whole cell lysates on the strain-level discrimination among gram negative pathogenic and non-pathogenic strains.
Matsumoto, Takayuki; Hess, Sonja; Kajiyama, Hiroshi; Sakairi, Toru; Saleem, Moin A; Mathieson, Peter W; Nojima, Yoshihisa; Kopp, Jeffrey B
2010-10-01
The podocyte secretory proteome may influence the phenotype of adjacent podocytes, endothelial cells, parietal epithelial cells, and tubular epithelial cells but has not been systematically characterized. We have initiated studies to characterize this proteome, with the goal of further understanding the podocyte cell biology. We cultured differentiated conditionally immortalized human podocytes and subjected the proteins in conditioned medium to mass spectrometry. At a false discovery rate of <3%, we identified 111 candidates from conditioned medium, including 44 proteins that have signal peptides or are described as secreted proteins in the UniProt database. As validation, we confirmed that one of these proteins, insulin-like growth factor-binding protein-related protein-1 (IGFBP-rP1), was expressed in mRNA and protein of cultured podocytes. In addition, transforming growth factor-β1 stimulation increased IGFBP-rP1 in conditioned medium. We analyzed IGFBP-rP1 glomerular expression in a mouse model of human immunodeficiency virus-associated nephropathy. IGFBP-rP1 was absent from podocytes of normal mice and was expressed in podocytes and pseudocrescents of transgenic mice, where it was coexpressed with desmin, a podocyte injury marker. We conclude that IGFBP-rP1 may be a product of injured podocytes. Further analysis of the podocyte secretory proteome may identify biomarkers of podocyte injury.
PatternLab for proteomics 4.0: A one-stop shop for analyzing shotgun proteomic data
Carvalho, Paulo C; Lima, Diogo B; Leprevost, Felipe V; Santos, Marlon D M; Fischer, Juliana S G; Aquino, Priscila F; Moresco, James J; Yates, John R; Barbosa, Valmir C
2017-01-01
PatternLab for proteomics is an integrated computational environment that unifies several previously published modules for analyzing shotgun proteomic data. PatternLab contains modules for formatting sequence databases, performing peptide spectrum matching, statistically filtering and organizing shotgun proteomic data, extracting quantitative information from label-free and chemically labeled data, performing statistics for differential proteomics, displaying results in a variety of graphical formats, performing similarity-driven studies with de novo sequencing data, analyzing time-course experiments, and helping with the understanding of the biological significance of data in the light of the Gene Ontology. Here we describe PatternLab for proteomics 4.0, which closely knits together all of these modules in a self-contained environment, covering the principal aspects of proteomic data analysis as a freely available and easily installable software package. All updates to PatternLab, as well as all new features added to it, have been tested over the years on millions of mass spectra. PMID:26658470
Chen, Jing; Han, Guiqing; Shang, Chen; Li, Jikai; Zhang, Hailing; Liu, Fengqi; Wang, Jianli; Liu, Huiying; Zhang, Yuexue
2015-01-01
Cold acclimation in alfalfa (Medicago sativa L.) plays a crucial role in cold tolerance to harsh winters. To examine the cold acclimation mechanisms in freezing-tolerant alfalfa (ZD) and freezing-sensitive alfalfa (W5), holoproteins, and low-abundance proteins (after the removal of RuBisCO) from leaves were extracted to analyze differences at the protein level. A total of 84 spots were selected, and 67 spots were identified. Of these, the abundance of 49 spots and 24 spots in ZD and W5, respectively, were altered during adaptation to chilling stress. Proteomic results revealed that proteins involved in photosynthesis, protein metabolism, energy metabolism, stress and redox and other proteins were mobilized in adaptation to chilling stress. In ZD, a greater number of changes were observed in proteins, and autologous metabolism and biosynthesis were slowed in response to chilling stress, thereby reducing consumption, allowing for homeostasis. The capability for protein folding and protein biosynthesis in W5 was enhanced, which allows protection against chilling stress. The ability to perceive low temperatures was more sensitive in freezing-tolerant alfalfa compared to freezing-sensitive alfalfa. This proteomics study provides new insights into the cold acclimation mechanism in alfalfa. PMID:25774161
Schokraie, Elham; Hotz-Wagenblatt, Agnes; Warnken, Uwe; Mali, Brahim; Frohme, Marcus; Förster, Frank; Dandekar, Thomas; Hengherr, Steffen; Schill, Ralph O; Schnölzer, Martina
2010-03-03
Tardigrades are small, multicellular invertebrates which are able to survive times of unfavourable environmental conditions using their well-known capability to undergo cryptobiosis at any stage of their life cycle. Milnesium tardigradum has become a powerful model system for the analysis of cryptobiosis. While some genetic information is already available for Milnesium tardigradum, the proteome is still to be discovered. Here we present, to the best of our knowledge, the first comprehensive study of Milnesium tardigradum on the protein level. To establish a proteome reference map we developed optimized protocols for protein extraction from tardigrades in the active state and for separation of proteins by high resolution two-dimensional gel electrophoresis. Since only limited sequence information of M. tardigradum on the genome and gene expression level is available to date in public databases, we initiated in parallel a tardigrade EST sequencing project to allow for protein identification by electrospray ionization tandem mass spectrometry. 271 out of 606 analyzed protein spots could be identified by searching against the publicly available NCBInr database as well as our newly established tardigrade protein database, corresponding to 144 unique proteins. Another 150 spots could be identified in the tardigrade clustered EST database, corresponding to 36 unique contigs and ESTs. Proteins with annotated function were further categorized in more detail by their molecular function, biological process and cellular component. For the proteins of unknown function more information could be obtained by performing a protein domain annotation analysis. Our results include proteins such as members of different heat shock protein families and LEA group 3, which might play important roles in surviving extreme conditions.
The proteome reference map of Milnesium tardigradum provides the basis for further studies in order to identify and characterize the biochemical mechanisms of tolerance to extreme desiccation. The optimized proteomics workflow will enable application of sensitive quantification techniques to detect differences in protein expression, which are characteristic of the active and anhydrobiotic states of tardigrades.
Schokraie, Elham; Hotz-Wagenblatt, Agnes; Warnken, Uwe; Mali, Brahim; Frohme, Marcus; Förster, Frank; Dandekar, Thomas; Hengherr, Steffen; Schill, Ralph O.; Schnölzer, Martina
2010-01-01
Background Tardigrades are small, multicellular invertebrates that can survive unfavourable environmental conditions through their well-known capability to undergo cryptobiosis at any stage of their life cycle. Milnesium tardigradum has become a powerful model system for the analysis of cryptobiosis. While some genetic information is already available for Milnesium tardigradum, the proteome is still to be discovered. Principal Findings Here we present, to the best of our knowledge, the first comprehensive study of Milnesium tardigradum on the protein level. To establish a proteome reference map, we developed optimized protocols for protein extraction from tardigrades in the active state and for separation of proteins by high-resolution two-dimensional gel electrophoresis. Since only limited sequence information of M. tardigradum on the genome and gene expression level is available to date in public databases, we initiated in parallel a tardigrade EST sequencing project to allow for protein identification by electrospray ionization tandem mass spectrometry. Of 606 analyzed protein spots, 271 could be identified by searching against the publicly available NCBInr database as well as our newly established tardigrade protein database, corresponding to 144 unique proteins. Another 150 spots could be identified in the tardigrade clustered EST database, corresponding to 36 unique contigs and ESTs. Proteins with annotated function were further categorized in more detail by their molecular function, biological process and cellular component. For the proteins of unknown function, more information could be obtained by performing a protein domain annotation analysis. Our results include proteins such as members of different heat shock protein families and LEA group 3, which might play important roles in surviving extreme conditions.
Conclusions The proteome reference map of Milnesium tardigradum provides the basis for further studies in order to identify and characterize the biochemical mechanisms of tolerance to extreme desiccation. The optimized proteomics workflow will enable application of sensitive quantification techniques to detect differences in protein expression, which are characteristic of the active and anhydrobiotic states of tardigrades. PMID:20224743
Analysis of human serum phosphopeptidome by a focused database searching strategy.
Zhu, Jun; Wang, Fangjun; Cheng, Kai; Song, Chunxia; Qin, Hongqiang; Hu, Lianghai; Figeys, Daniel; Ye, Mingliang; Zou, Hanfa
2013-01-14
As human serum is an important source for early diagnosis of many serious diseases, the serum proteome and peptidome have been analyzed extensively. However, the serum phosphopeptidome has been less explored, probably because an effective method for database searching is lacking. The conventional database searching strategy uses the whole proteome database, which is very time-consuming for phosphopeptidome searches because of the huge search space resulting from the high redundancy of the database and the setting of dynamic modifications during searching. In this work, a focused database searching strategy using an in-house collected human serum pro-peptidome target/decoy database (HuSPep) was established. It was found that the searching time was significantly decreased without compromising the identification sensitivity. By combining size-selective Ti (IV)-MCM-41 enrichment, RP-RP off-line separation, and complementary CID and ETD fragmentation with the new searching strategy, 143 unique endogenous phosphopeptides and 133 phosphorylation sites (109 novel sites) were identified from human serum with high reliability. Copyright © 2012 Elsevier B.V. All rights reserved.
Sun, Hongyan; Cao, Fangbin; Wang, Nanbo; Zhang, Mian; Mosaddek Ahmed, Imrul; Zhang, Guoping; Wu, Feibo
2013-01-01
To reveal grain physicochemical and proteomic differences between two barley genotypes, Zhenong8 and W6nk2, with high and low grain Cd accumulation, respectively, grain profiles of ultrastructure, amino acids and proteins were compared. Results showed that W6nk2 possesses significantly lower protein content than Zhenong8, with hordein showing the greatest genotypic difference, and lower amino acid contents, with especially lower proportions of Glu, Tyr, Phe and Pro. Both scanning and transmission electron microscopy showed that the size of the A-type starch molecule in W6nk2 was considerably larger than that in Zhenong8. Grains of Zhenong8 exhibited more protein-rich deposits around starch granules, with some A-type granules having surface pits. Using 2-DE coupled with mass spectrometry, seventeen proteins with higher expression in Zhenong8 than in W6nk2 were identified in grains, including z-type serpin, serpin-Z7 and alpha-amylase/trypsin inhibitor CM, as well as proteins related to carbohydrate metabolism, protein synthesis and signal transduction. Twelve proteins were less expressed in Zhenong8 than in W6nk2, including barley trypsin inhibitor chloroform/methanol-soluble protein (BTI-CMe2.1, BTI-CMe2.2), trypsin inhibitor, dehydroascorbate reductase (DHAR), pericentrin, dynein heavy chain and some antiviral-related proteins. The data extend our understanding of the mechanisms underlying Cd accumulation/tolerance and point to the possible use of elite genetic resources in developing low-grain-Cd barley cultivars. PMID:24260165
Environmental Microbial Community Proteomics: Status, Challenges and Perspectives.
Wang, Da-Zhi; Kong, Ling-Fen; Li, Yuan-Yuan; Xie, Zhang-Xian
2016-08-05
Microbial community proteomics, also termed metaproteomics, is an emerging field within the area of microbiology, which studies the entire protein complement recovered directly from a complex environmental microbial community at a given point in time. Although it is still in its infancy, microbial community proteomics has shown its powerful potential in exploring microbial diversity, metabolic potential, ecological function and microbe-environment interactions. In this paper, we review recent advances achieved in microbial community proteomics conducted in diverse environments, such as marine and freshwater, sediment and soil, activated sludge, acid mine drainage biofilms and symbiotic communities. The challenges facing microbial community proteomics are also discussed, and we believe that microbial community proteomics will greatly enhance our understanding of the microbial world and its interactions with the environment.
Audain, Enrique; Uszkoreit, Julian; Sachsenberg, Timo; Pfeuffer, Julianus; Liang, Xiao; Hermjakob, Henning; Sanchez, Aniel; Eisenacher, Martin; Reinert, Knut; Tabb, David L; Kohlbacher, Oliver; Perez-Riverol, Yasset
2017-01-06
In mass spectrometry-based shotgun proteomics, protein identifications are usually the desired result. However, most of the analytical methods are based on the identification of reliable peptides and not the direct identification of intact proteins. Thus, assembling peptides identified from tandem mass spectra into a list of proteins, referred to as protein inference, is a critical step in proteomics research. Currently, different protein inference algorithms and tools are available for the proteomics community. Here, we evaluated five software tools for protein inference (PIA, ProteinProphet, Fido, ProteinLP, MSBayesPro) using three popular database search engines: Mascot, X!Tandem, and MS-GF+. All the algorithms were evaluated using a highly customizable KNIME workflow on four different public datasets with varying complexities (different sample preparation, species and analytical instruments). We defined a set of quality control metrics to evaluate the performance of each combination of search engine, protein inference algorithm, and parameters on each dataset. We show that the results for complex samples vary not only regarding the actual numbers of reported protein groups but also concerning the actual composition of groups. Furthermore, the robustness of reported proteins when using databases of differing complexities is strongly dependent on the applied inference algorithm. Finally, merging the identifications of multiple search engines does not necessarily increase the number of reported proteins, but does increase the number of peptides per protein and thus can generally be recommended. Protein inference is one of the major challenges in MS-based proteomics nowadays. Currently, there are a vast number of protein inference algorithms and implementations available for the proteomics community. Protein assembly affects the final results of the research, the quantitation values and the final claims in the research manuscript.
Even though protein inference is a crucial step in proteomics data analysis, a comprehensive evaluation of the many different inference methods has never been performed. Journal of Proteomics has previously published multiple benchmark studies of other bioinformatics algorithms in proteomics (PMID: 26585461; PMID: 22728601), making clear the importance of such studies for the proteomics community and the journal audience. This manuscript presents a new bioinformatics solution based on the KNIME/OpenMS platform that aims at providing a fair comparison of protein inference algorithms (https://github.com/KNIME-OMICS). Five different algorithms (ProteinProphet, MSBayesPro, ProteinLP, Fido and PIA) were evaluated using the highly customizable workflow on four public datasets with varying complexities. Three popular database search engines (Mascot, X!Tandem, MS-GF+) and combinations thereof were evaluated for every protein inference tool. In total, >186 protein lists were analyzed and carefully compared using three metrics for quality assessment of the protein inference results: 1) the number of reported proteins, 2) the number of peptides per protein, and 3) the number of uniquely reported proteins per inference method. We also examined how many proteins were reported by choosing each combination of search engines, protein inference algorithms and parameters on each dataset. The results show that 1) PIA or Fido seems to be a good choice when studying the results of the analyzed workflow, regarding not only the reported proteins and the high-quality identifications but also the required runtime; 2) merging the identifications of multiple search engines almost always gives more confident results and increases the number of peptides per protein group; 3) the usage of databases containing not only the canonical but also known isoforms of proteins has a small impact on the number of reported proteins, and, depending on the question behind the study, the detection of specific isoforms could compensate for the slightly shorter protein lists of parsimonious reports; and 4) the current workflow can be easily extended to support new algorithms and search engine combinations. Copyright © 2016. Published by Elsevier B.V.
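As an illustration of what "protein inference" and a "parsimonious report" mean in the abstract above, the following is a minimal sketch of a greedy parsimony heuristic. It is not the algorithm of PIA, ProteinProphet, Fido, ProteinLP or MSBayesPro; the function name, the input mapping and the example identifiers are assumptions made for this illustration only.

```python
from typing import Dict, List, Set


def parsimonious_inference(protein_to_peptides: Dict[str, Set[str]]) -> List[str]:
    """Greedy set cover (hypothetical helper): repeatedly pick the protein that
    explains the most still-unexplained peptides until all are explained."""
    unexplained: Set[str] = set().union(*protein_to_peptides.values())
    selected: List[str] = []
    while unexplained:
        # Iterating over sorted names makes ties resolve alphabetically.
        best = max(
            sorted(protein_to_peptides),
            key=lambda name: len(protein_to_peptides[name] & unexplained),
        )
        selected.append(best)
        unexplained -= protein_to_peptides[best]
    return selected


# Made-up example: P2 is dropped because P1 already explains its peptides.
example = {
    "P1": {"pepA", "pepB", "pepC"},
    "P2": {"pepB", "pepC"},
    "P3": {"pepC", "pepD"},
}
report = parsimonious_inference(example)  # ["P1", "P3"]
```

A greedy cover is only one way to obtain a short protein list; probabilistic tools weigh peptide scores and shared peptides instead, which is one reason the reported protein groups can differ between methods.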
PROTICdb: a web-based application to store, track, query, and compare plant proteome data.
Ferry-Dumazet, Hélène; Houel, Gwenn; Montalent, Pierre; Moreau, Luc; Langella, Olivier; Negroni, Luc; Vincent, Delphine; Lalanne, Céline; de Daruvar, Antoine; Plomion, Christophe; Zivy, Michel; Joets, Johann
2005-05-01
PROTICdb is a web-based application, mainly designed to store and analyze plant proteome data obtained by two-dimensional polyacrylamide gel electrophoresis (2-D PAGE) and mass spectrometry (MS). The purposes of PROTICdb are (i) to store, track, and query information related to proteomic experiments, i.e., from tissue sampling to protein identification and quantitative measurements, and (ii) to integrate information from the user's own expertise and other sources into a knowledge base, used to support data interpretation (e.g., for the determination of allelic variants or products of post-translational modifications). Data insertion into the relational database of PROTICdb is achieved either by uploading outputs of image analysis and MS identification software, or by filling web forms. 2-D PAGE annotated maps can be displayed, queried, and compared through a graphical interface. Links to external databases are also available. Quantitative data can be easily exported in a tabulated format for statistical analyses. PROTICdb is based on the Oracle or the PostgreSQL Database Management System and is freely available upon request at the following URL: http://moulon.inra.fr/bioinfo/PROTICdb.
Kamal, Abu Hena M; Fessler, Michael B; Chowdhury, Saiful M
2018-01-01
Macrophages are specialized phagocytes that play an essential role in inflammation, immunity, and tissue repair. Profiling the global proteomic response of macrophages to microbial molecules such as bacterial lipopolysaccharide is key to understanding fundamental mechanisms of inflammatory disease. Ethanol is a widely abused substance that has complex effects on inflammation. Reports have indicated that ethanol can activate or inhibit the lipopolysaccharide receptor, Toll-like Receptor 4, in different settings, with important consequences for liver and neurologic inflammation, but the underlying mechanisms are poorly understood. To profile the sequential effect of low-dose ethanol and lipopolysaccharide on macrophages, a gel-free proteomic technique was applied to RAW 264.7 macrophages. Five hundred four differentially expressed proteins were identified and quantified with high confidence using ≥ 5 peptide spectral matches. Among these, 319 proteins were shared across all treatment conditions, and 69 proteins were exclusively identified in ethanol-treated or lipopolysaccharide-stimulated cells. The interactive impact of ethanol and lipopolysaccharide on the macrophage proteome was evaluated using bioinformatics tools, enabling identification of differentially responsive proteins, protein interaction networks, disease- and function-based networks, canonical pathways, and upstream regulators. Candidate protein-coding genes (PGM2, ISYNA1, PARP1, and PSAP), mostly related to glucose metabolism and fatty acid synthesis pathways, were further validated by qRT-PCR. Taken together, this study describes for the first time at a systems level the interaction between ethanol and lipopolysaccharide in the proteomic programming of macrophages, and offers new mechanistic insights into the biology that may underlie the impact of ethanol on infectious and inflammatory disease in humans.
Min, Li; Cheng, Jianbo; Zhao, Shengguo; Tian, He; Zhang, Yangdong; Li, Songli; Yang, Hongjian; Zheng, Nan; Wang, Jiaqi
2016-09-02
Heat stress (HS) has an enormous economic impact on the dairy industry. In recent years, many researchers have investigated changes in the gene expression and metabolomics profiles in dairy cows caused by HS. However, the proteomics profiles of heat-stressed dairy cows have not yet been completely elucidated. We compared plasma proteomics from HS-free and heat-stressed dairy cows using an iTRAQ labeling approach. After the depletion of high abundant proteins in the plasma, 1472 proteins were identified. Of these, 85 proteins were differentially abundant in cows exposed to HS relative to HS-free. Database searches combined with GO and KEGG pathway enrichment analyses revealed that many components of the complement and coagulation cascades were altered in heat-stressed cows compared with HS-free cows. Of these, many factors in the complement system (including complement components C1, C3, C5, C6, C7, C8, and C9, complement factor B, and factor H) were down-regulated by HS, while components of the coagulation system (including coagulation factors, vitamin K-dependent proteins, and fibrinogens) were up-regulated by HS. In conclusion, our results indicate that HS decreases plasma levels of complement system proteins, suggesting that immune function is impaired in dairy cows exposed to HS. Though many aspects of heat stress (HS) have been extensively researched, relatively little is known about the proteomics profile changes that occur during heat exposure. In this work, we employed a proteomics approach to investigate differential abundance of plasma proteins in HS-free and heat-stressed dairy cows. Database searches combined with GO and KEGG pathway enrichment analyses revealed that HS resulted in a decrease in complement components, suggesting that heat-stressed dairy cows have impaired immune function. 
In addition, through integrative analyses of proteomics and previous metabolomics, we showed enhanced glycolysis, lipid metabolic pathway shifts, and nitrogen repartitioning in dairy cows exposed to HS. Our findings expand our current knowledge on the effects of HS on plasma proteomics in dairy cows and offer a new perspective for future research. Copyright © 2016 Elsevier B.V. All rights reserved.
Microbial Interactions in Plants: Perspectives and Applications of Proteomics.
Imam, Jahangir; Shukla, Pratyoosh; Mandal, Nimai Prasad; Variar, Mukund
2017-01-01
Large-scale proteomics technology is used to investigate the structure and function of proteins involved in plant-microbe interactions in complex biological samples. Since whole genome sequences are now available for several plant species and microbes, proteomics studies have become easier and more accurate, and huge amounts of data can be generated and analyzed during plant-microbe interactions. Proteomics approaches are highly relevant in many studies and have shown that genomics approaches alone are not sufficient: much significant information is lost, because the proteins, and not the genes coding for them, are the final products responsible for the observed phenotype. Novel approaches in proteomics are developing continuously, enabling the study of various aspects of the arrangement and configuration of proteins and their functions. With the advancement of new technologies, their application to plant-microbe interactions is becoming more common. They are mostly used to characterize the cellular and extracellular virulence and pathogenicity factors produced by pathogens, which helps to identify the protein-level changes in host plants when infected with pathogens and beneficial partners. This review provides a brief overview of the different proteomics technologies currently available, followed by their application to the study of plant-microbe interactions. Copyright © Bentham Science Publishers.
Helsens, Kenny; Colaert, Niklaas; Barsnes, Harald; Muth, Thilo; Flikka, Kristian; Staes, An; Timmerman, Evy; Wortelkamp, Steffi; Sickmann, Albert; Vandekerckhove, Joël; Gevaert, Kris; Martens, Lennart
2010-03-01
MS-based proteomics produces large amounts of mass spectra that require processing, identification and possibly quantification before interpretation can be undertaken. High-throughput studies require automation of these various steps, and management of the data in association with the results obtained. We here present ms_lims (http://genesis.UGent.be/ms_lims), a freely available, open-source system based on a central database to automate data management and processing in MS-driven proteomics analyses.
Newborn Mouse Lens Proteome and Its Alteration by Lysine 6 Mutant Ubiquitin
2015-01-01
Ubiquitin is a tag that often initiates degradation of proteins by the proteasome in the ubiquitin proteasome system. Targeted expression of K6W mutant ubiquitin (K6W-Ub) in the lens results in defects in lens development and cataract formation, suggesting critical functions for ubiquitin in lens. To study the developmental processes that require intact ubiquitin, we executed the most extensive characterization of the lens proteome to date. We quantified lens protein expression changes in multiple replicate pools of P1 wild-type and K6W-Ub-expressing mouse lenses. Lens proteins were digested with trypsin, peptides were separated using strong cation exchange and reversed-phase liquid chromatography, and tandem mass (MS/MS) spectra were collected with a linear ion trap. Transgenic mice that expressed low levels of K6W-Ub (low expressers) had normal, clear lenses at birth, whereas the lenses that expressed high levels of K6W-Ub (higher expressers) had abnormal lenses and cataracts at birth. A total of 2052 proteins were identified, of which 996 were reliably quantified and compared between wild-type and K6W-Ub transgenic mice. Consistent with a delayed developmental program, fiber-cell-specific proteins, such as γ-crystallins (γA, γB, γC, and γE), were down-regulated in K6W-Ub higher expressers. Up-regulated proteins were involved in energy metabolism, signal transduction, and proteolysis. The K6W-Ub low expressers exhibited delayed onset and milder cataract consistent with smaller changes in protein expression. Because lens protein expression changes occurred prior to lens morphological abnormalities and cataract formation in K6W-Ub low expressers, it appears that expression of K6W-Ub sets in motion a process of altered protein expression that results in developmental defects and cataract. PMID:24450463
Krüger, Thomas; Luo, Ting; Schmidt, Hella; Shopova, Iordana; Kniemeyer, Olaf
2015-12-14
Opportunistic human pathogenic fungi, including the saprotrophic mold Aspergillus fumigatus and the human commensal Candida albicans, can cause severe fungal infections in immunocompromised or critically ill patients. The first line of defense against opportunistic fungal pathogens is the innate immune system. Phagocytes such as macrophages, neutrophils and dendritic cells are an important pillar of the innate immune response and have evolved versatile defense strategies against microbial pathogens. On the other hand, human-pathogenic fungi have sophisticated virulence strategies to counteract the innate immune defense. In this context, proteomic approaches can provide deeper insights into the molecular mechanisms of the interaction of host immune cells with fungal pathogens. This is crucial for the identification of both diagnostic biomarkers for fungal infections and therapeutic targets. Studying host-fungal interactions at the protein level is a challenging endeavor, and only a few such studies have been undertaken. This review draws attention to proteomic techniques and their application to fungal pathogens and to challenges, difficulties, and limitations that may arise in the course of simultaneous dual proteome analysis of host immune cells interacting with diverse morphotypes of fungal pathogens. On this basis, we discuss strategies to overcome these multifaceted experimental and analytical challenges, including the viability of immune cells during co-cultivation, the increased and heterogeneous protein complexity of the host proteome dynamically interacting with the fungal proteome, and the demands on normalization strategies in terms of relative quantitative proteome analysis.
The Challenge of Human Spermatozoa Proteome: A Systematic Review.
Gilany, Kambiz; Minai-Tehrani, Arash; Amini, Mehdi; Agharezaee, Niloofar; Arjmand, Babak
2017-01-01
Currently, there are 20,197 human protein-coding genes in the most expertly curated database (UniProtKB/Swiss-Prot). Great efforts have been made by the international consortium, the Chromosome-Centric Human Proteome Project (C-HPP), and by independent researchers to map the human proteome. In brief, as of 2017 the human proteome has been outlined. The male factor contributes to 50% of infertility in couples. However, there are limited human spermatozoa proteomic studies. Firstly, the development of the mapping of the human spermatozoa proteome was analyzed. The human spermatozoa have been used as a model for missing proteins. It has been shown that human spermatozoa are excellent sources for finding missing proteins. Y chromosome proteome mapping is led by Iran. However, it seems extremely challenging to map the human spermatozoa Y chromosome proteins with current mass spectrometry-based proteomics technology. Post-translational modifications (PTMs) of the human spermatozoa proteome are the most unexplored area, and the exact role of PTMs in male infertility is currently unknown. Additionally, the state of clinical human spermatozoa proteomic analysis as of 2017 was examined in this study.
Erban, Tomas; Rybanska, Dagmar; Harant, Karel; Hortova, Bronislava; Hubert, Jan
2016-01-01
Tyrophagus putrescentiae (Schrank, 1781) is an emerging source of allergens in stored products and homes. Feces proteases are the major allergens of astigmatid mites (Acari: Acaridida). In addition, the mites are carriers of microorganisms and microbial adjuvant compounds that stimulate innate signaling pathways. We sought to analyze the mite feces proteome, proteolytic activities, and mite-bacterial interaction in dry dog food (DDF). Proteomic methods comprising enzymatic and zymographic analysis of proteases and 2D-E-MS/MS were performed. The highest protease activity was assigned to trypsin-like proteases; lower activity was assigned to chymotrypsin-like proteases, and the cysteine protease cathepsin B-like had very low activity. The 2D-E-MS/MS proteomic analysis identified mite trypsin allergen Tyr p3, fatty acid-binding protein Tyr p13 and putative mite allergens ferritin (Grp 30) and (poly)ubiquitins. Tyr p3 was detected at different positions of the 2D-E. This indicates the presence of a zymogen at basic pI, and of the mature enzyme form and an enzyme fragment at acidic pI. Bacillolysins (neutral and alkaline proteases) of the Bacillus cereus symbiont can contribute to the protease activity of the mite extract. The bacterial exo-chitinases likely contribute to degradation of mite exuviae, mite bodies or food boluses consisting of chitin, including the peritrophic membrane. Thus, the chitinases disrupt the feces and facilitate release of the allergens. B. cereus was isolated and identified based on amplification and sequencing of 16S rRNA and motB genes. B. cereus was added into high-fat, high-protein (DDF) and low-fat, low-protein (flour) diets to 1 and 5% (w/w), and the diets' palatability was evaluated in a 21-day population growth test. The supplementation of diet with B. cereus significantly suppressed population growth, and the suppressive effect was higher in the high-fat, high-protein diet than in the low-fat, low-protein food. Thus, B. cereus has to coexist with the mite in balance to be beneficial for the mite. The mite-B. cereus symbiosis can be beneficial-suppressive at some level. The results increase the veterinary and medical importance of the allergens detected in feces. The B. cereus enzymes/toxins are important components of mite allergens. The strong symbiotic association of T. putrescentiae with B. cereus in DDF was indicated. PMID:26941650
Evaluation of Proteomic Search Engines for the Analysis of Histone Modifications
2015-01-01
Identification of histone post-translational modifications (PTMs) is challenging for proteomics search engines. Including many histone PTMs in one search increases the number of candidate peptides dramatically, leading to low search speed and fewer identified spectra. To evaluate database search engines on identifying histone PTMs, we present a method in which one kind of modification is searched each time, for example, unmodified, individually modified, and multimodified, each search result is filtered with false discovery rate less than 1%, and the identifications of multiple search engines are combined to obtain confident results. We apply this method for eight search engines on histone data sets. We find that two search engines, pFind and Mascot, identify most of the confident results at a reasonable speed, so we recommend using them to identify histone modifications. During the evaluation, we also find some important aspects for the analysis of histone modifications. Our evaluation of different search engines on identifying histone modifications will hopefully help those who are hoping to enter the histone proteomics field. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium with the data set identifier PXD001118. PMID:25167464
Evaluation of proteomic search engines for the analysis of histone modifications.
Yuan, Zuo-Fei; Lin, Shu; Molden, Rosalynn C; Garcia, Benjamin A
2014-10-03
Identification of histone post-translational modifications (PTMs) is challenging for proteomics search engines. Including many histone PTMs in one search increases the number of candidate peptides dramatically, leading to low search speed and fewer identified spectra. To evaluate database search engines on identifying histone PTMs, we present a method in which one kind of modification is searched each time, for example, unmodified, individually modified, and multimodified, each search result is filtered with false discovery rate less than 1%, and the identifications of multiple search engines are combined to obtain confident results. We apply this method for eight search engines on histone data sets. We find that two search engines, pFind and Mascot, identify most of the confident results at a reasonable speed, so we recommend using them to identify histone modifications. During the evaluation, we also find some important aspects for the analysis of histone modifications. Our evaluation of different search engines on identifying histone modifications will hopefully help those who are hoping to enter the histone proteomics field. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium with the data set identifier PXD001118.
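The filtering and combining steps described above (each search result filtered at a false discovery rate below 1%, then identifications from several engines combined) can be sketched as follows. This is only an illustration under stated assumptions: it estimates FDR with a simple target-decoy ratio, assumes higher scores are better, and counts how many engines agree on a spectrum/peptide pair. The abstract does not specify these details, and the `PSM` type, function names and example data are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, NamedTuple


class PSM(NamedTuple):
    spectrum: str
    peptide: str
    score: float   # assumption: higher means a better match
    is_decoy: bool


def filter_fdr(psms: Iterable[PSM], max_fdr: float = 0.01) -> List[PSM]:
    """Keep target PSMs above the lowest score cutoff at which the
    estimated FDR (decoys / targets) is still below max_fdr."""
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = decoys = cutoff = 0
    for i, psm in enumerate(ranked, start=1):
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets < max_fdr:
            cutoff = i
    return [p for p in ranked[:cutoff] if not p.is_decoy]


def combine(results: Dict[str, List[PSM]]) -> Counter:
    """Count, per (spectrum, peptide) pair, how many engines reported it."""
    votes: Counter = Counter()
    for psms in results.values():
        for key in {(p.spectrum, p.peptide) for p in psms}:
            votes[key] += 1
    return votes


# Made-up results from two hypothetical engines with different score scales.
engine_a = [PSM("s1", "PEPA", 90, False), PSM("s2", "PEPB", 80, False),
            PSM("s9", "XXXX", 70, True), PSM("s3", "PEPC", 60, False)]
engine_b = [PSM("s1", "PEPA", 5.0, False), PSM("s3", "PEPC", 4.0, False),
            PSM("s8", "YYYY", 3.0, True)]
filtered = {"A": filter_fdr(engine_a), "B": filter_fdr(engine_b)}
votes = combine(filtered)
```

Filtering each engine separately before combining keeps the different score scales from being compared directly; identifications supported by more than one engine can then be treated as the more confident ones.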
Identification of Maturation-Specific Proteins by Single-Cell Proteomics of Human Oocytes
Virant-Klun, Irma; Leicht, Stefan; Hughes, Christopher; Krijgsveld, Jeroen
2016-01-01
Oocytes undergo a range of complex processes via oogenesis, maturation, fertilization, and early embryonic development, eventually giving rise to a fully functioning organism. To understand proteome composition and diversity during maturation of human oocytes, here we have addressed crucial aspects of oocyte collection and proteome analysis, resulting in the first proteome and secretome maps of human oocytes. Starting from 100 oocytes collected via a novel serum-free hanging drop culture system, we identified 2,154 proteins, whose functions indicate that oocytes are largely resting cells with a proteome that is tailored for homeostasis, cellular attachment, and interaction with its environment via secretory factors. In addition, we have identified 158 oocyte-enriched proteins (such as ECAT1, PIWIL3, NLRP7) not observed in high-coverage proteomics studies of other human cell lines or tissues. Exploiting SP3, a novel technology for proteomic sample preparation using magnetic beads, we scaled down proteome analysis to single cells. Despite the low protein content of only ∼100 ng per cell, we consistently identified ∼450 proteins from individual oocytes. When comparing individual oocytes at the germinal vesicle (GV) and metaphase II (MII) stage, we found that the Tudor and KH domain-containing protein (TDRKH) is preferentially expressed in immature oocytes, while Wee2, PCNA, and DNMT1 were enriched in mature cells, collectively indicating that maintenance of genome integrity is crucial during oocyte maturation. This study demonstrates that an innovative proteomics workflow facilitates analysis of single human oocytes to investigate human oocyte biology and preimplantation development. The approach presented here paves the way for quantitative proteomics in other quantity-limited tissues and cell types. Data associated with this study are available via ProteomeXchange with identifier PXD004142. PMID:27215607
Identification of Maturation-Specific Proteins by Single-Cell Proteomics of Human Oocytes.
Virant-Klun, Irma; Leicht, Stefan; Hughes, Christopher; Krijgsveld, Jeroen
2016-08-01
Oocytes undergo a range of complex processes via oogenesis, maturation, fertilization, and early embryonic development, eventually giving rise to a fully functioning organism. To understand proteome composition and diversity during maturation of human oocytes, here we have addressed crucial aspects of oocyte collection and proteome analysis, resulting in the first proteome and secretome maps of human oocytes. Starting from 100 oocytes collected via a novel serum-free hanging drop culture system, we identified 2,154 proteins, whose functions indicate that oocytes are largely resting cells with a proteome that is tailored for homeostasis, cellular attachment, and interaction with its environment via secretory factors. In addition, we have identified 158 oocyte-enriched proteins (such as ECAT1, PIWIL3, NLRP7) not observed in high-coverage proteomics studies of other human cell lines or tissues. Exploiting SP3, a novel technology for proteomic sample preparation using magnetic beads, we scaled down proteome analysis to single cells. Despite the low protein content of only ∼100 ng per cell, we consistently identified ∼450 proteins from individual oocytes. When comparing individual oocytes at the germinal vesicle (GV) and metaphase II (MII) stage, we found that the Tudor and KH domain-containing protein (TDRKH) is preferentially expressed in immature oocytes, while Wee2, PCNA, and DNMT1 were enriched in mature cells, collectively indicating that maintenance of genome integrity is crucial during oocyte maturation. This study demonstrates that an innovative proteomics workflow facilitates analysis of single human oocytes to investigate human oocyte biology and preimplantation development. The approach presented here paves the way for quantitative proteomics in other quantity-limited tissues and cell types. Data associated with this study are available via ProteomeXchange with identifier PXD004142. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
Weckwerth, Wolfram; Wienkoop, Stefanie; Hoehenwarter, Wolfgang; Egelhofer, Volker; Sun, Xiaoliang
2014-01-01
Genome sequencing and systems biology are revolutionizing life sciences. Proteomics emerged as a fundamental technique of this novel research area as it is the basis for gene function analysis and modeling of dynamic protein networks. Here a complete proteomics platform suited for functional genomics and systems biology is presented. The strategy includes MAPA (mass accuracy precursor alignment; http://www.univie.ac.at/mosys/software.html ) as a rapid exploratory analysis step; MASS WESTERN for targeted proteomics; COVAIN ( http://www.univie.ac.at/mosys/software.html ) for multivariate statistical analysis, data integration, and data mining; and PROMEX ( http://www.univie.ac.at/mosys/databases.html ) as a database module for proteogenomics and proteotypic peptides for targeted analysis. Moreover, the presented platform can also be utilized to integrate metabolomics and transcriptomics data for the analysis of metabolite-protein-transcript correlations and time course analysis using COVAIN. Examples for the integration of MAPA and MASS WESTERN data, proteogenomic and metabolic modeling approaches for functional genomics, phosphoproteomics by integration of MOAC (metal-oxide affinity chromatography) with MAPA, and the integration of metabolomics, transcriptomics, proteomics, and physiological data using this platform are presented. All software and step-by-step tutorials for data processing and data mining can be downloaded from http://www.univie.ac.at/mosys/software.html.
Boyanova, Desislava; Nilla, Santosh; Klau, Gunnar W.; Dandekar, Thomas; Müller, Tobias; Dittrich, Marcus
2014-01-01
The continuously evolving field of proteomics produces increasing amounts of data while improving the quality of protein identifications. Albeit quantitative measurements are becoming more popular, many proteomic studies are still based on non-quantitative methods for protein identification. These studies result in potentially large sets of identified proteins, where the biological interpretation of proteins can be challenging. Systems biology develops innovative network-based methods, which allow an integrated analysis of these data. Here we present a novel approach, which combines prior knowledge of protein-protein interactions (PPI) with proteomics data using functional similarity measurements of interacting proteins. This integrated network analysis exactly identifies network modules with a maximal consistent functional similarity reflecting biological processes of the investigated cells. We validated our approach on small (H9N2 virus-infected gastric cells) and large (blood constituents) proteomic data sets. Using this novel algorithm, we identified characteristic functional modules in virus-infected cells, comprising key signaling proteins (e.g. the stress-related kinase RAF1) and demonstrate that this method allows a module-based functional characterization of cell types. Analysis of a large proteome data set of blood constituents resulted in clear separation of blood cells according to their developmental origin. A detailed investigation of the T-cell proteome further illustrates how the algorithm partitions large networks into functional subnetworks each representing specific cellular functions. These results demonstrate that the integrated network approach not only allows a detailed analysis of proteome networks but also yields a functional decomposition of complex proteomic data sets and thereby provides deeper insights into the underlying cellular processes of the investigated system. PMID:24807868
Tomazic, Peter Valentin; Birner-Gruenberger, Ruth; Leitner, Anita; Obrist, Britta; Spoerk, Stefan; Lang-Loidolt, Doris
2014-03-01
Nasal mucus is the first-line defense barrier against (aero-) allergens. However, its proteome and function have not been clearly investigated. The role of nasal mucus in the pathophysiology of allergic rhinitis was investigated by analyzing its proteome in patients with allergic rhinitis (n = 29) and healthy control subjects (n = 29). Nasal mucus was collected with a suction device, tryptically digested, and analyzed by using liquid chromatography-tandem mass spectrometry. Proteins were identified by searching the SwissProt database and annotated by collecting gene ontology data from databases and existing literature. Gene enrichment analysis was performed by using Cytoscape/BINGO software tools. Proteins were quantified with spectral counting, and selected proteins were confirmed by means of Western blotting. In total, 267 proteins were identified, with 20 (7.5%) found exclusively in patients with allergic rhinitis and 25 (9.5%) found exclusively in healthy control subjects. Five proteins were found to be significantly upregulated in patients with allergic rhinitis (apolipoprotein A-2 [APOA2], 9.7-fold; α2-macroglobulin [A2M], 4.5-fold; apolipoprotein A-1 [APOA1], 3.2-fold; α1-antitrypsin [SERPINA1], 2.5-fold; and complement C3 [C3], 2.3-fold) and 5 were found to be downregulated (antileukoproteinase [SLPI], 0.6-fold; WAP 4-disulfide core domain protein [WFDC2], 0.5-fold; haptoglobin [HP], 0.7-fold; IgJ chain [IGJ], 0.7-fold; and Ig hc V-III region BRO, 0.8-fold) compared with levels seen in healthy control subjects. The allergic rhinitis mucus proteome shows an enhanced immune response in which apolipoproteins might play an important role. Furthermore, an imbalance between cysteine proteases and antiproteases could be seen, which negatively affects epithelial integrity on exposure to pollen protease activity. This reflects the important role of mucus as the first-line defense barrier against allergens. 
Copyright © 2013 American Academy of Allergy, Asthma & Immunology. Published by Mosby, Inc. All rights reserved.
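The fold changes quoted in the nasal mucus abstract above were obtained by spectral counting. The following is a minimal sketch of how such a ratio could be computed, assuming each run is normalised by its total spectral count (the abstract does not state which normalisation was used). The function names and the counts in the usage example are hypothetical and are not data from the study.

```python
from statistics import mean


def normalized_counts(run):
    """Scale each protein's spectral count by the total count of the run."""
    total = sum(run.values())
    return {protein: count / total for protein, count in run.items()}


def fold_change(case_runs, control_runs, protein):
    """Ratio of mean normalised spectral counts, case over control.

    Proteins missing from a run are treated as having zero counts
    (an assumption of this sketch). Returns infinity when the protein
    is absent from every control run.
    """
    case = mean(normalized_counts(r).get(protein, 0.0) for r in case_runs)
    control = mean(normalized_counts(r).get(protein, 0.0) for r in control_runs)
    if control == 0:
        return float("inf")
    return case / control


# Hypothetical spectral counts for one patient run and one control run.
patients = [{"APOA2": 30, "SLPI": 70}]
controls = [{"APOA2": 10, "SLPI": 90}]
ratio = fold_change(patients, controls, "APOA2")
```

With these illustrative numbers, APOA2 accounts for 30% of the spectra in the patient run and 10% in the control run, so the ratio is about 3. A value below 1 would correspond to the downregulated proteins listed in the abstract.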
Seo, Moon-Hyeong; Nim, Satra; Jeon, Jouhyun; Kim, Philip M
2017-01-01
Protein-protein interactions are essential to cellular functions and signaling pathways. We recently combined bioinformatics and custom oligonucleotide arrays to construct custom-made peptide-phage libraries for screening peptide-protein interactions, an approach we call proteomic peptide-phage display (ProP-PD). In this chapter, we describe protocols for phage display for the identification of natural peptide binders for a given protein. We finally describe deep sequencing for the analysis of the proteomic peptide-phage display.
Bernal, Dolores; Trelis, Maria; Montaner, Sergio; Cantalapiedra, Fernando; Galiano, Alicia; Hackenberg, Michael; Marcilla, Antonio
2014-06-13
With the aim of characterizing the molecules involved in the interaction of Dicrocoelium dendriticum adults and the host, we have performed proteomic analyses of the external surface of the parasite using the currently available datasets including the transcriptome of the related species Echinostoma caproni. We have identified 182 parasite proteins on the outermost surface of D. dendriticum. The presence of exosome-like vesicles in the ESP of D. dendriticum and their components has also been characterized. Using proteomic approaches, we have characterized 84 proteins in these vesicles. Interestingly, we have detected miRNA in D. dendriticum exosomes, thus representing the first report of miRNA in helminth exosomes. In order to identify potential targets for intervention against parasitic helminths, we have analyzed the surface of the parasitic helminth Dicrocoelium dendriticum. Along with the proteomic analyses of the outermost layer of the parasite, our work describes the molecular characterization of the exosomes of D. dendriticum. Our proteomic data confirm the improvement of protein identification from "non-model organisms" like helminths, when using different search engines against a combination of available databases. In addition, this work represents the first report of miRNAs in parasitic helminth exosomes. These vesicles can pack specific proteins and RNAs providing stability and resistance to RNAse digestion in body fluids, and provide a way to regulate host-parasite interplay. The present data should provide a solid foundation for the development of novel methods to control this non-model organism and related parasites. This article is part of a Special Issue entitled: Proteomics of non-model organisms. Copyright © 2014 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mu, Jun; Institute of Neuroscience and the Collaborative Innovation Center for Brain Science, Chongqing Medical University, Chongqing; Chongqing Key Laboratory of Neurobiology, Chongqing
Purpose: Tuberculous meningitis (TBM) remains one of the most deadly infectious diseases. The pathogen interacts with the host immune system through a process that is largely unknown. Various cellular processes of Mycobacterium tuberculosis (MTB) center around lipid metabolism. To determine the lipid metabolism-related proteins, a quantitative proteomic study was performed to identify differential proteins in the cerebrospinal fluid (CSF) obtained from TBM patients (n = 12) and healthy controls (n = 12). Methods: CSF samples were desalted, concentrated, labelled with isobaric tags for relative and absolute quantitation (iTRAQ™), and analyzed by multi-dimensional liquid chromatography-tandem mass spectrometry (LC-MS/MS). Gene ontology and proteomic phenotyping analysis of the differential proteins were conducted using Database for Annotation, Visualization, and Integrated Discovery (DAVID) Bioinformatics Resources. ApoE and ApoB were selected for validation by ELISA. Results: Proteomic phenotyping indicated that 4 differential proteins were involved in lipid metabolism. ELISA showed significantly increased ApoB levels in TBM subjects compared to healthy controls. Area under the receiver operating characteristic curve analysis demonstrated that ApoB levels could distinguish TBM subjects from healthy controls and viral meningitis subjects with 89.3% sensitivity and 92% specificity. Conclusions: CSF lipid metabolism dysregulation, especially elevated expression of ApoB, gives insights into the pathogenesis of TBM. Further evaluation of these findings in larger studies including anti-tuberculosis medicated and unmedicated patient cohorts with other central nervous system infectious diseases is required for successful clinical translation.
Highlights:
• The first proteomic study on the cerebrospinal fluid of tuberculous meningitis patients using iTRAQ.
• Identification of 4 differential proteins involved in lipid metabolism.
• Elevated expression of ApoB gives insights into the pathogenesis of TBM.
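The sensitivity and specificity figures quoted for ApoB follow from a threshold applied to the measured levels. The sketch below shows the arithmetic only, assuming that a level at or above the cutoff is called positive. The function name, the cutoff and the values are hypothetical, and the abstract does not report the cutoff the authors used.

```python
def sensitivity_specificity(levels, is_case, cutoff):
    """Return (sensitivity, specificity) for the rule 'level >= cutoff is positive'.

    levels  -- measured marker levels, one per subject
    is_case -- True for diseased subjects, False for controls
    cutoff  -- decision threshold (hypothetical in this sketch)
    """
    pairs = list(zip(levels, is_case))
    tp = sum(1 for v, case in pairs if case and v >= cutoff)
    fn = sum(1 for v, case in pairs if case and v < cutoff)
    tn = sum(1 for v, case in pairs if not case and v < cutoff)
    fp = sum(1 for v, case in pairs if not case and v >= cutoff)
    # Sensitivity is the fraction of cases detected,
    # specificity the fraction of controls correctly rejected.
    return tp / (tp + fn), tn / (tn + fp)


# Illustrative values only, not data from the study.
levels = [5.0, 4.0, 1.0, 2.0, 3.5]
is_case = [True, True, False, False, False]
sens, spec = sensitivity_specificity(levels, is_case, cutoff=3.0)
```

Sweeping the cutoff over all observed levels and plotting sensitivity against (1 - specificity) gives the receiver operating characteristic curve mentioned in the abstract.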
Chen, Yi-Wen; Teng, Ching-Hao; Ho, Yu-Hsuan; Jessica Ho, Tien Yu; Huang, Wen-Chun; Hashimoto, Masayuki; Chiang, I-Yuan; Chen, Chien-Sheng
2014-01-01
Type 1 fimbriae are filamentous structures on Escherichia coli. These structures are important adherence factors. Because binding to the host cells is the first step of infection, type 1 fimbria is an important virulence factor of pathogenic E. coli. Expression of type 1 fimbria is regulated by a phase variation in which each individual bacterium can alternate between fimbriated (phase-ON) and nonfimbriated (phase-OFF) states. The phase variation is regulated by the flipping of the 314-bp fimS fragment, which contains the promoter driving the expression of the genes required for the synthesis of type 1 fimbria. Thus, the bacterial proteins able to interact with fimS are likely to be involved in regulating the expression of type 1 fimbria. To identify novel type 1 fimbria-regulating factors, we used an E. coli K12 proteome chip to screen for the bacterial factors able to interact with a 602-bp DNA fragment containing fimS and its adjacent regions. The Spr protein was identified by the proteome chip-based screening and further confirmed to be able to interact with fimS by electrophoretic mobility shift assay. Deletion of spr in the neonatal meningitis E. coli strain RS218 significantly increased the ratio of the bacterial colonies that contained the type 1 fimbria phase-ON cells on agar plates. In addition, Spr interfered with the interactions of fimS with the site-specific recombinases, FimB and FimE, which are responsible for mediating the flipping of fimS. These results suggest that Spr is involved in the regulation of type 1 fimbria expression through direct interaction with the invertible element fimS. These findings facilitate our understanding of the regulation of type 1 fimbria. PMID:24692643
NASA Astrophysics Data System (ADS)
Sharma, Om Prakash; Kumar, Muthuvel Suresh
2016-01-01
Lymphatic filariasis (Lf) is one of the oldest and most debilitating tropical diseases. It is estimated to infect over 120 million people in at least 80 nations throughout the tropical and subtropical regions of the world, and more than one billion people are at risk of this life-threatening disease. Several studies have suggested emerging limitations of, and resistance to, the available drugs and therapeutic targets for Lf. Therefore, better medicines and drug targets are in demand. We took the initiative to identify the essential proteins of the Wolbachia endosymbiont of Brugia malayi (wBm) that are indispensable for its survival and non-homologous to human host proteins. In this study, we used a proteome subtractive approach to screen for possible therapeutic targets in wBm. In addition, the literature was mined extensively for potential drug targets, drugs, epitopes, crystal structures, and expressed sequence tag (EST) sequences for filariasis-causing nematodes. Data obtained from our study are presented in a user-friendly database named FiloBase. We hope that the information stored in this database may be used for further research and drug development against filariasis. URL: http://filobase.bicpu.edu.in.
Rubiano-Labrador, Carolina; Bland, Céline; Miotello, Guylaine; Guérin, Philippe; Pible, Olivier; Baena, Sandra; Armengaud, Jean
2014-01-31
Tistlia consotensis is a halotolerant Rhodospirillaceae that was isolated from a saline spring located in the Colombian Andes with a salt concentration close to seawater (4.5% w/vol). We cultivated this microorganism in three NaCl concentrations, i.e. optimal (0.5%), without (0.0%) and high (4.0%) salt concentration, and analyzed its cellular proteome. For assigning tandem mass spectrometry data, we first sequenced its genome and constructed a six reading frame ORF database from the draft sequence. We annotated only the genes whose products (872) were detected. We compared the quantitative proteome data sets recorded for the three different growth conditions. At low salinity, general stress proteins (chaperones, proteases and proteins associated with oxidative stress protection) were detected in higher amounts, probably linked to difficulties for proper protein folding and metabolism. Proteogenomics and comparative genomics pointed at the CrgA transcriptional regulator as a key factor for the proteome remodeling upon low osmolarity. Under hyper-osmotic conditions, T. consotensis produced larger amounts of proteins involved in the sensing of changes in salt concentration, as well as a wide panel of transport systems for the transport of organic compatible solutes such as glutamate. We have described here a straightforward procedure for making a new environmental isolate quickly amenable to proteomics. The bacterium Tistlia consotensis was isolated from a saline spring in the Colombian Andes and represents an interesting environmental model to be compared with extremophiles or other moderate organisms. To explore the halotolerance molecular mechanisms of the bacterium T. consotensis, we developed an innovative proteogenomic strategy consisting of i) genome sequencing, ii) quick annotation of the genes whose products were detected by mass spectrometry, and iii) comparative proteomics of cells grown in three salt conditions.
We highlighted in this manuscript how efficient such an approach can be compared to time-consuming genome annotation when pointing at the key proteins of a given biological question. We documented a large number of proteins found produced in greater amounts when cells are cultivated in either hypo-osmotic or hyper-osmotic conditions. This article is part of a Special Issue entitled: Trends in Microbial Proteomics. Copyright © 2013 Elsevier B.V. All rights reserved.
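The six reading frame ORF database mentioned in the Tistlia consotensis abstract can be illustrated with a short sketch. It assumes the standard genetic code and stop-to-stop ORFs, and the `min_len` parameter and all function names are hypothetical. The authors' actual tooling and thresholds are not described in the abstract.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. containing N) become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_orfs(dna, min_len=2):
    """List (strand, frame, peptide) for every stop-to-stop segment in all six frames."""
    dna = dna.upper()
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            protein = translate(seq[frame:])
            # '*' marks stop codons; the pieces between stops are candidate ORFs.
            for segment in protein.split("*"):
                if len(segment) >= min_len:
                    orfs.append((strand, frame, segment))
    return orfs


orfs = six_frame_orfs("ATGGCCTAA")
```

Writing each returned peptide as a FASTA entry would give a search database that does not depend on prior gene annotation, which is the point of the approach described in the abstract. A realistic `min_len` would be much larger than the value used here for illustration.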
Proteome Studies of Filamentous Fungi
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baker, Scott E.; Panisko, Ellen A.
2011-04-20
The continued fast pace of fungal genome sequence generation has enabled proteomic analysis of a wide variety of organisms that span the breadth of the Kingdom Fungi. There is some phylogenetic bias to the current catalog of fungi with reasonable DNA sequence databases (genomic or EST) that could be analyzed at a global proteomic level. However, the rapid development of next generation sequencing platforms has lowered the cost of genome sequencing such that in the near future, having a genome sequence will no longer be a time or cost bottleneck for downstream proteomic (and transcriptomic) analyses. High throughput, non-gel based proteomics offers a snapshot of proteins present in a given sample at a single point in time. There are a number of different variations on the general method and technologies for identifying peptides in a given sample. We present a method that can serve as a “baseline” for proteomic studies of fungi.
Proteome studies of filamentous fungi.
Baker, Scott E; Panisko, Ellen A
2011-01-01
The continued fast pace of fungal genome sequence generation has enabled proteomic analysis of a wide variety of organisms that span the breadth of the Kingdom Fungi. There is some phylogenetic bias to the current catalog of fungi with reasonable DNA sequence databases (genomic or EST) that could be analyzed at a global proteomic level. However, the rapid development of next generation sequencing platforms has lowered the cost of genome sequencing such that in the near future, having a genome sequence will no longer be a time or cost bottleneck for downstream proteomic (and transcriptomic) analyses. High throughput, nongel-based proteomics offers a snapshot of proteins present in a given sample at a single point in time. There are a number of variations on the general methods and technologies for identifying peptides in a given sample. We present a method that can serve as a "baseline" for proteomic studies of fungi.
Proteomic analysis on roots of Oenothera glazioviana under copper-stress conditions.
Wang, Chong; Wang, Jie; Wang, Xiao; Xia, Yan; Chen, Chen; Shen, Zhenguo; Chen, Yahua
2017-09-06
Proteomic studies were performed to identify proteins involved in the response of Oenothera glazioviana seedlings under Cu stress. Exposure of 28-d-old seedlings to 50 μM CuSO4 for 3 d led to inhibition of shoot and root growth as well as a considerable increase in the level of lipid peroxidation in the roots. Cu absorbed by O. glazioviana accumulated more easily in the root than in the shoot. Label-free proteomic analysis indicated 58 differentially abundant proteins (DAPs) of the total 3,149 proteins in the roots of O. glazioviana seedlings, of which 36 were upregulated and 22 were downregulated under Cu stress conditions. Gene Ontology analysis showed that most of the identified proteins could be annotated to signal transduction, detoxification, stress defence, carbohydrate, energy, and protein metabolism, development, and oxidoreduction. We also retrieved 13 proteins from the enriched Kyoto Encyclopaedia of Genes and Genomes and the protein-protein interaction databases related to various pathways, including the citric acid (CA) cycle. Application of exogenous CA to O. glazioviana seedlings exposed to Cu alleviated the stress symptoms. Overall, this study provided new insights into the molecular mechanisms of plant response to Cu at the protein level in relation to soil properties.
MinOmics, an Integrative and Immersive Tool for Multi-Omics Analysis.
Maes, Alexandre; Martinez, Xavier; Druart, Karen; Laurent, Benoist; Guégan, Sean; Marchand, Christophe H; Lemaire, Stéphane D; Baaden, Marc
2018-06-21
Proteomic and transcriptomic technologies resulted in massive biological datasets, their interpretation requiring sophisticated computational strategies. Efficient and intuitive real-time analysis remains challenging. We use proteomic data on 1417 proteins of the green microalga Chlamydomonas reinhardtii to investigate physicochemical parameters governing selectivity of three cysteine-based redox post translational modifications (PTM): glutathionylation (SSG), nitrosylation (SNO) and disulphide bonds (SS) reduced by thioredoxins. We aim to understand underlying molecular mechanisms and structural determinants through integration of redox proteome data from gene- to structural level. Our interactive visual analytics approach on an 8.3 m2 display wall of 25 MPixel resolution features stereoscopic three dimensions (3D) representation performed by UnityMol WebGL. Virtual reality headsets complement the range of usage configurations for fully immersive tasks. Our experiments confirm that fast access to a rich cross-linked database is necessary for immersive analysis of structural data. We emphasize the possibility to display complex data structures and relationships in 3D, intrinsic to molecular structure visualization, but less common for omics-network analysis. Our setup is powered by MinOmics, an integrated analysis pipeline and visualization framework dedicated to multi-omics analysis. MinOmics integrates data from various sources into a materialized physical repository. We evaluate its performance, a design criterion for the framework.
Proteomic analysis of human follicular fluid associated with successful in vitro fertilization.
Shen, Xiaofang; Liu, Xin; Zhu, Peng; Zhang, Yuhua; Wang, Jiahui; Wang, Yanwei; Wang, Wenting; Liu, Juan; Li, Ning; Liu, Fujun
2017-07-27
Human follicular fluid (HFF) provides a key environment for follicle development and oocyte maturation, and contributes to oocyte quality and in vitro fertilization (IVF) outcome. To better understand folliculogenesis in the ovary, a proteomic strategy based on dual reverse phase high performance liquid chromatography (RP-HPLC) coupled to matrix-assisted laser desorption/ionization time-of-flight tandem mass spectrometry (LC-MALDI TOF/TOF MS) was used to investigate the protein profile of HFF from women undergoing successful IVF. A total of 219 unique high-confidence (False Discovery Rate (FDR) < 0.01) HFF proteins were identified by searching the reviewed Swiss-Prot human database (20,183 sequences), and MS data were further verified by western blot. PANTHER showed that HFF proteins were involved in the complement and coagulation cascade, growth factor and hormone activity, immunity, and transport; KEGG indicated their pathways; and STRING demonstrated their interaction networks. By comparison, 32% and 50% of these proteins had not been reported in previous studies of human follicular fluid and plasma, respectively. Our HFF proteome research provides a new complementary high-confidence dataset for the environment of folliculogenesis and oocyte maturation. The proteins associated with innate immunity, complement cascade, blood coagulation, and angiogenesis might serve as biomarkers of female infertility and IVF outcome, and their pathways help give a more complete picture of the reproductive process.
M2Lite: An Open-source, Light-weight, Pluggable and Fast Proteome Discoverer MSF to mzIdentML Tool.
Aiyetan, Paul; Zhang, Bai; Chen, Lily; Zhang, Zhen; Zhang, Hui
2014-04-28
Proteome Discoverer is one of many tools used for protein database search and peptide to spectrum assignment in mass spectrometry-based proteomics. However, the inadequacy of conversion tools makes it challenging to compare and integrate its results to those of other analytical tools. Here we present M2Lite, an open-source, light-weight, easily pluggable and fast conversion tool. M2Lite converts proteome discoverer derived MSF files to the proteomics community defined standard - the mzIdentML file format. M2Lite's source code is available as open-source at https://bitbucket.org/paiyetan/m2lite/src and its compiled binaries and documentation can be freely downloaded at https://bitbucket.org/paiyetan/m2lite/downloads.
Davidi, Lital; Levin, Yishai; Ben-Dor, Shifra; Pick, Uri
2015-01-01
The halotolerant green alga Dunaliella bardawil is unique in that it accumulates under stress two types of lipid droplets: cytoplasmatic lipid droplets (CLD) and β-carotene-rich (βC) plastoglobuli. Recently, we isolated and analyzed the lipid and pigment compositions of these lipid droplets. Here, we describe their proteome analysis. A contamination filter and an enrichment filter were utilized to define core proteins. A proteome database of Dunaliella salina/D. bardawil was constructed to aid the identification of lipid droplet proteins. A total of 124 and 42 core proteins were identified in βC-plastoglobuli and CLD, respectively, with only eight common proteins. Dunaliella spp. CLD resemble cytoplasmic droplets from Chlamydomonas reinhardtii and contain major lipid droplet-associated protein and enzymes involved in lipid and sterol metabolism. The βC-plastoglobuli proteome resembles the C. reinhardtii eyespot and Arabidopsis (Arabidopsis thaliana) plastoglobule proteomes and contains carotene-globule-associated protein, plastid-lipid-associated protein-fibrillins, SOUL heme-binding proteins, phytyl ester synthases, β-carotene biosynthesis enzymes, and proteins involved in membrane remodeling/lipid droplet biogenesis: VESICLE-INDUCING PLASTID PROTEIN1, synaptotagmin, and the eyespot assembly proteins EYE3 and SOUL3. Based on these and previous results, we propose models for the biogenesis of βC-plastoglobuli and the biosynthesis of β-carotene within βC-plastoglobuli and hypothesize that βC-plastoglobuli evolved from eyespot lipid droplets. PMID:25404729
Verberkmoes, Nathan C; Hervey, W Judson; Shah, Manesh; Land, Miriam; Hauser, Loren; Larimer, Frank W; Van Berkel, Gary J; Goeringer, Douglas E
2005-02-01
There is currently a great need for rapid detection and positive identification of biological threat agents, as well as microbial species in general, directly from complex environmental samples. This need is most urgent in the area of homeland security, but also extends into medical, environmental, and agricultural sciences. Mass-spectrometry-based analysis is one of the leading technologies in the field with a diversity of different methodologies for biothreat detection. Over the past few years, "shotgun" proteomics has become one method of choice for the rapid analysis of complex protein mixtures by mass spectrometry. Recently, it was demonstrated that this methodology is capable of distinguishing a target species against a large database of background species from a single-component sample or dual-component mixtures with relatively the same concentration. Here, we examine the potential of shotgun proteomics to analyze a target species in a background of four contaminant species. We tested the capability of a common commercial mass-spectrometry-based shotgun proteomics platform for the detection of the target species (Escherichia coli) at four different concentrations and four different time points of analysis. We also tested the effect of database size on positive identification of the four microbes used in this study by testing a small (13-species) database and a large (261-species) database. The results clearly indicated that this technology could easily identify the target species at 20% in the background mixture at a 60, 120, 180, or 240 min analysis time with the small database. The results also indicated that the target species could easily be identified at 20% or 6% but could not be identified at 0.6% or 0.06% in either a 240 min analysis or a 30 h analysis with the small database.
The effects of the large database were severe on the target species where detection above the background at any concentration used in this study was impossible, though the three other microbes used in this study were clearly identified above the background when analyzed with the large database. This study points to the potential application of this technology for biological threat agent detection but highlights many areas of needed research before the technology will be useful in real world samples.
In silico Analysis of Toxins of Staphylococcus aureus for Validating Putative Drug Targets.
Mohana, Ramadevi; Venugopal, Subhashree
2017-01-01
Toxins are among the numerous virulence factors produced by bacteria. These powerful poisonous substances enable the bacteria to counter the defense mechanisms of the human body. The pathogenic system of Staphylococcus aureus has evolved with various exotoxins that cause detrimental effects on the human immune system. Four toxins, namely enterotoxin A, exfoliative toxin A, TSST-1 and γ-hemolysin, were downloaded from the UniProt database and analyzed to understand the nature of the toxins and for drug target validation. The results indicated that the toxins interact with many protein partners and that no homologous sequences were found in the human proteome; based on a similarity search in DrugBank, the targets were identified as novel drug targets. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Halligan, Brian D.; Geiger, Joey F.; Vallejos, Andrew K.; Greene, Andrew S.; Twigger, Simon N.
2009-01-01
One of the major difficulties for many laboratories setting up proteomics programs has been obtaining and maintaining the computational infrastructure required for the analysis of the large flow of proteomics data. We describe a system that combines distributed cloud computing and open source software to allow laboratories to set up scalable virtual proteomics analysis clusters without the investment in computational hardware or software licensing fees. Additionally, the pricing structure of distributed computing providers, such as Amazon Web Services, allows laboratories or even individuals to have large-scale computational resources at their disposal at a very low cost per run. We provide detailed step by step instructions on how to implement the virtual proteomics analysis clusters as well as a list of current available preconfigured Amazon machine images containing the OMSSA and X!Tandem search algorithms and sequence databases on the Medical College of Wisconsin Proteomics Center website (http://proteomics.mcw.edu/vipdac). PMID:19358578
Halligan, Brian D; Geiger, Joey F; Vallejos, Andrew K; Greene, Andrew S; Twigger, Simon N
2009-06-01
One of the major difficulties for many laboratories setting up proteomics programs has been obtaining and maintaining the computational infrastructure required for the analysis of the large flow of proteomics data. We describe a system that combines distributed cloud computing and open source software to allow laboratories to set up scalable virtual proteomics analysis clusters without the investment in computational hardware or software licensing fees. Additionally, the pricing structure of distributed computing providers, such as Amazon Web Services, allows laboratories or even individuals to have large-scale computational resources at their disposal at a very low cost per run. We provide detailed step-by-step instructions on how to implement the virtual proteomics analysis clusters as well as a list of current available preconfigured Amazon machine images containing the OMSSA and X!Tandem search algorithms and sequence databases on the Medical College of Wisconsin Proteomics Center Web site ( http://proteomics.mcw.edu/vipdac ).
Bhardwaj, Jyoti; Gangwar, Indu; Panzade, Ganesh; Shankar, Ravi; Yadav, Sudesh Kumar
2016-06-03
Inspired by the availability of a de novo transcriptome of horse gram (Macrotyloma uniflorum) and recent developments in systems biology studies, the first ever global protein-protein interactome (PPI) map was constructed for this highly drought-tolerant legume. Large-scale studies of PPIs and the constructed database would provide the rationale behind the interplay at cascading translational levels for drought stress-adaptive mechanisms in horse gram. Using a bidirectional approach (interolog and domain-based), a high-confidence interactome map and database for horse gram was constructed. Available transcriptomic information for shoot and root tissues of a sensitive (M-191; genotype 1) and a drought-tolerant (M-249; genotype 2) genotype of horse gram was utilized to draw comparative PPI subnetworks under drought stress. A total of 6804 high-confidence interactions were predicted among 1812 proteins, covering about one-fourth of the horse gram proteome. The highest number of interactions (33.86%) in the horse gram interactome matched Arabidopsis PPI data. The top five hub nodes mostly included ubiquitin and heat-shock-related proteins. Higher numbers of PPIs were found to be responsive in shoot tissue (416) and root tissue (2228) of genotype 2 compared with shoot tissue (136) and root tissue (579) of genotype 1. Characterization of PPIs using gene ontology analysis revealed that kinase and transferase activities involved in signal transduction, cellular processes, nucleocytoplasmic transport, protein ubiquitination, and localization of molecules were most responsive to drought stress. Hence, these could be framed in the stress-adaptive mechanisms of horse gram. Being the first legume global PPI map, it would provide new insights into gene and protein regulatory networks for drought stress tolerance mechanisms in horse gram. 
Information compiled in the form of database (MauPIR) will provide the much needed high-confidence systems biology information for horse gram genes, proteins, and involved processes. This information would ease the effort and increase the efficacy for similar studies on other legumes. Public access is available at http://14.139.59.221/MauPIR/ .
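The interolog half of the bidirectional approach described above can be illustrated with a minimal sketch: an interaction is predicted between two target proteins when their orthologs are known to interact in a reference species. The function name, the ortholog table and the reference interactions below are invented for illustration and are not taken from MauPIR; the domain-based half of the approach is not shown.

```python
from itertools import combinations


def predict_interologs(orthologs, reference_ppis):
    """Transfer reference interactions onto target proteins via orthology.

    orthologs: dict mapping target protein id -> reference ortholog id
    reference_ppis: iterable of (ref_a, ref_b) known interactions
    Returns a set of frozensets, each holding two target protein ids.
    """
    known = {frozenset(pair) for pair in reference_ppis}
    predicted = set()
    for a, b in combinations(sorted(orthologs), 2):
        # Predict a-b only if their orthologs interact in the reference.
        if frozenset((orthologs[a], orthologs[b])) in known:
            predicted.add(frozenset((a, b)))
    return predicted


# Hypothetical toy identifiers (not real accessions).
toy_orthologs = {"Mu_001": "AT_A", "Mu_002": "AT_B", "Mu_003": "AT_C"}
toy_reference = [("AT_A", "AT_B"), ("AT_B", "AT_D")]
toy_predicted = predict_interologs(toy_orthologs, toy_reference)
```

In this toy case only the Mu_001/Mu_002 pair is predicted, because AT_D has no ortholog in the target set. A real pipeline would presumably also filter by orthology confidence, which this sketch omits.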
Ma, Lu; Hatlen, Andrea; Kelly, Laura J; Becher, Hannes; Wang, Wencai; Kovarik, Ales; Leitch, Ilia J; Leitch, Andrew R
2015-09-02
The RNA-directed DNA methylation (RdDM) pathway can be divided into three phases: 1) small interfering RNA biogenesis, 2) de novo methylation, and 3) chromatin modification. To determine the degree of conservation of this pathway we searched for key genes among land plants. We used OrthoMCL and the OrthoMCL Viridiplantae database to analyze proteomes of species in bryophytes, lycophytes, monilophytes, gymnosperms, and angiosperms. We also analyzed small RNA size categories and, in two gymnosperms, cytosine methylation in ribosomal DNA. Six proteins were restricted to angiosperms, these being NRPD4/NRPE4, RDM1, DMS3 (defective in meristem silencing 3), SHH1 (SAWADEE homeodomain homolog 1), KTF1, and SUVR2, although we failed to find the latter three proteins in Fritillaria persica, a species with a giant genome. Small RNAs of 24 nt in length were abundant only in angiosperms. Phylogenetic analyses of Dicer-like (DCL) proteins showed that DCL2 was restricted to seed plants, although it was absent in Gnetum gnemon and Welwitschia mirabilis. The data suggest that phases (1) and (2) of the RdDM pathway, described for model angiosperms, evolved with angiosperms. The absence of some features of RdDM in F. persica may be associated with its large genome. Phase (3) is probably the most conserved part of the pathway across land plants. DCL2, involved in virus defense and interaction with the canonical RdDM pathway to facilitate methylation of CHH, is absent outside seed plants. Its absence in G. gnemon, and W. mirabilis coupled with distinctive patterns of CHH methylation, suggest a secondary loss of DCL2 following the divergence of Gnetales. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Pattin, Kristine A.; Moore, Jason H.
2009-01-01
One of the central goals of human genetics is the identification of loci with alleles or genotypes that confer increased susceptibility. The availability of dense maps of single-nucleotide polymorphisms (SNPs) along with high-throughput genotyping technologies has set the stage for routine genome-wide association studies that are expected to significantly improve our ability to identify susceptibility loci. Before this promise can be realized, there are some significant challenges that need to be addressed. We address here the challenge of detecting epistasis or gene-gene interactions in genome-wide association studies. Discovering epistatic interactions in high dimensional datasets remains a challenge due to the computational complexity resulting from the analysis of all possible combinations of SNPs. One potential way to overcome the computational burden of a genome-wide epistasis analysis would be to devise a logical way to prioritize the many SNPs in a dataset so that the data may be analyzed more efficiently and yet still retain important biological information. One of the strongest demonstrations of the functional relationship between genes is protein-protein interaction. Thus, it is plausible that the expert knowledge extracted from protein interaction databases may allow for a more efficient analysis of genome-wide studies as well as facilitate the biological interpretation of the data. In this review we will discuss the challenges of detecting epistasis in genome-wide genetic studies and the means by which we propose to apply expert knowledge extracted from protein interaction databases to facilitate this process. We explore some of the fundamentals of protein interactions and the databases that are publicly available. PMID:18551320
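The prioritization idea in this abstract can be sketched as follows, assuming each SNP has already been mapped to a gene and that a protein interaction database has been reduced to a list of gene pairs. Only SNP pairs whose genes encode interacting proteins are kept for epistasis testing. The function name, SNP ids and gene names are hypothetical; the review itself does not prescribe this exact procedure.

```python
from itertools import combinations
from math import comb


def prioritized_snp_pairs(snp_to_gene, ppi_edges):
    """Return SNP pairs whose mapped genes are linked in the PPI list."""
    interacting = {frozenset(edge) for edge in ppi_edges}
    pairs = []
    for s1, s2 in combinations(sorted(snp_to_gene), 2):
        g1, g2 = snp_to_gene[s1], snp_to_gene[s2]
        # Skip SNPs in the same gene; keep pairs backed by an interaction.
        if g1 != g2 and frozenset((g1, g2)) in interacting:
            pairs.append((s1, s2))
    return pairs


# Hypothetical toy data.
toy_snps = {"rs1": "GENE_A", "rs2": "GENE_B", "rs3": "GENE_C", "rs4": "GENE_B"}
toy_ppis = [("GENE_A", "GENE_B")]
kept = prioritized_snp_pairs(toy_snps, toy_ppis)

# Size of the exhaustive pairwise search for a 500,000-SNP panel.
exhaustive_pairs = comb(500_000, 2)
```

For scale, an exhaustive pairwise scan of 500,000 SNPs involves about 1.25 × 10^11 tests, which is the computational burden that this kind of filtering is meant to reduce. The sketch loops over all SNP pairs for clarity; at genome scale one would instead iterate over the interaction list and look up the SNPs in each gene.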
Kamal, Abu Hena M.; Fessler, Michael B.
2018-01-01
Macrophages are specialized phagocytes that play an essential role in inflammation, immunity, and tissue repair. Profiling the global proteomic response of macrophages to microbial molecules such as bacterial lipopolysaccharide is key to understanding fundamental mechanisms of inflammatory disease. Ethanol is a widely abused substance that has complex effects on inflammation. Reports have indicated that ethanol can activate or inhibit the lipopolysaccharide receptor, Toll-like Receptor 4, in different settings, with important consequences for liver and neurologic inflammation, but the underlying mechanisms are poorly understood. To profile the sequential effect of low dose ethanol and lipopolysaccharide on macrophages, a gel-free proteomic technique was applied to RAW 264.7 macrophages. Five hundred four differentially expressed proteins were identified and quantified with high confidence using ≥ 5 peptide spectral matches. Among these, 319 proteins were shared across all treatment conditions, and 69 proteins were exclusively identified in ethanol-treated or lipopolysaccharide-stimulated cells. The interactive impact of ethanol and lipopolysaccharide on the macrophage proteome was evaluated using bioinformatics tools, enabling identification of differentially responsive proteins, protein interaction networks, disease- and function-based networks, canonical pathways, and upstream regulators. Five candidate protein coding genes (PGM2, ISYNA1, PARP1, and PSAP) were further validated by qRT-PCR that mostly related to glucose metabolism and fatty acid synthesis pathways. Taken together, this study describes for the first time at a systems level the interaction between ethanol and lipopolysaccharide in the proteomic programming of macrophages, and offers new mechanistic insights into the biology that may underlie the impact of ethanol on infectious and inflammatory disease in humans. PMID:29481576
Transcriptome and proteomic analysis of mango (Mangifera indica Linn) fruits.
Wu, Hong-xia; Jia, Hui-min; Ma, Xiao-wei; Wang, Song-biao; Yao, Quan-sheng; Xu, Wen-tian; Zhou, Yi-gang; Gao, Zhong-shan; Zhan, Ru-lin
2014-06-13
Here we used Illumina RNA-seq technology for transcriptome sequencing of a mixed fruit sample from 'Zill' mango (Mangifera indica Linn) fruit pericarp and pulp during the development and ripening stages. RNA-seq generated 68,419,722 sequence reads that were assembled into 54,207 transcripts with a mean length of 858 bp, including 26,413 clusters and 27,794 singletons. A total of 42,515 (78.43%) transcripts were annotated using public protein databases, with a cut-off E-value of 10^-5, of which 35,198 and 14,619 transcripts were assigned to gene ontology terms and clusters of orthologous groups, respectively. Functional annotation against the Kyoto Encyclopedia of Genes and Genomes database identified 23,741 (43.79%) transcripts, which were mapped to 128 pathways. These pathways revealed many previously unknown transcripts. We also used the transcriptome data in a mass spectrometry-based analysis to characterize the proteome of ripe fruit. LC-MS/MS analysis of the mango fruit proteome was performed using tandem mass spectrometry (MS/MS) in an LTQ Orbitrap Velos (Thermo) coupled online to the HPLC. This approach enabled the identification of 7536 peptides that matched 2754 proteins. Our study provides a comprehensive sequence resource for a systemic view of the transcriptome during mango fruit development and the most comprehensive fruit proteome to date, which are useful for further genomics research and proteomic studies and a valuable reference for further research on gene expression and protein identification. This article is part of a Special Issue entitled: Proteomics of non-model organisms. Copyright © 2014 Elsevier B.V. All rights reserved.
Computational approaches to protein inference in shotgun proteomics
2012-01-01
Shotgun proteomics has recently emerged as a powerful approach to characterizing proteomes in biological samples. Its overall objective is to identify the form and quantity of each protein in a high-throughput manner by coupling liquid chromatography with tandem mass spectrometry. As a consequence of its high-throughput nature, shotgun proteomics faces challenges with respect to the analysis and interpretation of experimental data. Among such challenges, the identification of proteins present in a sample has been recognized as an important computational task. This task generally consists of (1) assigning experimental tandem mass spectra to peptides derived from a protein database, and (2) mapping assigned peptides to proteins and quantifying the confidence of identified proteins. Protein identification is fundamentally a statistical inference problem, and a number of methods have been proposed to address its challenges. In this review we categorize current approaches into rule-based, combinatorial optimization and probabilistic inference techniques, and present them using integer programming and Bayesian inference frameworks. We also discuss the main challenges of protein identification and propose potential solutions with the goal of spurring innovative research in this area. PMID:23176300
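Step (2) of the task above, mapping identified peptides to proteins, can be illustrated with a small sketch from the combinatorial optimization family. It uses a greedy set-cover heuristic to pick a small set of proteins that explains every identified peptide. This is a swapped-in approximation chosen for brevity, not the integer programming formulation the review presents, and the function name and toy identifiers are hypothetical. It also ignores confidence scoring.

```python
def greedy_parsimony(protein_to_peptides):
    """Greedy set cover: choose proteins until all peptides are explained.

    protein_to_peptides: dict mapping protein id -> set of peptide ids
    Returns the chosen protein ids in the order they were selected.
    """
    uncovered = set()
    for peptides in protein_to_peptides.values():
        uncovered |= peptides
    selected = []
    while uncovered:
        # Pick the protein explaining the most still-uncovered peptides;
        # ties go to the alphabetically first id (max keeps the first max).
        best = max(
            sorted(protein_to_peptides),
            key=lambda p: len(protein_to_peptides[p] & uncovered),
        )
        selected.append(best)
        uncovered -= protein_to_peptides[best]
    return selected


# Hypothetical toy mapping with shared (degenerate) peptides.
toy_map = {
    "P1": {"pepA", "pepB", "pepC"},
    "P2": {"pepB", "pepC"},
    "P3": {"pepD"},
    "P4": {"pepC", "pepD"},
}
toy_selection = greedy_parsimony(toy_map)
```

Here P2 is never selected because its peptides are a subset of those of P1. P3 and P4 are indistinguishable for explaining pepD once P1 is chosen, which shows why tie-breaking rules and protein grouping matter in practice.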
Durbin, Kenneth R.; Tran, John C.; Zamdborg, Leonid; Sweet, Steve M. M.; Catherman, Adam D.; Lee, Ji Eun; Li, Mingxi; Kellie, John F.; Kelleher, Neil L.
2011-01-01
Applying high-throughput Top-Down MS to an entire proteome requires a yet-to-be-established model for data processing. Since Top-Down is becoming possible on a large scale, we report our latest software pipeline dedicated to capturing the full value of intact protein data in automated fashion. For intact mass detection, we combine algorithms for processing MS1 data from both isotopically resolved (FT) and charge-state resolved (ion trap) LC-MS data, which are then linked to their fragment ions for database searching using ProSight. Automated determination of human keratin and tubulin isoforms is one result. Optimized for the intricacies of whole proteins, new software modules visualize proteome-scale data based on the LC retention time and intensity of intact masses and enable selective detection of PTMs to automatically screen for acetylation, phosphorylation, and methylation. Software functionality was demonstrated using comparative LC-MS data from yeast strains in addition to human cells undergoing chemical stress. We further these advances as a key aspect of realizing Top-Down MS on a proteomic scale. PMID:20848673
Histoplasma capsulatum proteome response to decreased iron availability
Winters, Michael S; Spellman, Daniel S; Chan, Qilin; Gomez, Francisco J; Hernandez, Margarita; Catron, Brittany; Smulian, Alan G; Neubert, Thomas A; Deepe, George S
2008-01-01
Background A fundamental pathogenic feature of the fungus Histoplasma capsulatum is its ability to evade innate and adaptive immune defenses. Once ingested by macrophages the organism is faced with several hostile environmental conditions including iron limitation. H. capsulatum can establish a persistent state within the macrophage. A gap in knowledge exists because the identities and number of proteins regulated by the organism under host conditions has yet to be defined. Lack of such knowledge is an important problem because until these proteins are identified it is unlikely that they can be targeted as new and innovative treatment for histoplasmosis. Results To investigate the proteomic response by H. capsulatum to decreasing iron availability we have created H. capsulatum protein/genomic databases compatible with current mass spectrometric (MS) search engines. Databases were assembled from the H. capsulatum G217B strain genome using gene prediction programs and expressed sequence tag (EST) libraries. Searching these databases with MS data generated from two dimensional (2D) in-gel digestions of proteins resulted in over 50% more proteins identified compared to searching the publicly available fungal databases alone. Using 2D gel electrophoresis combined with statistical analysis we discovered 42 H. capsulatum proteins whose abundance was significantly modulated when iron concentrations were lowered. Altered proteins were identified by mass spectrometry and database searching to be involved in glycolysis, the tricarboxylic acid cycle, lysine metabolism, protein synthesis, and one protein sequence whose function was unknown. Conclusion We have created a bioinformatics platform for H. capsulatum and demonstrated the utility of a proteomic approach by identifying a shift in metabolism the organism utilizes to cope with the hostile conditions provided by the host. 
We have shown that enzyme transcripts regulated by other fungal pathogens in response to lowering iron availability are also regulated in H. capsulatum at the protein level. We also identified H. capsulatum proteins sensitive to iron level reductions which have yet to be connected to iron availability in other pathogens. These data also indicate the complexity of the response by H. capsulatum to nutritional deprivation. Finally, we demonstrate the importance of a strain specific gene/protein database for H. capsulatum proteomic analysis. PMID:19108728
Maize-Pathogen Interactions: An Ongoing Combat from a Proteomics Perspective.
Pechanova, Olga; Pechan, Tibor
2015-11-30
Maize (Zea mays L.) is a host to numerous pathogenic species that cause serious diseases of its ear and foliage, negatively affecting the yield and the quality of the maize crop. A considerable amount of research has been carried out to elucidate mechanisms of maize-pathogen interactions, with the major goal of identifying defense-associated proteins. In this review, we summarize interactions of maize with its agriculturally important pathogens that were assessed at the proteome level. Employing differential analyses, such as the comparison of pathogen-resistant and susceptible maize varieties, as well as changes in maize proteomes after pathogen challenge, numerous proteins were identified as possible candidates in maize resistance. We describe findings of various research groups that used mainly mass spectrometry-based, high-throughput proteomic tools to investigate maize interactions with the fungal pathogens Aspergillus flavus, Fusarium spp., and Curvularia lunata, and the viral agents Rice Black-streaked Dwarf Virus and Sugarcane Mosaic Virus.
Maize-Pathogen Interactions: An Ongoing Combat from a Proteomics Perspective
Pechanova, Olga; Pechan, Tibor
2015-01-01
Maize (Zea mays L.) is a host to numerous pathogenic species that cause serious diseases of its ear and foliage, negatively affecting the yield and the quality of the maize crop. A considerable amount of research has been carried out to elucidate mechanisms of maize-pathogen interactions, with the major goal of identifying defense-associated proteins. In this review, we summarize interactions of maize with its agriculturally important pathogens that were assessed at the proteome level. Employing differential analyses, such as the comparison of pathogen-resistant and susceptible maize varieties, as well as changes in maize proteomes after pathogen challenge, numerous proteins were identified as possible candidates in maize resistance. We describe findings of various research groups that used mainly mass spectrometry-based, high-throughput proteomic tools to investigate maize interactions with the fungal pathogens Aspergillus flavus, Fusarium spp., and Curvularia lunata, and the viral agents Rice Black-streaked Dwarf Virus and Sugarcane Mosaic Virus. PMID:26633370
The 2018 Nucleic Acids Research database issue and the online molecular biology database collection.
Rigden, Daniel J; Fernández, Xosé M
2018-01-04
The 2018 Nucleic Acids Research Database Issue contains 181 papers spanning molecular biology. Among them, 82 are new and 84 are updates describing resources that appeared in the Issue previously. The remaining 15 cover databases most recently published elsewhere. Databases in the area of nucleic acids include 3DIV for visualisation of data on genome 3D structure and RNArchitecture, a hierarchical classification of RNA families. Protein databases include the established SMART, ELM and MEROPS while GPCRdb and the newcomer STCRDab cover families of biomedical interest. In the area of metabolism, HMDB and Reactome both report new features while PULDB appears in NAR for the first time. This issue also contains reports on genomics resources including Ensembl, the UCSC Genome Browser and ENCODE. Update papers from the IUPHAR/BPS Guide to Pharmacology and DrugBank are highlights of the drug and drug target section while a number of proteomics databases including proteomicsDB are also covered. The entire Database Issue is freely available online on the Nucleic Acids Research website (https://academic.oup.com/nar). The NAR online Molecular Biology Database Collection has been updated, reviewing 138 entries, adding 88 new resources and eliminating 47 discontinued URLs, bringing the current total to 1737 databases. It is available at http://www.oxfordjournals.org/nar/database/c/. © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research.
A proteomics study of barley powdery mildew haustoria.
Godfrey, Dale; Zhang, Ziguo; Saalbach, Gerhard; Thordal-Christensen, Hans
2009-06-01
A number of fungal and oomycete plant pathogens of major economic importance feed on their hosts by means of haustoria, which they place inside living plant cells. The underlying mechanisms are poorly understood, partly due to difficulty in preparing haustoria. We have therefore developed a procedure for isolating haustoria from the barley powdery mildew fungus (Blumeria graminis f.sp. hordei, Bgh). We subsequently aimed to understand the molecular mechanisms of haustoria through a study of their proteome. Extracted proteins were digested using trypsin, separated by LC, and analysed by MS/MS. Searches of a custom Bgh EST sequence database and the NCBI-NR fungal protein database, using the MS/MS data, identified 204 haustoria proteins. The majority of the proteins appear to have roles in protein metabolic pathways and biological energy production. Surprisingly, pyruvate decarboxylase (PDC), involved in alcoholic fermentation and commonly abundant in fungi and plants, was absent in our Bgh proteome data set. A sequence encoding this enzyme was also absent in our EST sequence database. Significantly, BLAST searches of the recently available Bgh genome sequence data also failed to identify a sequence encoding this enzyme, strongly indicating that Bgh does not have a gene for PDC.
Clinical veterinary proteomics: Techniques and approaches to decipher the animal plasma proteome.
Ghodasara, P; Sadowski, P; Satake, N; Kopp, S; Mills, P C
2017-12-01
Over the last two decades, technological advancements in the field of proteomics have advanced our understanding of the complex biological systems of living organisms. Techniques based on mass spectrometry (MS) have emerged as powerful tools to contextualise existing genomic information and to create quantitative protein profiles from plasma, tissues or cell lines of various species. Proteomic approaches have been used increasingly in veterinary science to investigate biological processes responsible for growth, reproduction and pathological events. However, the adoption of proteomic approaches by veterinary investigators lags behind that of researchers in the human medical field. Furthermore, in contrast to human proteomics studies, interpretation of veterinary proteomic data is difficult due to the limited protein databases available for many animal species. This review article examines the current use of advanced proteomics techniques for evaluation of animal health and welfare and covers the current status of clinical veterinary proteomics research, including successful protein identification and data interpretation studies. It includes a description of an emerging tool, sequential window acquisition of all theoretical fragment ion mass spectra (SWATH-MS), available on selected mass spectrometry instruments. This newly developed data acquisition technique combines advantages of discovery and targeted proteomics approaches, and thus has the potential to advance the veterinary proteomics field by enhancing identification and reproducibility of proteomics data. Copyright © 2017 Elsevier Ltd. All rights reserved.
Kirkwood, Kathryn J.; Ahmad, Yasmeen; Larance, Mark; Lamond, Angus I.
2013-01-01
Proteins form a diverse array of complexes that mediate cellular function and regulation. A largely unexplored feature of such protein complexes is the selective participation of specific protein isoforms and/or post-translationally modified forms. In this study, we combined native size-exclusion chromatography (SEC) with high-throughput proteomic analysis to characterize soluble protein complexes isolated from human osteosarcoma (U2OS) cells. Using this approach, we have identified over 71,500 peptides and 1,600 phosphosites, corresponding to over 8,000 proteins, distributed across 40 SEC fractions. This represents >50% of the predicted U2OS cell proteome, identified with a mean peptide sequence coverage of 27% per protein. Three biological replicates were performed, allowing statistical evaluation of the data and demonstrating a high degree of reproducibility in the SEC fractionation procedure. Specific proteins were detected interacting with multiple independent complexes, as typified by the separation of distinct complexes for the MRFAP1-MORF4L1-MRGBP interaction network. The data also revealed protein isoforms and post-translational modifications that selectively associated with distinct subsets of protein complexes. Surprisingly, there was clear enrichment for specific Gene Ontology terms associated with differential size classes of protein complexes. This study demonstrates that combined SEC/MS analysis can be used for the system-wide annotation of protein complexes and to predict potential isoform-specific interactions. All of these SEC data on the native separation of protein complexes have been integrated within the Encyclopedia of Proteome Dynamics, an online, multidimensional data-sharing resource available to the community. PMID:24043423
Kirkwood, Kathryn J; Ahmad, Yasmeen; Larance, Mark; Lamond, Angus I
2013-12-01
Proteins form a diverse array of complexes that mediate cellular function and regulation. A largely unexplored feature of such protein complexes is the selective participation of specific protein isoforms and/or post-translationally modified forms. In this study, we combined native size-exclusion chromatography (SEC) with high-throughput proteomic analysis to characterize soluble protein complexes isolated from human osteosarcoma (U2OS) cells. Using this approach, we have identified over 71,500 peptides and 1,600 phosphosites, corresponding to over 8,000 proteins, distributed across 40 SEC fractions. This represents >50% of the predicted U2OS cell proteome, identified with a mean peptide sequence coverage of 27% per protein. Three biological replicates were performed, allowing statistical evaluation of the data and demonstrating a high degree of reproducibility in the SEC fractionation procedure. Specific proteins were detected interacting with multiple independent complexes, as typified by the separation of distinct complexes for the MRFAP1-MORF4L1-MRGBP interaction network. The data also revealed protein isoforms and post-translational modifications that selectively associated with distinct subsets of protein complexes. Surprisingly, there was clear enrichment for specific Gene Ontology terms associated with differential size classes of protein complexes. This study demonstrates that combined SEC/MS analysis can be used for the system-wide annotation of protein complexes and to predict potential isoform-specific interactions. All of these SEC data on the native separation of protein complexes have been integrated within the Encyclopedia of Proteome Dynamics, an online, multidimensional data-sharing resource available to the community.
Unraveling snake venom complexity with 'omics' approaches: challenges and perspectives.
Zelanis, André; Tashima, Alexandre Keiji
2014-09-01
The study of snake venom proteomes (venomics) has seen a burst of reports; however, incomplete knowledge of the dynamic range of proteins present within a single venom and of the set of post-translational modifications (PTMs), as well as the lack of a comprehensive database of venom proteins, are among the main challenges in venomics research. The phenotypic plasticity of snake venom proteomes, together with their inherent toxin proteoform diversity, points to the use of integrative analysis in order to better understand their actual complexity. In this regard, such a systems venomics task should encompass the integration of data from transcriptomic and proteomic studies (especially the venom gland proteome), the identification of biological PTMs, and the estimation of artifactual proteomes and peptidomes generated by sample handling procedures. Copyright © 2014 Elsevier Ltd. All rights reserved.
Olinares, Paul Dominic B.; Ponnala, Lalit; van Wijk, Klaas J.
2010-01-01
To characterize MDa-sized macromolecular chloroplast stroma protein assemblies and to extend coverage of the chloroplast stroma proteome, we fractionated soluble chloroplast stroma in the non-denatured state by size exclusion chromatography with a size separation range up to ∼5 MDa. To maximize protein complex stability and resolution of megadalton complexes, ionic strength and composition were optimized. Subsequent high accuracy tandem mass spectrometry analysis (LTQ-Orbitrap) identified 1081 proteins across the complete native mass range. Protein complexes and assembly states above 0.8 MDa were resolved using hierarchical clustering, and protein heat maps were generated from normalized protein spectral counts for each of the size exclusion chromatography fractions; this complemented previous analysis of stromal complexes up to 0.8 MDa (Peltier, J. B., Cai, Y., Sun, Q., Zabrouskov, V., Giacomelli, L., Rudella, A., Ytterberg, A. J., Rutschow, H., and van Wijk, K. J. (2006) The oligomeric stromal proteome of Arabidopsis thaliana chloroplasts. Mol. Cell. Proteomics 5, 114–133). This combined experimental and bioinformatics analyses resolved chloroplast ribosomes in different assembly and functional states (e.g. 30, 50, and 70 S), which enabled the identification of plastid homologues of prokaryotic ribosome assembly factors as well as proteins involved in co-translational modifications, targeting, and folding. The roles of these ribosome-associating proteins will be discussed. Known RNA splice factors (e.g. CAF1/WTF1/RNC1) as well as uncharacterized proteins with RNA-binding domains (pentatricopeptide repeat, RNA recognition motif, and chloroplast ribosome maturation), RNases, and DEAD box helicases were found in various sized complexes. Chloroplast DNA (>3 MDa) was found in association with the complete heteromeric plastid-encoded DNA polymerase complex, and a dozen other DNA-binding proteins, e.g. DNA gyrase, topoisomerase, and various DNA repair enzymes. 
The heteromeric ≥5-MDa pyruvate dehydrogenase complex and the 0.8–1-MDa acetyl-CoA carboxylase complex associated with uncharacterized biotin carboxyl carrier domain proteins constitute the entry point to fatty acid metabolism in leaves; we suggest that their large size relates to the need for metabolic channeling. Protein annotations and identification data are available through the Plant Proteomics Database, and mass spectrometry data are available through Proteomics Identifications database. PMID:20423899
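The clustering of proteins by their elution behaviour across size exclusion chromatography fractions can be sketched as below. Each protein's spectral counts are normalized to a profile that sums to one, and profiles are compared by Pearson correlation, a similarity measure that could feed a hierarchical clustering step. The normalization by total count, the function names and the toy counts are assumptions for illustration; the abstract does not state the exact normalization or distance the authors used.

```python
from math import sqrt


def normalize_profile(counts):
    """Scale per-fraction spectral counts so the profile sums to 1."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def peak_fraction(counts):
    """Index of the fraction with the highest spectral count."""
    return max(range(len(counts)), key=counts.__getitem__)


# Hypothetical spectral counts over five fractions (large to small size).
toy_counts = {
    "ribosomal_a": [0, 10, 40, 10, 0],
    "ribosomal_b": [0, 5, 20, 5, 0],
    "enzyme_c": [30, 5, 0, 0, 0],
}
profiles = {name: normalize_profile(c) for name, c in toy_counts.items()}
same_complex = pearson(profiles["ribosomal_a"], profiles["ribosomal_b"])
different = pearson(profiles["ribosomal_a"], profiles["enzyme_c"])
```

The two toy ribosomal proteins co-elute and correlate perfectly despite different abundances, whereas the enzyme peaks in a different fraction and correlates negatively. Using 1 minus the correlation as a distance is one common way to pass such profiles to a clustering routine.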
Motohashi, Reiko; Rödiger, Anja; Agne, Birgit; Baerenfaller, Katja; Baginsky, Sacha
2012-01-01
Research interest in proteomics is increasingly shifting toward the reverse genetic characterization of gene function at the proteome level. In plants, several distinct gene defects perturb photosynthetic capacity, resulting in the loss of chlorophyll and an albino or pale-green phenotype. Because photosynthesis is interconnected with the entire plant metabolism and its regulation, all albino plants share common characteristics that are determined by the switch from autotrophic to heterotrophic growth. Reverse genetic characterizations of such plants often cannot distinguish specific consequences of a gene defect from generic effects in response to perturbations in photosynthetic capacity. Here, we set out to define common and specific features of protein accumulation in three different albino/pale-green plant lines. Using quantitative proteomics, we report a common molecular phenotype that connects the loss of photosynthetic capacity with other chloroplast and cellular functions, such as protein folding and stability, plastid protein import, and the expression of stress-related genes. Surprisingly, we do not find significant differences in the expression of key transcriptional regulators, suggesting that substantial regulation occurs at the posttranscriptional level. We examine the influence of different normalization schemes on the quantitative proteomics data and report all identified proteins along with their fold changes and P values in albino plants in comparison with the wild type. Our analysis provides initial guidance for the distinction between general and specific adaptations of the proteome in photosynthesis-impaired plants. PMID:23027667
Proteomics Analysis of Bladder Cancer Exosomes
Welton, Joanne L.; Khanna, Sanjay; Giles, Peter J.; Brennan, Paul; Brewis, Ian A.; Staffurth, John; Mason, Malcolm D.; Clayton, Aled
2010-01-01
Exosomes are nanometer-sized vesicles, secreted by various cell types, present in biological fluids that are particularly rich in membrane proteins. Ex vivo analysis of exosomes may provide biomarker discovery platforms and form non-invasive tools for disease diagnosis and monitoring. These vesicles have never before been studied in the context of bladder cancer, a major malignancy of the urological tract. We present the first proteomics analysis of bladder cancer cell exosomes. Using ultracentrifugation on a sucrose cushion, exosomes were highly purified from cultured HT1376 bladder cancer cells and verified as low in contaminants by Western blotting and flow cytometry of exosome-coated beads. Solubilization in a buffer containing SDS and DTT was essential for achieving proteomics analysis using an LC-MALDI-TOF/TOF MS approach. We report 353 high quality identifications with 72 proteins not previously identified by other human exosome proteomics studies. Overrepresentation analysis to compare this data set with previous exosome proteomics studies (using the ExoCarta database) revealed that the proteome was consistent with that of various exosomes with particular overlap with exosomes of carcinoma origin. Interrogating the Gene Ontology database highlighted a strong association of this proteome with carcinoma of bladder and other sites. The data also highlighted how homology among human leukocyte antigen haplotypes may confound MASCOT designation of major histocompatibility complex Class I nomenclature, requiring data from PCR-based human leukocyte antigen haplotyping to clarify anomalous identifications. Validation of 18 MS protein identifications (including basigin, galectin-3, trophoblast glycoprotein (5T4), and others) was performed by a combination of Western blotting, flotation on linear sucrose gradients, and flow cytometry, confirming their exosomal expression. Some were confirmed positive on urinary exosomes from a bladder cancer patient.
In summary, the exosome proteomics data set presented is of unrivaled quality. The data will aid in the development of urine exosome-based clinical tools for monitoring disease and will inform follow-up studies into varied aspects of exosome manufacture and function. PMID:20224111
Generation of comprehensive thoracic oncology database--tool for translational research.
Surati, Mosmi; Robinson, Matthew; Nandi, Suvobroto; Faoro, Leonardo; Demchuk, Carley; Kanteti, Rajani; Ferguson, Benjamin; Gangadhar, Tara; Hensing, Thomas; Hasina, Rifat; Husain, Aliya; Ferguson, Mark; Karrison, Theodore; Salgia, Ravi
2011-01-22
The Thoracic Oncology Program Database Project was created to serve as a comprehensive, verified, and accessible repository for well-annotated cancer specimens and clinical data to be available to researchers within the Thoracic Oncology Research Program. This database also captures a large volume of genomic and proteomic data obtained from various tumor tissue studies. A team of clinical and basic science researchers, a biostatistician, and a bioinformatics expert was convened to design the database. Variables of interest were clearly defined and their descriptions were written within a standard operating manual to ensure consistency of data annotation. Using a protocol for prospective tissue banking and another protocol for retrospective banking, tumor and normal tissue samples from patients consented to these protocols were collected. Clinical information such as demographics, cancer characterization, and treatment plans for these patients were abstracted and entered into an Access database. Proteomic and genomic data have been included in the database and have been linked to clinical information for patients described within the database. The data from each table were linked using the relationships function in Microsoft Access to allow the database manager to connect clinical and laboratory information during a query. The queried data can then be exported for statistical analysis and hypothesis generation.
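The linking of clinical and laboratory tables described above can be sketched with a small relational example. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module rather than Microsoft Access, and every table name, column name, and value below is hypothetical and does not come from the Thoracic Oncology Program database.

```python
import sqlite3

# Hypothetical schema: table names, column names, and values are
# illustrative placeholders, not taken from the actual database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patients (
        patient_id INTEGER PRIMARY KEY,
        diagnosis  TEXT
    );
    CREATE TABLE proteomic_results (
        result_id  INTEGER PRIMARY KEY,
        patient_id INTEGER REFERENCES patients(patient_id),
        protein    TEXT,
        abundance  REAL
    );
""")

conn.executemany(
    "INSERT INTO patients VALUES (?, ?)",
    [(1, "placeholder diagnosis A"), (2, "placeholder diagnosis B")],
)
conn.executemany(
    "INSERT INTO proteomic_results VALUES (?, ?, ?, ?)",
    [(10, 1, "protein_X", 1.5), (11, 2, "protein_X", 0.7)],
)

# The shared patient_id key plays the role of a table relationship:
# one query returns clinical and laboratory fields side by side.
rows = conn.execute("""
    SELECT p.patient_id, p.diagnosis, r.protein, r.abundance
    FROM patients AS p
    JOIN proteomic_results AS r ON r.patient_id = p.patient_id
    ORDER BY p.patient_id
""").fetchall()

for row in rows:
    print(row)
```

The joined rows could then be exported, for example to CSV, for the statistical analysis the abstract mentions.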
Utility of proteomics in obstetric disorders: a review
Hernández-Núñez, Jónathan; Valdés-Yong, Magel
2015-01-01
The study of proteomics could explain many aspects of obstetric disorders. We undertook this review with the aim of assessing the utility of proteomics in the specialty of obstetrics. We searched the electronic databases of MEDLINE, EBSCOhost, BVS Bireme, and SciELO, using various search terms with the assistance of a librarian. We considered cohort studies, case-control studies, case series, and systematic review articles published until October 2014 in the English or Spanish language, and evaluated their quality and the internal validity of the evidence provided. Two reviewers extracted the data independently and then jointly revised the data to arrive at a consensus. The search retrieved 1,158 papers, of which 965 were excluded for being duplicates, not relevant, or unrelated studies. A further 86 papers were excluded for being guidelines, protocols, or case reports, along with another 64 that did not contain relevant information, leaving 43 studies for inclusion. Many of these studies showed the utility of proteomic techniques for prediction, pathophysiology, diagnosis, management, monitoring, and prognosis of pre-eclampsia, perinatal infection, premature rupture of membranes, preterm birth, intrauterine growth restriction, and ectopic pregnancy. Proteomic techniques have enormous clinical significance and constitute an invaluable weapon in the management of obstetric disorders that increase maternal and perinatal morbidity and mortality. PMID:25926758
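The study-selection counts reported in this abstract can be checked with a short tally. The sketch below uses only the numbers stated in the text; the variable names and the dictionary labels are illustrative.

```python
# Study-selection counts as stated in the abstract.
retrieved = 1158
excluded = {
    "duplicates, not relevant, or unrelated": 965,
    "guidelines, protocols, or case reports": 86,
    "no relevant information": 64,
}

# Subtract each exclusion step in turn and report the running total.
remaining = retrieved
for reason, count in excluded.items():
    remaining -= count
    print(f"excluded {count:>4} ({reason}); {remaining} remain")

included = remaining
print(f"included studies: {included}")
```

The running total ends at 43, which agrees with the number of included studies given in the abstract.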
Mu, Jun; Yang, Yongtao; Chen, Jin; Cheng, Ke; Li, Qi; Wei, Yongdong; Zhu, Dan; Shao, Weihua; Zheng, Peng; Xie, Peng
2015-10-30
Tuberculous meningitis (TBM) remains one of the most deadly infectious diseases. The pathogen interacts with the host immune system through a process that is largely unknown. Various cellular processes of Mycobacterium tuberculosis (MTB) center on lipid metabolism. To identify lipid metabolism-related proteins, a quantitative proteomic study was performed to identify differential proteins in the cerebrospinal fluid (CSF) obtained from TBM patients (n = 12) and healthy controls (n = 12). CSF samples were desalted, concentrated, labelled with isobaric tags for relative and absolute quantitation (iTRAQ™), and analyzed by multi-dimensional liquid chromatography-tandem mass spectrometry (LC-MS/MS). Gene ontology and proteomic phenotyping analysis of the differential proteins were conducted using Database for Annotation, Visualization, and Integrated Discovery (DAVID) Bioinformatics Resources. ApoE and ApoB were selected for validation by ELISA. Proteomic phenotyping indicated that 4 differential proteins were involved in lipid metabolism. ELISA showed significantly increased ApoB levels in TBM subjects compared to healthy controls. Area under the receiver operating characteristic curve analysis demonstrated ApoB levels could distinguish TBM subjects from healthy controls and viral meningitis subjects with 89.3% sensitivity and 92% specificity. CSF lipid metabolism dysregulation, especially elevated expression of ApoB, gives insights into the pathogenesis of TBM. Further evaluation of these findings in larger studies including anti-tuberculosis medicated and unmedicated patient cohorts with other central nervous system infectious diseases is required for successful clinical translation. Copyright © 2015 Elsevier Inc. All rights reserved.
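For readers less familiar with the two diagnostic measures quoted above, the following sketch shows how sensitivity and specificity are computed from a confusion table. The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the study's data, and the function names are illustrative.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of diseased subjects that the test correctly flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-diseased subjects that the test correctly clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only (not from the abstract).
tp, fn = 9, 1   # diseased subjects: correctly detected / missed
tn, fp = 8, 2   # control subjects: correctly cleared / falsely flagged

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

In a receiver operating characteristic analysis, each candidate ApoB cut-off yields one such pair of values; the figures in the abstract correspond to the cut-off the authors selected.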
Newborn mouse lens proteome and its alteration by lysine 6 mutant ubiquitin
USDA-ARS's Scientific Manuscript database
Ubiquitin is a tag that often initiates degradation of proteins by the proteasome in the ubiquitin proteasome system. Targeted expression of K6W mutant ubiquitin (K6W-Ub) in the lens results in defects in lens development and cataract formation, suggesting critical functions for ubiquitin in lens. T...
Wang, Guifeng; Wang, Gang; Wang, Jiajia; Du, Yulong; Yao, Dongsheng; Shuai, Bilian; Han, Liang; Tang, Yuanping; Song, Rentao
2016-12-01
Prolamins, the major cereal seed storage proteins, are sequestered and accumulated in the lumen of the endoplasmic reticulum (ER), and are directly assembled into protein bodies (PBs). The content and composition of prolamins are the key determinants for protein quality and texture-related traits of the grain. Concomitantly, the PB-inducing fusion system provides an efficient target to produce therapeutic and industrial products in plants. However, the proteome of the native PB and the detailed mechanisms underlying its formation still need to be determined. We developed a method to isolate highly purified and intact PBs from developing maize endosperm and conducted proteomic analysis of intact PBs of zein, a class of prolamin protein found in maize. We thus identified 1756 proteins, which fall into five major categories: metabolic pathways, response to stimulus, transport, development, and growth, as well as regulation. By comparing the proteomes of crude and enriched extractions of PBs, we found substantial evidence for the following conclusions: (i) ribosomes, ER membranes, and the cytoskeleton are tightly associated with zein PBs, which form the peripheral border; (ii) zein RNAs are probably transported and localized to the PB-ER subdomain; and (iii) ER chaperones are essential for zein folding, quality control, and assembly into PBs. We further confirmed that OPAQUE1 (O1) cannot directly interact with FLOURY1 (FL1) in yeast, suggesting that the interaction between myosins XI and DUF593-containing proteins is isoform-specific. This study provides a proteomic roadmap for dissecting zein PB biogenesis and reveals an unexpected diversity and complexity of proteins in PBs. © The Author 2016. Published by Oxford University Press on behalf of the Society for Experimental Biology.
Responses of Nannochloropsis oceanica IMET1 to Long-Term Nitrogen Starvation and Recovery
Dong, Hong-Po; Williams, Ernest; Wang, Da-zhi; Xie, Zhang-Xian; Hsia, Ru-ching; Jenck, Alizée; Halden, Rolf; Li, Jing; Chen, Feng; Place, Allen R.
2013-01-01
The Nannochloropsis genus contains oleaginous microalgae that have served as model systems for developing renewable biodiesel. Recent genomic and transcriptomic studies on Nannochloropsis species have provided insights into the regulation of lipid production in response to nitrogen stress. Previous studies have focused on the responses of Nannochloropsis species to short-term nitrogen stress, but the effect of long-term nitrogen deprivation remains largely unknown. In this study, physiological and proteomic approaches were combined to understand the mechanisms by which Nannochloropsis oceanica IMET1 is able to endure long-term nitrate deprivation and its ability to recover homeostasis when nitrogen is amended. Changes of the proteome during chronic nitrogen starvation espoused the physiological changes observed, and there was a general trend toward recycling nitrogen and storage of lipids. This was evidenced by a global down-regulation of protein expression, a retained expression of proteins involved in glycolysis and the synthesis of fatty acids, as well as an up-regulation of enzymes used in nitrogen scavenging and protein turnover. Also, lipid accumulation and autophagy of plastids may play a key role in maintaining cell vitality. Following the addition of nitrogen, there were proteomic changes and metabolic changes observed within 24 h, which resulted in a return of the culture to steady state within 4 d. These results demonstrate the ability of N. oceanica IMET1 to recover from long periods of nitrate deprivation without apparent detriment to the culture and provide proteomic markers for genetic modification. PMID:23637339
Bensaddek, Dalila; Narayan, Vikram; Nicolas, Armel; Murillo, Alejandro Brenes; Gartner, Anton; Kenyon, Cynthia J; Lamond, Angus I
2016-02-01
Proteomics studies typically analyze proteins at a population level, using extracts prepared from tens of thousands to millions of cells. The resulting measurements correspond to average values across the cell population and can mask considerable variation in protein expression and function between individual cells or organisms. Here, we report the development of micro-proteomics for the analysis of Caenorhabditis elegans, a eukaryote composed of 959 somatic cells and ∼1500 germ cells, measuring the worm proteome at a single organism level to a depth of ∼3000 proteins. This includes detection of proteins across a wide dynamic range of expression levels (>6 orders of magnitude), including many chromatin-associated factors involved in chromosome structure and gene regulation. We apply the micro-proteomics workflow to measure the global proteome response to heat-shock in individual nematodes. This shows variation between individual animals in the magnitude of proteome response following heat-shock, including variable induction of heat-shock proteins. The micro-proteomics pipeline thus facilitates the investigation of stochastic variation in protein expression between individuals within an isogenic population of C. elegans. All data described in this study are available online via the Encyclopedia of Proteome Dynamics (http://www.peptracker.com/epd), an open access, searchable database resource. © 2015 The Authors. PROTEOMICS Published by Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
dbHiMo: a web-based epigenomics platform for histone-modifying enzymes
Choi, Jaeyoung; Kim, Ki-Tae; Huh, Aram; Kwon, Seomun; Hong, Changyoung; Asiegbu, Fred O.; Jeon, Junhyun; Lee, Yong-Hwan
2015-01-01
Over the past two decades, epigenetics has evolved into a key concept for understanding regulation of gene expression. Among many epigenetic mechanisms, covalent modifications such as acetylation and methylation of lysine residues on core histones emerged as a major mechanism in epigenetic regulation. Here, we present the database for histone-modifying enzymes (dbHiMo; http://hme.riceblast.snu.ac.kr/) aimed at facilitating functional and comparative analysis of histone-modifying enzymes (HMEs). HMEs were identified by applying a search pipeline built upon profile hidden Markov model (HMM) to proteomes. The database incorporates 11 576 HMEs identified from 603 proteomes including 483 fungal, 32 plants and 51 metazoan species. The dbHiMo provides users with web-based personalized data browsing and analysis tools, supporting comparative and evolutionary genomics. With comprehensive data entries and associated web-based tools, our database will be a valuable resource for future epigenetics/epigenomics studies. Database URL: http://hme.riceblast.snu.ac.kr/ PMID:26055100
Glytsou, Christina; Calvo, Enrique; Cogliati, Sara; Mehrotra, Arpit; Anastasia, Irene; Rigoni, Giovanni; Raimondi, Andrea; Shintani, Norihito; Loureiro, Marta; Vazquez, Jesùs; Pellegrini, Luca; Enriquez, Jose Antonio; Scorrano, Luca; Soriano, Maria Eugenia
2016-12-13
The mitochondrial contact site and cristae organizing system (MICOS) and Optic atrophy 1 (OPA1) control cristae shape, thus affecting mitochondrial function and apoptosis. Whether and how they physically and functionally interact is unclear. Here, we provide evidence that OPA1 is epistatic to MICOS in the regulation of cristae shape. Proteomic analysis identifies multiple MICOS components in native OPA1-containing high molecular weight complexes disrupted during cristae remodeling. MIC60, a core MICOS protein, physically interacts with OPA1, and together, they control cristae junction number and stability, OPA1 being epistatic to MIC60. OPA1 defines cristae width and junction diameter independently of MIC60. Our combination of proteomics, biochemistry, genetics, and electron tomography provides a unifying model for mammalian cristae biogenesis by OPA1 and MICOS. Copyright © 2016 The Author(s). Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Nur Afifah, Diana; Rustanti, Ninik; Anjani, Gemala; Syah, Dahrul; Yanti; Suhartono, Maggy T.
2017-02-01
This paper presents a proteomics study comprising the separation, identification, and characterization of proteins. The experiment was conducted on extracellular fibrinolytic proteases from Bacillus licheniformis RO3 and Bacillus pumilus 2.g, isolated from the Indonesian fermented foods red oncom and tempeh gembus. The experimental work comprised the following steps: (1) a combination of one- and two-dimensional electrophoresis analysis, (2) mass spectrometry analysis using MALDI-TOF-MS, and (3) a protein database search. The results suggested two new protein fractions from B. licheniformis RO3 and three from B. pumilus 2.g. These results have not been previously reported.
Defining the proteome of human iris, ciliary body, retinal pigment epithelium, and choroid.
Zhang, Pingbo; Kirby, David; Dufresne, Craig; Chen, Yan; Turner, Randi; Ferri, Sara; Edward, Deepak P; Van Eyk, Jennifer E; Semba, Richard D
2016-04-01
The iris is a fine structure that controls the amount of light that enters the eye. The ciliary body controls the shape of the lens and produces aqueous humor. The retinal pigment epithelium and choroid (RPE/choroid) are essential in supporting the retina and absorbing light energy that enters the eye. Proteins were extracted from iris, ciliary body, and RPE/choroid tissues of eyes from five individuals and fractionated using SDS-PAGE. After in-gel digestion, peptides were analyzed using LC-MS/MS on an Orbitrap Elite mass spectrometer. In iris, ciliary body, and RPE/choroid, we identified 2959, 2867, and 2755 nonredundant proteins with peptide and protein false-positive rates of <0.1% and <1%, respectively. Forty-three unambiguous protein isoforms were identified in iris, ciliary body, and RPE/choroid. Four "missing proteins" were identified in ciliary body based on ≥2 proteotypic peptides. The mass spectrometric proteome database of the human iris, ciliary body, and RPE/choroid may serve as a valuable resource for future investigations of the eye in health and disease. The MS proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifiers PXD001424 and PXD002194. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Kim, Young-Ha; Islam, Mohammad Saiful; You, Myung-Jo
2015-01-01
Proteomic tools allow large-scale, high-throughput analyses for the detection, identification, and functional investigation of the proteome. For detection of antigens from Haemaphysalis longicornis, a 1-dimensional electrophoresis (1-DE) quantitative immunoblotting technique combined with 2-dimensional electrophoresis (2-DE) immunoblotting was used for whole body proteins from unfed and partially fed female ticks. Reactivity bands and 2-DE immunoblotting were performed following 2-DE electrophoresis to identify protein spots. The proteome of the partially fed female had a larger number of lower molecular weight proteins than that of the unfed female tick. The total number of detected spots was 818 for unfed and 670 for partially fed female ticks. The 2-DE immunoblotting identified 10 antigenic spots from unfed females and 8 antigenic spots from partially fed females. Matrix Assisted Laser Desorption Ionization-Time of Flight Mass Spectrometry (MALDI-TOF) of relevant spots identified calreticulin, putative secreted WC salivary protein, and a conserved hypothetical protein from the National Center for Biotechnology Information and Swiss Prot protein sequence databases. These findings indicate that most of the whole body components of these ticks are non-immunogenic. The data reported here will provide guidance in the identification of antigenic proteins to prevent infestation and diseases transmitted by H. longicornis. PMID:25748713
A standardized framing for reporting protein identifications in mzIdentML 1.2
Seymour, Sean L.; Farrah, Terry; Binz, Pierre-Alain; Chalkley, Robert J.; Cottrell, John S.; Searle, Brian C.; Tabb, David L.; Vizcaíno, Juan Antonio; Prieto, Gorka; Uszkoreit, Julian; Eisenacher, Martin; Martínez-Bartolomé, Salvador; Ghali, Fawaz; Jones, Andrew R.
2015-01-01
Inferring which protein species have been detected in bottom-up proteomics experiments has been a challenging problem for which solutions have been maturing over the past decade. While many inference approaches now function well in isolation, comparing and reconciling the results generated across different tools remains difficult. It presently stands as one of the greatest barriers in collaborative efforts such as the Human Proteome Project and public repositories like the PRoteomics IDEntifications (PRIDE) database. Here we present a framework for reporting protein identifications that seeks to improve capabilities for comparing results generated by different inference tools. This framework standardizes the terminology for describing protein identification results, associated with the HUPO-Proteomics Standards Initiative (PSI) mzIdentML standard, while still allowing for differing methodologies to reach that final state. It is proposed that developers of software for reporting identification results will adopt this terminology in their outputs. While the new terminology does not require any changes to the core mzIdentML model, it represents a significant change in practice, and, as such, the rules will be released via a new version of the mzIdentML specification (version 1.2) so that consumers of files are able to determine whether the new guidelines have been adopted by export software. PMID:25092112
Proteomic identification of fat-browning markers in cultured white adipocytes treated with curcumin.
Kim, Sang Woo; Choi, Jae Heon; Mukherjee, Rajib; Hwang, Ki-Chul; Yun, Jong Won
2016-04-01
We previously reported that curcumin induces browning of primary white adipocytes via enhanced expression of brown adipocyte-specific genes. In this study, we attempted to identify target proteins responsible for this fat-browning effect by analyzing proteomic changes in cultured white adipocytes in response to curcumin treatment. To elucidate the role of curcumin in fat-browning, we conducted comparative proteomic analysis of primary adipocytes between control and curcumin-treated cells using two-dimensional electrophoresis combined with MALDI-TOF-MS. We also investigated fatty acid metabolic targets, mitochondrial biogenesis, and fat-browning-associated proteins using combined proteomic and network analyses. Proteomic analysis revealed that 58 protein spots from a total of 325 matched spots showed differential expression between control and curcumin-treated adipocytes. Using network analysis, most of the identified proteins were proven to be involved in various metabolic and cellular processes based on the PANTHER classification system. One of the most striking findings is that hormone-sensitive lipase (HSL) was highly correlated with main browning markers based on the STRING database. HSL and two browning markers (UCP1, PGC-1α) were co-immunoprecipitated with these markers, suggesting that HSL possibly plays a role in fat-browning of white adipocytes. Our results suggest that curcumin increased HSL levels and other browning-specific markers, suggesting its possible role in augmentation of lipolysis and suppression of lipogenesis by trans-differentiation from white adipocytes into brown adipocytes (beige).
The SH2 domain interaction landscape.
Tinti, Michele; Kiemer, Lars; Costa, Stefano; Miller, Martin L; Sacco, Francesca; Olsen, Jesper V; Carducci, Martina; Paoluzi, Serena; Langone, Francesca; Workman, Christopher T; Blom, Nikolaj; Machida, Kazuya; Thompson, Christopher M; Schutkowski, Mike; Brunak, Søren; Mann, Matthias; Mayer, Bruce J; Castagnoli, Luisa; Cesareni, Gianni
2013-04-25
Members of the SH2 domain family modulate signal transduction by binding to short peptides containing phosphorylated tyrosines. Each domain displays a distinct preference for the sequence context of the phosphorylated residue. We have developed a high-density peptide chip technology that allows for probing of the affinity of most SH2 domains for a large fraction of the entire complement of tyrosine phosphopeptides in the human proteome. Using this technique, we have experimentally identified thousands of putative SH2-peptide interactions for more than 70 different SH2 domains. By integrating this rich data set with orthogonal context-specific information, we have assembled an SH2-mediated probabilistic interaction network, which we make available as a community resource in the PepspotDB database. A predicted dynamic interaction between the SH2 domains of the tyrosine phosphatase SHP2 and the phosphorylated tyrosine in the extracellular signal-regulated kinase activation loop was validated by experiments in living cells. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
Caterino, Marianna; Ruoppolo, Margherita; Fulcoli, Gabriella; Huynth, Tuong; Orrù, Stefania; Baldini, Antonio; Salvatore, Francesco
2009-01-01
TBX1 haploinsufficiency is considered a major contributor to the del22q11.2/DiGeorge syndrome (DGS) phenotype. We have used proteomic tools to look at all the major proteins involved in the TBX1-mediated pathways in an attempt to better understand the molecular interactions instrumental to its cellular functions. We found more than 90 proteins that could be targeted by TBX1 through different mechanisms. The most interesting observation is that overexpression of TBX1 results in down-regulation of two proteins involved in retinoic acid metabolism. PMID:19178302
Feng, Yun; Li, Xiaomin; Zou, Chenggang; Xu, Jianping; Ren, Yan; Mi, Qili; Wu, Junli; Liu, Shuqun; Liu, Yu; Huang, Xiaowei; Wang, Haiyan; Niu, Xuemei; Li, Juan; Liang, Lianming; Luo, Yanlu; Ji, Kaifang; Zhou, Wei; Yu, Zefen; Li, Guohong; Liu, Yajun; Li, Lei; Qiao, Min; Feng, Lu; Zhang, Ke-Qin
2011-01-01
Nematode-trapping fungi are “carnivorous” and attack their hosts using specialized trapping devices. The morphological development of these traps is the key indicator of their switch from saprophytic to predacious lifestyles. Here, the genome of the nematode-trapping fungus Arthrobotrys oligospora Fres. (ATCC24927) was reported. The genome contains 40.07 Mb assembled sequence with 11,479 predicted genes. Comparative analysis showed that A. oligospora shared many more genes with pathogenic fungi than with non-pathogenic fungi. Specifically, compared to several sequenced ascomycete fungi, the A. oligospora genome has a larger number of pathogenicity-related genes in the subtilisin, cellulase, cellobiohydrolase, and pectinesterase gene families. Searching against the pathogen-host interaction gene database identified 398 homologous genes involved in pathogenicity in other fungi. The analysis of repetitive sequences provided evidence for repeat-induced point mutations in A. oligospora. Proteomic and quantitative PCR (qPCR) analyses revealed that 90 genes were significantly up-regulated at the early stage of trap-formation by nematode extracts and most of these genes were involved in translation, amino acid metabolism, carbohydrate metabolism, cell wall and membrane biogenesis. Based on the combined genomic, proteomic and qPCR data, a model for the formation of nematode trapping device in this fungus was proposed. In this model, multiple fungal signal transduction pathways are activated by its nematode prey to further regulate downstream genes associated with diverse cellular processes such as energy metabolism, biosynthesis of the cell wall and adhesive proteins, cell division, glycerol accumulation and peroxisome biogenesis. This study will facilitate the identification of pathogenicity-related genes and provide a broad foundation for understanding the molecular and evolutionary mechanisms underlying fungi-nematodes interactions. PMID:21909256
Yang, Jinkui; Wang, Lei; Ji, Xinglai; Feng, Yun; Li, Xiaomin; Zou, Chenggang; Xu, Jianping; Ren, Yan; Mi, Qili; Wu, Junli; Liu, Shuqun; Liu, Yu; Huang, Xiaowei; Wang, Haiyan; Niu, Xuemei; Li, Juan; Liang, Lianming; Luo, Yanlu; Ji, Kaifang; Zhou, Wei; Yu, Zefen; Li, Guohong; Liu, Yajun; Li, Lei; Qiao, Min; Feng, Lu; Zhang, Ke-Qin
2011-09-01
Nematode-trapping fungi are "carnivorous" and attack their hosts using specialized trapping devices. The morphological development of these traps is the key indicator of their switch from saprophytic to predacious lifestyles. Here, the genome of the nematode-trapping fungus Arthrobotrys oligospora Fres. (ATCC24927) was reported. The genome contains 40.07 Mb assembled sequence with 11,479 predicted genes. Comparative analysis showed that A. oligospora shared many more genes with pathogenic fungi than with non-pathogenic fungi. Specifically, compared to several sequenced ascomycete fungi, the A. oligospora genome has a larger number of pathogenicity-related genes in the subtilisin, cellulase, cellobiohydrolase, and pectinesterase gene families. Searching against the pathogen-host interaction gene database identified 398 homologous genes involved in pathogenicity in other fungi. The analysis of repetitive sequences provided evidence for repeat-induced point mutations in A. oligospora. Proteomic and quantitative PCR (qPCR) analyses revealed that 90 genes were significantly up-regulated at the early stage of trap-formation by nematode extracts and most of these genes were involved in translation, amino acid metabolism, carbohydrate metabolism, cell wall and membrane biogenesis. Based on the combined genomic, proteomic and qPCR data, a model for the formation of nematode trapping device in this fungus was proposed. In this model, multiple fungal signal transduction pathways are activated by its nematode prey to further regulate downstream genes associated with diverse cellular processes such as energy metabolism, biosynthesis of the cell wall and adhesive proteins, cell division, glycerol accumulation and peroxisome biogenesis. This study will facilitate the identification of pathogenicity-related genes and provide a broad foundation for understanding the molecular and evolutionary mechanisms underlying fungi-nematodes interactions.
Budayeva, Hanna G; Cristea, Ileana M
2016-10-01
Human sirtuin 2 (SIRT2) is an NAD+-dependent deacetylase that primarily functions in the cytoplasm, where it can regulate α-tubulin acetylation levels. SIRT2 is linked to cancer progression, neurodegeneration, and infection with bacteria or viruses. However, the current knowledge about its interactions and the means through which it exerts its functions has remained limited. Here, we aimed to gain a better understanding of its cellular functions by characterizing SIRT2 subcellular localization, the identity and relative stability of its protein interactions, and its impact on the proteome of primary human fibroblasts. To assess the relative stability of SIRT2 interactions, we used immunoaffinity purification in conjunction with both label-free and metabolic labeling quantitative mass spectrometry. In addition to the expected associations with cytoskeleton proteins, including its known substrate TUBA1A, our results reveal that SIRT2 specifically interacts with proteins functioning in membrane trafficking, secretory processes, and transcriptional regulation. By quantifying their relative stability, we found most interactions to be transient, indicating a dynamic SIRT2 environment. We discover that SIRT2 localizes to the ER-Golgi intermediate compartment (ERGIC), and that this recruitment requires an intact ER-Golgi trafficking pathway. Further expanding these findings, we used microscopy and interaction assays to establish the interaction and coregulation of SIRT2 with liprin-β1 scaffolding protein (PPFiBP1), a protein with roles in focal adhesions disassembly. As SIRT2 functions may be accomplished via interactions, enzymatic activity, and transcriptional regulation, we next assessed the impact of SIRT2 levels on the cellular proteome. SIRT2 knockdown led to changes in the levels of proteins functioning in membrane trafficking, including some of its interaction partners.
Altogether, our study expands the knowledge of SIRT2 cytoplasmic functions to define a previously unrecognized involvement in intracellular trafficking pathways, which may contribute to its roles in cellular homeostasis and human diseases. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
Budayeva, Hanna G.; Cristea, Ileana M.
2016-01-01
Human sirtuin 2 (SIRT2) is an NAD+-dependent deacetylase that primarily functions in the cytoplasm, where it can regulate α-tubulin acetylation levels. SIRT2 is linked to cancer progression, neurodegeneration, and infection with bacteria or viruses. However, the current knowledge about its interactions and the means through which it exerts its functions has remained limited. Here, we aimed to gain a better understanding of its cellular functions by characterizing SIRT2 subcellular localization, the identity and relative stability of its protein interactions, and its impact on the proteome of primary human fibroblasts. To assess the relative stability of SIRT2 interactions, we used immunoaffinity purification in conjunction with both label-free and metabolic labeling quantitative mass spectrometry. In addition to the expected associations with cytoskeleton proteins, including its known substrate TUBA1A, our results reveal that SIRT2 specifically interacts with proteins functioning in membrane trafficking, secretory processes, and transcriptional regulation. By quantifying their relative stability, we found most interactions to be transient, indicating a dynamic SIRT2 environment. We discover that SIRT2 localizes to the ER-Golgi intermediate compartment (ERGIC), and that this recruitment requires an intact ER-Golgi trafficking pathway. Further expanding these findings, we used microscopy and interaction assays to establish the interaction and coregulation of SIRT2 with liprin-β1 scaffolding protein (PPFiBP1), a protein with roles in focal adhesions disassembly. As SIRT2 functions may be accomplished via interactions, enzymatic activity, and transcriptional regulation, we next assessed the impact of SIRT2 levels on the cellular proteome. SIRT2 knockdown led to changes in the levels of proteins functioning in membrane trafficking, including some of its interaction partners. 
Altogether, our study expands the knowledge of SIRT2 cytoplasmic functions to define a previously unrecognized involvement in intracellular trafficking pathways, which may contribute to its roles in cellular homeostasis and human diseases. PMID:27503897
Evidence for a strong sulfur-aromatic interaction derived from crystallographic data.
Zauhar, R J; Colbert, C L; Morgan, R S; Welsh, W J
2000-03-01
We have uncovered new evidence for a significant interaction between divalent sulfur atoms and aromatic rings. Our study involves a statistical analysis of interatomic distances and other geometric descriptors derived from entries in the Cambridge Crystallographic Database (F. H. Allen and O. Kennard, Chem. Design Auto. News, 1993, Vol. 8, pp. 1 and 31-37). A set of descriptors was defined sufficient in number and type so as to elucidate completely the preferred geometry of interaction between six-membered aromatic carbon rings and divalent sulfurs for all crystal structures of nonmetal-bearing organic compounds present in the database. In order to test statistical significance, analogous probability distributions for the interaction of the moiety X-CH(2)-X with aromatic rings were computed, and taken a priori to correspond to the null hypothesis of no significant interaction. Tests of significance were carried out pairwise between probability distributions of sulfur-aromatic interaction descriptors and their CH(2)-aromatic analogues using the Smirnov-Kolmogorov nonparametric test (W. W. Daniel, Applied Nonparametric Statistics, Houghton-Mifflin: Boston, New York, 1978, pp. 276-286), and in all cases significance at the 99% confidence level or better was observed. Local maxima of the probability distributions were used to define a preferred geometry of interaction between the divalent sulfur moiety and the aromatic ring. Molecular mechanics studies were performed in an effort to better understand the physical basis of the interaction. This study confirms observations based on statistics of interaction of amino acids in protein crystal structures (R. S. Morgan, C. E. Tatsch, R. H. Gushard, J. M. McAdon, and P. K. Warme, International Journal of Peptide Protein Research, 1978, Vol. 11, pp. 209-217; R. S. Morgan and J. M. McAdon, International Journal of Peptide Protein Research, 1980, Vol. 15, pp. 177-180; K. S. C. Reid, P. F. Lindley, and J. M. Thornton, FEBS Letters, 1985, Vol. 190, pp. 209-213), as well as studies involving molecular mechanics (G. Nemethy and H. A. Scheraga, Biochemistry and Biophysics Research Communications, 1981, Vol. 98, pp. 482-487) and quantum chemical calculations (B. V. Cheney, M. W. Schulz, and J. Cheney, Biochimica Biophysica Acta, 1989, Vol. 996, pp. 116-124; J. Pranata, Bioorganic Chemistry, 1997, Vol. 25, pp. 213-219), all of which point to the possible importance of the sulfur-aromatic interaction. However, the preferred geometry of the interaction, as determined from our analysis of the small-molecule crystal data, differs significantly from that found by other approaches. Copyright 2000 John Wiley & Sons, Inc.
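The pairwise significance testing described in this abstract can be illustrated with a two-sample Kolmogorov-Smirnov test. The following is a minimal pure-Python sketch, not the authors' code: the function name, the sample distances, and the use of the asymptotic p-value series are all assumptions made for illustration.

```python
import math
from bisect import bisect_right


def ks_two_sample(a, b):
    """Return (D, p) for a two-sample Kolmogorov-Smirnov test.

    D is the largest gap between the two empirical CDFs. The p-value uses
    the asymptotic Kolmogorov series, so it is approximate for small samples.
    """
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    # The maximum CDF difference is always reached at one of the data points.
    d = max(abs(bisect_right(a, x) / n - bisect_right(b, x) / m) for x in a + b)
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        # The alternating series converges poorly here; p is essentially 1.
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


# Hypothetical descriptor values (e.g. distances in angstroms), illustration only.
sulfur_ring = [3.6, 3.7, 3.8, 3.9, 4.0, 4.1, 4.2, 4.3]
ch2_ring = [4.4, 4.6, 4.8, 5.0, 5.2, 5.4, 5.6, 5.8]
d_stat, p_value = ks_two_sample(sulfur_ring, ch2_ring)
significant_at_99 = p_value < 0.01
```

A small p-value means the sulfur-aromatic distribution differs from the CH(2)-aromatic reference distribution, which the abstract takes as the null model.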
Comparative bioinformatics analyses and profiling of lysosome-related organelle proteomes
NASA Astrophysics Data System (ADS)
Hu, Zhang-Zhi; Valencia, Julio C.; Huang, Hongzhan; Chi, An; Shabanowitz, Jeffrey; Hearing, Vincent J.; Appella, Ettore; Wu, Cathy
2007-01-01
Complete and accurate profiling of cellular organelle proteomes, while challenging, is important for the understanding of detailed cellular processes at the organelle level. Mass spectrometry technologies coupled with bioinformatics analysis provide an effective approach for protein identification and functional interpretation of organelle proteomes. In this study, we have compiled human organelle reference datasets from large-scale proteomic studies and protein databases for seven lysosome-related organelles (LROs), as well as the endoplasmic reticulum and mitochondria, for comparative organelle proteome analysis. Heterogeneous sources of human organelle proteins and rodent homologs are mapped to human UniProtKB protein entries based on ID and/or peptide mappings, followed by functional annotation and categorization using the iProXpress proteomic expression analysis system. Cataloging organelle proteomes allows close examination of both shared and unique proteins among various LROs and reveals their functional relevance. The proteomic comparisons show that LROs are a closely related family of organelles. The shared proteins indicate the dynamic and hybrid nature of LROs, while the unique transmembrane proteins may represent additional candidate marker proteins for LROs. This comparative analysis, therefore, provides a basis for hypothesis formulation and experimental validation of organelle proteins and their functional roles.
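The comparison of shared and unique proteins across organelle datasets reduces to set operations once every entry is mapped to a common identifier. The sketch below assumes that mapping has already been done; the function name, organelle labels, and accessions are hypothetical and do not come from iProXpress.

```python
from collections import Counter


def shared_and_unique(organelle_sets):
    """Split proteins into those found in several organelles and those found in one."""
    counts = Counter(acc for proteins in organelle_sets.values()
                     for acc in set(proteins))
    shared = {acc for acc, n in counts.items() if n > 1}
    unique = {name: {acc for acc in set(proteins) if counts[acc] == 1}
              for name, proteins in organelle_sets.items()}
    return shared, unique


# Hypothetical accessions, for illustration only.
example = {
    "melanosome": {"P00001", "P00002", "P00003"},
    "platelet dense granule": {"P00002", "P00004"},
    "lytic granule": {"P00002", "P00003", "P00005"},
}
shared, unique = shared_and_unique(example)
```

Proteins in `shared` correspond to the overlap the abstract interprets as evidence of a related organelle family, while each `unique` set holds the candidate marker proteins for one organelle.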
Lavallée-Adam, Mathieu
2017-01-01
PSEA-Quant analyzes quantitative mass spectrometry-based proteomics datasets to identify enrichments of annotations contained in repositories such as the Gene Ontology and Molecular Signature databases. It allows users to identify the annotations that are significantly enriched for reproducibly quantified high abundance proteins. PSEA-Quant is available on the web and as a command-line tool. It is compatible with all label-free and isotopic labeling-based quantitative proteomics methods. This protocol describes how to use PSEA-Quant and interpret its output. The importance of each parameter as well as troubleshooting approaches are also discussed. PMID:27010334
Gaspari, Marco; Chiesa, Luca; Nicastri, Annalisa; Gabriele, Caterina; Harper, Valeria; Britti, Domenico; Cuda, Giovanni; Procopio, Antonio
2016-12-06
The ability of tandem mass spectrometry to determine the primary structure of proteolytic peptides can be exploited to trace back the organisms from which the corresponding proteins were extracted. This information can be important when food products, such as protein powders, can be supplemented with lower-quality starting materials. In order to dissect the origin of proteinaceous material composing a given unknown mixture, a two-step database search strategy for bottom-up nanoscale liquid chromatography-tandem mass spectrometry (nanoLC-MS/MS) data was implemented. A single nanoLC-MS/MS analysis was sufficient not only to determine the qualitative composition of the mixtures under examination, but also to assess the relative percent composition of the various proteomes, if dedicated calibration curves were previously generated. The approach of two-step database search for qualitative analysis and proteome total ion current (pTIC) calculation for quantitative analysis was applied to several binary and ternary mixtures which mimic the composition of milk replacers typically used in calf feeding.
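The quantitative step can be sketched as follows. The abstract does not define the pTIC calculation in detail, so this example assumes it is the ion current of all spectra assigned to one proteome, summed and expressed as a fraction of the total. The function name and input layout are hypothetical, and the calibration-curve step the authors mention is not modeled.

```python
from collections import defaultdict


def ptic_fractions(psms):
    """Sum ion current per proteome and return each proteome's share of the total.

    `psms` is an iterable of (proteome_label, ion_current) pairs, one per
    identified spectrum. This layout is an assumption for illustration.
    """
    totals = defaultdict(float)
    for proteome, ion_current in psms:
        totals[proteome] += ion_current
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no ion current to apportion")
    return {proteome: value / grand_total for proteome, value in totals.items()}


# Hypothetical spectra from a binary mixture, illustration only.
fractions = ptic_fractions([("bovine", 2.0), ("soy", 1.0), ("bovine", 1.0)])
```

In practice the raw fractions would still need to be converted to percent composition with calibration curves generated from mixtures of known composition, as the abstract notes.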
Proteome analysis of bell pepper (Capsicum annuum L.) chromoplasts.
Siddique, Muhammad Asim; Grossmann, Jonas; Gruissem, Wilhelm; Baginsky, Sacha
2006-12-01
We report a comprehensive proteome analysis of chromoplasts from bell pepper (Capsicum annuum L.). The combination of a novel strategy for database-independent detection of proteins from tandem mass spectrometry (MS/MS) data with standard database searches allowed us to identify 151 proteins with a high level of confidence. These include several well-known plastid proteins but also novel proteins that were not previously reported from other plastid proteome studies. The majority of the identified proteins are active in plastid carbohydrate and amino acid metabolism. Among the most abundant individual proteins are capsanthin/capsorubin synthase and fibrillin, which are involved in the synthesis and storage of carotenoids that accumulate to high levels in chromoplasts. The relative abundances of the identified chromoplast proteins differ remarkably compared with their abundances in other plastid types, suggesting a chromoplast-specific metabolic network. Our results provide an overview of the major metabolic pathways active in chromoplasts and extend existing knowledge about prevalent metabolic activities of different plastid types.
Irazusta, Verónica; Bernal, Anahí Romina; Estévez, María Cristina; de Figueroa, Lucía I C
2018-02-01
Cyberlindnera jadinii M9 and Wickerhamomyces anomalus M10, isolated from textile-dye liquid effluents, have shown the capacity for chromium detoxification via Cr(VI) biological reduction. The aim of the study was to evaluate the effect of hexavalent chromium on synthesis of novel and/or specific proteins involved in chromium tolerance and reduction in response to chromium overload in two indigenous yeasts. A study was carried out following a proteomic approach with W. anomalus M10 and Cy. jadinii M9 strains. For this, protein extracts from total cell extracts, membranes and mitochondria were analyzed. When Cr(VI) was added to the culture medium there was an over-synthesis of 39 proteins involved in different metabolic pathways. In both strains, chromium supplementation changed protein biosynthesis by upregulating proteins involved in stress response, methionine metabolism, energy production, protein degradation and novel oxidoreductase enzymes. Moreover, we observed that Cy. jadinii M9 and W. anomalus M10 displayed the ability to activate superoxide dismutase, catalase and chromate reductase activity. Two enzymes from the total cell extracts, type II nitroreductase (Frm2) and flavoprotein wrbA (Ycp4), were identified as possibly responsible for inducing crude chromate-reductase activity in the cytoplasm of W. anomalus M10 under chromium overload. In Cy. jadinii M9, mitochondrial ferredoxin-NADP reductase (Yah1) and membrane FAD flavoprotein (Lpd1) were identified as probably involved in Cr(VI) reduction. To our knowledge, this is the first study proposing chromate reductase activity of these four enzymes in yeast and reporting a relationship between protein synthesis, enzymatic response and chromium biospeciation in Cy. jadinii and W. anomalus. Copyright © 2017 Elsevier Inc. All rights reserved.
Database Search Engines: Paradigms, Challenges and Solutions.
Verheggen, Kenneth; Martens, Lennart; Berven, Frode S; Barsnes, Harald; Vaudel, Marc
2016-01-01
The first step in identifying proteins from mass spectrometry based shotgun proteomics data is to infer peptides from tandem mass spectra, a task generally achieved using database search engines. In this chapter, the basic principles of database search engines are introduced with a focus on open source software, and the use of database search engines is demonstrated using the freely available SearchGUI interface. This chapter also discusses how to tackle general issues related to sequence database searching and shows how to minimize their impact.
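The core idea of a database search engine can be illustrated with its first filtering step: digest the protein sequences in silico and keep peptides whose mass matches the measured precursor mass. This is a simplified sketch under stated assumptions, not how SearchGUI or any specific engine is implemented. The function names, the 10 ppm tolerance, and the minimum peptide length are illustrative, and fragment-ion scoring is omitted entirely.

```python
import re

# Monoisotopic residue masses in daltons.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565


def tryptic_peptides(protein, min_len=6):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return [p for p in pieces if len(p) >= min_len]


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


def candidates(precursor_mass, proteins, tol_ppm=10.0):
    """Return (protein, peptide, mass) for peptides within tol_ppm of the precursor."""
    hits = []
    for name, sequence in proteins.items():
        for peptide in tryptic_peptides(sequence):
            mass = peptide_mass(peptide)
            if abs(mass - precursor_mass) / mass * 1e6 <= tol_ppm:
                hits.append((name, peptide, mass))
    return hits
```

A real engine would then score each candidate against the observed fragment spectrum and estimate a false discovery rate, typically with a decoy database.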
Crasto, Chiquito J.; Marenco, Luis N.; Liu, Nian; Morse, Thomas M.; Cheung, Kei-Hoi; Lai, Peter C.; Bahl, Gautam; Masiar, Peter; Lam, Hugo Y.K.; Lim, Ernest; Chen, Huajin; Nadkarni, Prakash; Migliore, Michele; Miller, Perry L.; Shepherd, Gordon M.
2009-01-01
This article presents the latest developments in neuroscience information dissemination through the SenseLab suite of databases: NeuronDB, CellPropDB, ORDB, OdorDB, OdorMapDB, ModelDB and BrainPharm. These databases include information related to: (i) neuronal membrane properties and neuronal models, and (ii) genetics, genomics, proteomics and imaging studies of the olfactory system. We describe here: the new features for each database, the evolution of SenseLab’s unifying database architecture and instances of SenseLab database interoperation with other neuroscience online resources. PMID:17510162
Pelassa, Ilaria; Fiumara, Ferdinando
2015-01-01
Homopolymeric amino acids repeats (AARs), which are widespread in proteomes, have often been viewed simply as spacers between protein domains, or even as “junk” sequences with no obvious function but with a potential to cause harm upon expansion as in genetic diseases associated with polyglutamine or polyalanine expansions, including Huntington disease and cleidocranial dysplasia. A growing body of evidence indicates however that at least some AARs can form organized, functional protein structures, and can regulate protein function. In particular, certain AARs can mediate protein-protein interactions, either through homotypic AAR-AAR contacts or through heterotypic contacts with other protein domains. It is still unclear however, whether AARs may have a generalized, proteome-wide role in shaping protein-protein interaction networks. Therefore, we have undertaken here a bioinformatics screening of the human proteome and interactome in search of quantitative evidence of such a role. We first identified the sets of proteins that contain repeats of any one of the 20 amino acids, as well as control sets of proteins chosen at random in the proteome. We then analyzed the connectivity between the proteins of the AAR-containing protein sets and we compared it with that observed in the corresponding control networks. We find evidence for different degrees of connectivity in the different AAR-containing protein networks. Indeed, networks of proteins containing polyglutamine, polyglutamate, polyproline, and other AARs show significantly increased levels of connectivity, whereas networks containing polyleucine and other hydrophobic repeats show lower degrees of connectivity. Furthermore, we observed that numerous protein-protein, -nucleic acid, and -lipid interaction domains are significantly enriched in specific AAR protein groups. 
These findings support the notion of a generalized, combinatorial role of AARs, together with conventional protein interaction domains, in shaping the interaction networks of the human proteome, and define proteome-wide knowledge that may guide the informed biological exploration of the role of AARs in protein interactions. PMID:26734058
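The first screening step, finding proteins that contain a homopolymeric repeat of a given amino acid, can be sketched with a regular expression. The abstract does not state the minimum repeat length used, so the default of five residues below is an assumption, as are the function names and the example sequences.

```python
import re


def find_repeats(sequence, min_run=5):
    """Return (residue, start, length) for each run of >= min_run identical residues."""
    pattern = re.compile(r"(.)\1{%d,}" % (min_run - 1))
    return [(m.group(1), m.start(), len(m.group(0)))
            for m in pattern.finditer(sequence)]


def group_by_repeat(proteins, min_run=5):
    """Map each repeated residue to the set of proteins that contain such a repeat."""
    groups = {}
    for name, sequence in proteins.items():
        for residue, _start, _length in find_repeats(sequence, min_run):
            groups.setdefault(residue, set()).add(name)
    return groups


# Hypothetical sequences, for illustration only.
groups = group_by_repeat({
    "protA": "MAQQQQQQPLLLK",
    "protB": "MSPPPPPGAAAAAAT",
    "protC": "MKTAYIAKQR",
})
```

Each resulting group can then be compared with size-matched random protein sets to test whether its members are more or less connected in the interactome, which is the comparison the abstract describes.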
A new reference implementation of the PSICQUIC web service.
del-Toro, Noemi; Dumousseau, Marine; Orchard, Sandra; Jimenez, Rafael C; Galeota, Eugenia; Launay, Guillaume; Goll, Johannes; Breuer, Karin; Ono, Keiichiro; Salwinski, Lukasz; Hermjakob, Henning
2013-07-01
The Proteomics Standard Initiative Common QUery InterfaCe (PSICQUIC) specification was created by the Human Proteome Organization Proteomics Standards Initiative (HUPO-PSI) to enable computational access to molecular-interaction data resources by means of a standard Web Service and query language. Currently providing >150 million binary interaction evidences from 28 servers globally, the PSICQUIC interface allows the concurrent search of multiple molecular-interaction information resources using a single query. Here, we present an extension of the PSICQUIC specification (version 1.3), which has been released to be compliant with the enhanced standards in molecular interactions. The new release also includes a new reference implementation of the PSICQUIC server available to the data providers. It offers augmented web service capabilities and improves the user experience. PSICQUIC has been running for almost 5 years, with a user base growing from only 4 data providers to 28 (April 2013) allowing access to 151 310 109 binary interactions. The power of this web service is shown in PSICQUIC View web application, an example of how to simultaneously query, browse and download results from the different PSICQUIC servers. This application is free and open to all users with no login requirement (http://www.ebi.ac.uk/Tools/webservices/psicquic/view/main.xhtml).
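Results from PSICQUIC services are commonly returned in the tab-delimited PSI-MI TAB (MITAB) format. The sketch below parses one line, assuming the 15-column MITAB 2.5 layout. The dictionary keys are my own labels rather than official field names, and the sample record is made up for illustration.

```python
MITAB25_COLUMNS = [
    "id_a", "id_b", "alt_id_a", "alt_id_b", "alias_a", "alias_b",
    "detection_method", "first_author", "publication", "taxid_a", "taxid_b",
    "interaction_type", "source_db", "interaction_id", "confidence",
]


def parse_mitab25(line):
    """Parse one MITAB 2.5 line into a dict of lists.

    "-" marks an empty field and "|" separates multiple values in a field.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < len(MITAB25_COLUMNS):
        raise ValueError("expected at least 15 tab-separated fields")
    record = dict(zip(MITAB25_COLUMNS, fields))
    return {key: ([] if value == "-" else value.split("|"))
            for key, value in record.items()}


# Made-up record, for illustration only.
sample = "\t".join([
    "uniprotkb:P00001", "uniprotkb:P00002", "-", "-", "-", "-",
    'psi-mi:"MI:0006"(anti bait coimmunoprecipitation)', "-", "-",
    "taxid:9606(human)", "taxid:9606(human)",
    'psi-mi:"MI:0915"(physical association)', "-", "-", "-",
])
record = parse_mitab25(sample)
```

Because every compliant server returns the same column layout, one parser like this can process results from all providers queried at once, which is the interoperability benefit the abstract emphasizes.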
Hare, Nathan J; Solis, Nestor; Harmer, Christopher; Marzook, N Bishara; Rose, Barbara; Harbour, Colin; Crossett, Ben; Manos, Jim; Cordwell, Stuart J
2012-01-22
Pseudomonas aeruginosa is an opportunistic pathogen that is the major cause of morbidity and mortality in patients with cystic fibrosis (CF). While most CF patients are thought to acquire P. aeruginosa from the environment, person-person transmissible strains have been identified in CF clinics worldwide. The molecular basis for transmissibility and colonization of the CF lung remains poorly understood. A dual proteomics approach consisting of gel-based and gel-free comparisons was undertaken to analyse protein profiles in a transmissible, early (acute) isolate of the Australian epidemic strain 1 (AES-1R), the virulent burns/wound isolate PA14, and the poorly virulent, laboratory-associated strain PAO1. Over 1700 P. aeruginosa proteins were confidently identified. AES-1R protein profiles revealed elevated abundance of proteins associated with virulence and siderophore biosynthesis and acquisition, antibiotic resistance and lipopolysaccharide and fatty acid biosynthesis. The most abundant protein in AES-1R was confirmed as a previously hypothetical protein with sequence similarity to carbohydrate-binding proteins and database search revealed this gene is only found in the CF-associated strain PA2192. The link with CF infection may suggest that transmissible strains have acquired an ability to rapidly interact with host mucosal glycoproteins. Our data suggest that AES-1R expresses higher levels of proteins, such as those involved in antibiotic resistance, iron acquisition and virulence that may provide a competitive advantage during early infection in the CF lung. Identification of novel proteins associated with transmissibility and acute infection may aid in deciphering new strategies for intervention to limit P. aeruginosa infections in CF patients. PMID:22264352
The role of targeted chemical proteomics in pharmacology
Sutton, Chris W
2012-01-01
Traditionally, proteomics is the high-throughput characterization of the global complement of proteins in a biological system using cutting-edge technologies (robotics and mass spectrometry) and bioinformatics tools (Internet-based search engines and databases). As the field of proteomics has matured, a diverse range of strategies have evolved to answer specific problems. Chemical proteomics is one such direction that provides the means to enrich and detect less abundant proteins (the ‘hidden’ proteome) from complex mixtures of wide dynamic range (the ‘deep’ proteome). In pharmacology, chemical proteomics has been utilized to determine the specificity of drugs and their analogues, for anticipated known targets, only to discover other proteins that bind and could account for side effects observed in preclinical and clinical trials. As a consequence, chemical proteomics provides a valuable accessory in refinement of second- and third-generation drug design for treatment of many diseases. However, determining definitive affinity capture of proteins by a drug immobilized on soft gel chromatography matrices has highlighted some of the challenges that remain to be addressed. Examples of the different strategies that have emerged using well-established drugs against pharmaceutically important enzymes, such as protein kinases, metalloproteases, PDEs, cytochrome P450s, etc., indicate the potential opportunity to employ chemical proteomics as an early-stage screening approach in the identification of new targets. PMID:22074351
Alberio, Tiziana; Pieroni, Luisa; Ronci, Maurizio; Banfi, Cristina; Bongarzone, Italia; Bottoni, Patrizia; Brioschi, Maura; Caterino, Marianna; Chinello, Clizia; Cormio, Antonella; Cozzolino, Flora; Cunsolo, Vincenzo; Fontana, Simona; Garavaglia, Barbara; Giusti, Laura; Greco, Viviana; Lucacchini, Antonio; Maffioli, Elisa; Magni, Fulvio; Monteleone, Francesca; Monti, Maria; Monti, Valentina; Musicco, Clara; Petrosillo, Giuseppe; Porcelli, Vito; Saletti, Rosaria; Scatena, Roberto; Soggiu, Alessio; Tedeschi, Gabriella; Zilocchi, Mara; Roncada, Paola; Urbani, Andrea; Fasano, Mauro
2017-12-01
The Mitochondrial Human Proteome Project aims at understanding the function of the mitochondrial proteome and its crosstalk with the proteome of other organelles. Being able to choose a suitable and validated enrichment protocol of functional mitochondria, based on the specific needs of the downstream proteomics analysis, would greatly help the researchers in the field. Mitochondrial fractions from ten model cell lines were prepared using three enrichment protocols and analyzed on seven different LC-MS/MS platforms. All data were processed using neXtProt as reference database. The data are available for the Human Proteome Project purposes through the ProteomeXchange Consortium with the identifier PXD007053. The processed data sets were analyzed using a suite of R routines to perform a statistical analysis and to retrieve subcellular and submitochondrial localizations. Although the overall number of identified total and mitochondrial proteins was not significantly dependent on the enrichment protocol, specific line to line differences were observed. Moreover, the protein lists were mapped to a network representing the functional mitochondrial proteome, encompassing mitochondrial proteins and their first interactors. More than 80% of the identified proteins resulted in nodes of this network but with a different ability in coisolating mitochondria-associated structures for each enrichment protocol/cell line pair.
Pre-fractionation strategies to resolve pea (Pisum sativum) sub-proteomes
Meisrimler, Claudia-Nicole; Menckhoff, Ljiljana; Kukavica, Biljana M.; Lüthje, Sabine
2015-01-01
Legumes are important crop plants and pea (Pisum sativum L.) has been investigated as a model with respect to several physiological aspects. The sequencing of the pea genome has not been completed. Therefore, proteomic approaches are currently limited. Nevertheless, the increasing numbers of available EST-databases as well as the high homology of the pea and medicago genome (Medicago truncatula Gaertner) allow the successful identification of proteins. Due to the un-sequenced pea genome, pre-fractionation approaches have been used in pea proteomic surveys in the past. Aside from a number of selective proteome studies on crude extracts and the chloroplast, few studies have targeted other components such as the pea secretome, an important sub-proteome of interest due to its role in abiotic and biotic stress processes. The secretome itself can be further divided into different sub-proteomes (plasma membrane, apoplast, cell wall proteins). Cell fractionation in combination with different gel-electrophoresis, chromatography methods and protein identification by mass spectrometry are important partners to gain insight into pea sub-proteomes, post-translational modifications and protein functions. Overall, pea proteomics needs to link numerous existing physiological and biochemical data to gain further insight into adaptation processes, which play important roles in field applications. Future developments and directions in pea proteomics are discussed. PMID:26539198
Curated protein information in the Saccharomyces genome database.
Hellerstedt, Sage T; Nash, Robert S; Weng, Shuai; Paskov, Kelley M; Wong, Edith D; Karra, Kalpana; Engel, Stacia R; Cherry, J Michael
2017-01-01
Due to recent advancements in the production of experimental proteomic data, the Saccharomyces genome database (SGD; www.yeastgenome.org) has been expanding our protein curation activities to make new data types available to our users. Because of broad interest in post-translational modifications (PTM) and their importance to protein function and regulation, we have recently started incorporating expertly curated PTM information on individual protein pages. Here we also present the inclusion of new abundance and protein half-life data obtained from high-throughput proteome studies. These new data types have been included with the aim of facilitating cellular biology research. Database URL: www.yeastgenome.org. © The Author(s) 2017. Published by Oxford University Press.
Parsons, Harriet T.; Christiansen, Katy; Knierim, Bernhard; Carroll, Andrew; Ito, Jun; Batth, Tanveer S.; Smith-Moritz, Andreia M.; Morrison, Stephanie; McInerney, Peter; Hadi, Masood Z.; Auer, Manfred; Mukhopadhyay, Aindrila; Petzold, Christopher J.; Scheller, Henrik V.; Loqué, Dominique; Heazlewood, Joshua L.
2012-01-01
The plant Golgi plays a pivotal role in the biosynthesis of cell wall matrix polysaccharides, protein glycosylation, and vesicle trafficking. Golgi-localized proteins have become prospective targets for reengineering cell wall biosynthetic pathways for the efficient production of biofuels from plant cell walls. However, proteomic characterization of the Golgi has so far been limited, owing to the technical challenges inherent in Golgi purification. In this study, a combination of density centrifugation and surface charge separation techniques has allowed the reproducible isolation of Golgi membranes from Arabidopsis (Arabidopsis thaliana) at sufficiently high purity levels for in-depth proteomic analysis. Quantitative proteomic analysis, immunoblotting, enzyme activity assays, and electron microscopy all confirm high purity levels. A composition analysis indicated that approximately 19% of proteins were likely derived from contaminating compartments and ribosomes. The localization of 13 newly assigned proteins to the Golgi using transient fluorescent markers further validated the proteome. A collection of 371 proteins consistently identified in all replicates has been proposed to represent the Golgi proteome, marking an appreciable advancement in numbers of Golgi-localized proteins. A significant proportion of proteins likely involved in matrix polysaccharide biosynthesis were identified. The potential within this proteome for advances in understanding Golgi processes has been demonstrated by the identification and functional characterization of the first plant Golgi-resident nucleoside diphosphatase, using a yeast complementation assay. Overall, these data show key proteins involved in primary cell wall synthesis and include a mixture of well-characterized and unknown proteins whose biological roles and importance as targets for future research can now be realized. PMID:22430844
Tear film proteome in age-related macular degeneration.
Winiarczyk, Mateusz; Kaarniranta, Kai; Winiarczyk, Stanisław; Adaszek, Łukasz; Winiarczyk, Dagmara; Mackiewicz, Jerzy
2018-06-01
Age-related macular degeneration (AMD) is the main cause of blindness in elderly people in developed countries. Current screening protocols have limitations in detecting the early signs of retinal degeneration. Therefore, it would be desirable to find novel biomarkers for early detection of AMD. Development of novel biomarkers would help in the prevention, diagnostics, and treatment of AMD. Proteomic analysis of tear film has shown promise in this research area. If an optimal set of biomarkers could be obtained from accessible body fluids, it would represent a reliable way to monitor disease progression and response to novel therapies. Tear films were collected on Schirmer strips from a total of 22 patients (8 with wet AMD, 6 with dry AMD, and 8 control individuals). 2D electrophoresis was used to separate tear film proteins prior to their identification with a matrix-assisted laser desorption/ionization time-of-flight spectrometer (MALDI-TOF/TOF) and matching with functional databases. A total of 342 proteins were identified. Most of them were previously described in various proteomic studies concerning AMD. Shootin-1, histatin-3, fidgetin-like protein 1, SRC kinase signaling inhibitor, Graves disease carrier protein, actin cytoplasmic 1, prolactin-inducible protein 1, and protein S100-A7A were upregulated in the tear film samples isolated from AMD patients and were not previously linked with this disease in any proteomic analysis. The upregulated proteins supplement our current knowledge of AMD pathogenesis, providing evidence that certain specific proteins are expressed into the tear film in AMD. As far as we are aware, this is the first study to have undertaken a comprehensive in-depth analysis of the human tear film proteome in AMD patients.
Ferrari, Cibele Santos; Amaral, Fernanda Plucani; Bueno, Jessica Cavalheiro Ferreira; Scariot, Mirella Christine; Valentim-Neto, Pedro Alexandre; Arisi, Ana Carolina Maisonnave
2014-11-01
Several molecular tools have been used to clarify the basis of plant-bacteria interaction; however, the mechanism behind the association is still unclear. In this study, we used a proteomic approach to investigate the root proteome of Zea mays (cv. DKB240) inoculated with Herbaspirillum seropedicae strain SmR1 grown in vitro and harvested 7 days after inoculation. Eighteen differentially accumulated proteins were observed in root samples, ten of which were identified by MALDI-TOF mass spectrometry peptide mass fingerprint. Among the identified proteins, we observed three proteins present exclusively in inoculated root samples and six upregulated proteins and one downregulated protein relative to control. Differentially expressed maize proteins were identified as hypothetical protein ZEAMMB73_483204, hypothetical protein ZEAMMB73_269466, and tubulin beta-7 chain. The following were identified as H. seropedicae proteins: peroxiredoxin protein, EF-Tu elongation factor protein, cation transport ATPase, NADPH:quinone oxidoreductase, dinitrogenase reductase, and type III secretion ATP synthase. Our results presented the first evidence of type III secretion ATP synthase expression during H. seropedicae-maize root interaction.
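The peptide mass fingerprint identification mentioned above rests on comparing observed peptide masses with masses predicted from an in-silico digest of candidate proteins. The following is a minimal sketch of that idea, not the software used in the study: it assumes tryptic cleavage with no missed cleavages, unmodified residues, singly protonated ions, and a hypothetical mass tolerance of 0.2 Da. The function names and the toy sequence are illustrative only.

```python
import re

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added to the residue sum
PROTON = 1.00728   # mass of the charge-carrying proton


def tryptic_peptides(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def mh_plus(peptide):
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def count_matches(observed, sequence, tol=0.2):
    """Count observed masses that fall within tol Da of any theoretical peptide."""
    theoretical = [mh_plus(p) for p in tryptic_peptides(sequence)]
    return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)


if __name__ == "__main__":
    toy_sequence = "GAKRPVKAS"  # toy sequence, not a real protein
    print(tryptic_peptides(toy_sequence))
    print(count_matches([275.17, 500.0], toy_sequence))
```

In practice a search engine ranks every database protein by a score derived from such matches; the simple count here only illustrates the matching step.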
P-MartCancer–Interactive Online Software to Enable Analysis of Shotgun Cancer Proteomic Datasets
DOE Office of Scientific and Technical Information (OSTI.GOV)
Webb-Robertson, Bobbie-Jo M.; Bramer, Lisa M.; Jensen, Jeffrey L.
P-MartCancer is a new interactive web-based software environment that enables biomedical and biological scientists to perform in-depth analyses of global proteomics data without requiring direct interaction with the data or with statistical software. P-MartCancer offers a series of statistical modules associated with quality assessment, peptide and protein statistics, protein quantification and exploratory data analyses driven by the user via customized workflows and interactive visualization. Currently, P-MartCancer offers access to multiple cancer proteomic datasets generated through the Clinical Proteomics Tumor Analysis Consortium (CPTAC) at the peptide, gene and protein levels. P-MartCancer is deployed using Azure technologies (http://pmart.labworks.org/cptac.html), the web-service is alternatively available via Docker Hub (https://hub.docker.com/r/pnnl/pmart-web/) and many statistical functions can be utilized directly from an R package available on GitHub (https://github.com/pmartR).
Interaction Analysis through Proteomic Phage Display
2014-01-01
Phage display is a powerful technique for profiling specificities of peptide binding domains. The method is suited for the identification of high-affinity ligands with inhibitor potential when using highly diverse combinatorial peptide phage libraries. Such experiments further provide consensus motifs for genome-wide scanning of ligands of potential biological relevance. A complementary but considerably less explored approach is to display expression products of genomic DNA, cDNA, open reading frames (ORFs), or oligonucleotide libraries designed to encode defined regions of a target proteome on phage particles. One of the main applications of such proteomic libraries has been the elucidation of antibody epitopes. This review is focused on the use of proteomic phage display to uncover protein-protein interactions of potential relevance for cellular function. The method is particularly suited for the discovery of interactions between peptide binding domains and their targets. We discuss the largely unexplored potential of this method in the discovery of domain-motif interactions of potential biological relevance. PMID:25295249
Singh, Varinder; Singh, Baldev; Joshi, Robin; Jaju, Puneet
2017-01-01
Withania somnifera is a high value medicinal plant which is used against a large number of ailments. The medicinal properties of the plant are attributed to a wide array of important secondary metabolites. The plant is predominantly infected with the leaf spot pathogen Alternaria alternata, which leads to substantial biodeterioration of pharmaceutically important metabolites. To develop an effective strategy to combat this disease, a proteomics-based approach could be useful. Hence, in the present study, three different protein extraction methods, namely tris-buffer based, phenol based and trichloroacetic acid-acetone (TCA-acetone) based, were comparatively evaluated for two-dimensional electrophoresis (2-DE) analysis of W. somnifera. The TCA-acetone method was found to be the most effective and was further used to identify differentially expressed proteins in response to fungal infection. Thirty-eight differentially expressed proteins were identified by matrix assisted laser desorption/ionization time of flight-mass spectrometry (MALDI TOF/TOF MS/MS). The known proteins were categorized into eight different groups based on their function, and most proteins belonged to the energy and metabolism, cell structure, stress and defense, and RNA/DNA categories. Differential expression of some key proteins was also crosschecked at the transcriptomic level by qRT-PCR and was found to be consistent with the 2-DE data. These outcomes enable us to evaluate modifications that take place at the proteomic level during a compatible host pathogen interaction. The comparative proteome analysis conducted in this paper revealed the involvement of many key proteins in the process of pathogenesis, and further investigation of these identified proteins could assist in the discovery of new strategies for the development of pathogen resistance in the plant. PMID:28575108
Go, Young-Mi; Jones, Dean P.
2013-01-01
The redox proteome consists of reversible and irreversible covalent modifications that link redox metabolism to biologic structure and function. These modifications, especially of Cys, function at the molecular level in protein folding and maturation, catalytic activity, signaling, and macromolecular interactions and at the macroscopic level in control of secretion and cell shape. Interaction of the redox proteome with redox-active chemicals is central to macromolecular structure, regulation, and signaling during the life cycle and has a central role in the tolerance and adaptability to diet and environmental challenges. PMID:23861437
Pineda-Lucena, Antonio; Liao, Jack C C; Cort, John R; Yee, Adelinda; Kennedy, Michael A; Edwards, Aled M; Arrowsmith, Cheryl H
2003-05-01
As part of the Northeast Structural Genomics Consortium pilot project focused on small eukaryotic proteins and protein domains, we have determined the NMR structure of the protein encoded by ORF YML108W from Saccharomyces cerevisiae. YML108W belongs to one of the numerous structural proteomics targets whose biological function is unknown. Moreover, this protein does not have sequence similarity to any other protein. The NMR structure of YML108W consists of a four-stranded beta-sheet with strand order 2143 and two alpha-helices, with an overall topology of betabetaalphabetabetaalpha. Strand beta1 runs parallel to beta4, and beta2:beta1 and beta4:beta3 pairs are arranged in an antiparallel fashion. Although this fold belongs to the split betaalphabeta family, it appears to be unique among this family; it is a novel arrangement of secondary structure, thereby expanding the universe of protein folds.
Integrated database for identifying candidate genes for Aspergillus flavus resistance in maize
2010-01-01
Background Aspergillus flavus Link:Fr, an opportunistic fungus that produces aflatoxin, is pathogenic to maize and other oilseed crops. Aflatoxin is a potent carcinogen, and its presence markedly reduces the value of grain. Understanding and enhancing host resistance to A. flavus infection and/or subsequent aflatoxin accumulation is generally considered an efficient means of reducing grain losses to aflatoxin. Different proteomic, genomic and genetic studies of maize (Zea mays L.) have generated large data sets with the goal of identifying genes responsible for conferring resistance to A. flavus, or aflatoxin. Results In order to maximize the usage of different data sets in new studies, including association mapping, we have constructed a relational database with web interface integrating the results of gene expression, proteomic (both gel-based and shotgun), Quantitative Trait Loci (QTL) genetic mapping studies, and sequence data from the literature to facilitate selection of candidate genes for continued investigation. The Corn Fungal Resistance Associated Sequences Database (CFRAS-DB) (http://agbase.msstate.edu/) was created with the main goal of identifying genes important to aflatoxin resistance. CFRAS-DB is implemented using MySQL as the relational database management system running on a Linux server, using an Apache web server, and Perl CGI scripts as the web interface. The database and the associated web-based interface allow researchers to examine many lines of evidence (e.g. microarray, proteomics, QTL studies, SNP data) to assess the potential role of a gene or group of genes in the response of different maize lines to A. flavus infection and subsequent production of aflatoxin by the fungus. Conclusions CFRAS-DB provides the first opportunity to integrate data pertaining to the problem of A. flavus and aflatoxin resistance in maize in one resource and to support queries across different datasets. 
The web-based interface gives researchers different query options for mining the database across different types of experiments. The database is publicly available at http://agbase.msstate.edu. PMID:20946609
Integrated database for identifying candidate genes for Aspergillus flavus resistance in maize.
Kelley, Rowena Y; Gresham, Cathy; Harper, Jonathan; Bridges, Susan M; Warburton, Marilyn L; Hawkins, Leigh K; Pechanova, Olga; Peethambaran, Bela; Pechan, Tibor; Luthe, Dawn S; Mylroie, J E; Ankala, Arunkanth; Ozkan, Seval; Henry, W B; Williams, W P
2010-10-07
Aspergillus flavus Link:Fr, an opportunistic fungus that produces aflatoxin, is pathogenic to maize and other oilseed crops. Aflatoxin is a potent carcinogen, and its presence markedly reduces the value of grain. Understanding and enhancing host resistance to A. flavus infection and/or subsequent aflatoxin accumulation is generally considered an efficient means of reducing grain losses to aflatoxin. Different proteomic, genomic and genetic studies of maize (Zea mays L.) have generated large data sets with the goal of identifying genes responsible for conferring resistance to A. flavus, or aflatoxin. In order to maximize the usage of different data sets in new studies, including association mapping, we have constructed a relational database with web interface integrating the results of gene expression, proteomic (both gel-based and shotgun), Quantitative Trait Loci (QTL) genetic mapping studies, and sequence data from the literature to facilitate selection of candidate genes for continued investigation. The Corn Fungal Resistance Associated Sequences Database (CFRAS-DB) (http://agbase.msstate.edu/) was created with the main goal of identifying genes important to aflatoxin resistance. CFRAS-DB is implemented using MySQL as the relational database management system running on a Linux server, using an Apache web server, and Perl CGI scripts as the web interface. The database and the associated web-based interface allow researchers to examine many lines of evidence (e.g. microarray, proteomics, QTL studies, SNP data) to assess the potential role of a gene or group of genes in the response of different maize lines to A. flavus infection and subsequent production of aflatoxin by the fungus. CFRAS-DB provides the first opportunity to integrate data pertaining to the problem of A. flavus and aflatoxin resistance in maize in one resource and to support queries across different datasets. 
The web-based interface gives researchers different query options for mining the database across different types of experiments. The database is publicly available at http://agbase.msstate.edu.
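A relational layout makes the "many lines of evidence" query described above a simple aggregation. The sketch below is an illustration only and does not reproduce the CFRAS-DB schema: the table name `evidence`, its columns, and the gene identifiers are hypothetical, and Python's built-in `sqlite3` module stands in for the MySQL server named in the abstract.

```python
import sqlite3


def genes_with_multiple_evidence(conn, min_types=2):
    """Return (gene_id, n_types) for genes backed by at least min_types evidence types."""
    return conn.execute(
        "SELECT gene_id, COUNT(DISTINCT evidence_type) AS n_types "
        "FROM evidence "
        "GROUP BY gene_id "
        "HAVING COUNT(DISTINCT evidence_type) >= ? "
        "ORDER BY n_types DESC, gene_id",
        (min_types,),
    ).fetchall()


def build_demo_db():
    """Create an in-memory database with a hypothetical evidence table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE evidence (gene_id TEXT, evidence_type TEXT, study TEXT)"
    )
    conn.executemany(
        "INSERT INTO evidence VALUES (?, ?, ?)",
        [
            ("gene_A", "microarray", "study_1"),
            ("gene_A", "proteomics", "study_2"),
            ("gene_A", "QTL", "study_3"),
            ("gene_B", "microarray", "study_1"),
            ("gene_C", "QTL", "study_3"),
            ("gene_C", "QTL", "study_4"),  # same evidence type, second study
            ("gene_C", "SNP", "study_5"),
        ],
    )
    return conn


if __name__ == "__main__":
    for gene_id, n_types in genes_with_multiple_evidence(build_demo_db()):
        print(gene_id, n_types)
```

Counting distinct evidence types rather than rows keeps a gene reported by two QTL studies from outranking a gene supported by independent kinds of experiment.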
MIPS: curated databases and comprehensive secondary data resources in 2010.
Mewes, H Werner; Ruepp, Andreas; Theis, Fabian; Rattei, Thomas; Walter, Mathias; Frishman, Dmitrij; Suhre, Karsten; Spannagl, Manuel; Mayer, Klaus F X; Stümpflen, Volker; Antonov, Alexey
2011-01-01
The Munich Information Center for Protein Sequences (MIPS at the Helmholtz Center for Environmental Health, Neuherberg, Germany) has many years of experience in providing annotated collections of biological data. Selected data sets of high relevance, such as model genomes, are subjected to careful manual curation, while the bulk of high-throughput data is annotated by automatic means. High-quality reference resources developed in the past and still actively maintained include Saccharomyces cerevisiae, Neurospora crassa and Arabidopsis thaliana genome databases as well as several protein interaction data sets (MPACT, MPPI and CORUM). More recent projects are PhenomiR, the database on microRNA-related phenotypes, and MIPS PlantsDB for integrative and comparative plant genome research. The interlinked resources SIMAP and PEDANT provide homology relationships as well as up-to-date and consistent annotation for 38,000,000 protein sequences. PPLIPS and CCancer are versatile tools for proteomics and functional genomics interfacing to a database of compilations from gene lists extracted from literature. A novel literature-mining tool, EXCERBT, gives access to structured information on classified relations between genes, proteins, phenotypes and diseases extracted from Medline abstracts by semantic analysis. All databases described here, as well as the detailed descriptions of our projects can be accessed through the MIPS WWW server (http://mips.helmholtz-muenchen.de).
MIPS: curated databases and comprehensive secondary data resources in 2010
Mewes, H. Werner; Ruepp, Andreas; Theis, Fabian; Rattei, Thomas; Walter, Mathias; Frishman, Dmitrij; Suhre, Karsten; Spannagl, Manuel; Mayer, Klaus F.X.; Stümpflen, Volker; Antonov, Alexey
2011-01-01
The Munich Information Center for Protein Sequences (MIPS at the Helmholtz Center for Environmental Health, Neuherberg, Germany) has many years of experience in providing annotated collections of biological data. Selected data sets of high relevance, such as model genomes, are subjected to careful manual curation, while the bulk of high-throughput data is annotated by automatic means. High-quality reference resources developed in the past and still actively maintained include Saccharomyces cerevisiae, Neurospora crassa and Arabidopsis thaliana genome databases as well as several protein interaction data sets (MPACT, MPPI and CORUM). More recent projects are PhenomiR, the database on microRNA-related phenotypes, and MIPS PlantsDB for integrative and comparative plant genome research. The interlinked resources SIMAP and PEDANT provide homology relationships as well as up-to-date and consistent annotation for 38 000 000 protein sequences. PPLIPS and CCancer are versatile tools for proteomics and functional genomics interfacing to a database of compilations from gene lists extracted from literature. A novel literature-mining tool, EXCERBT, gives access to structured information on classified relations between genes, proteins, phenotypes and diseases extracted from Medline abstracts by semantic analysis. All databases described here, as well as the detailed descriptions of our projects can be accessed through the MIPS WWW server (http://mips.helmholtz-muenchen.de). PMID:21109531
A Non-canonical Feedback Circuit for Rapid Interactions between Somatosensory Cortices.
Minamisawa, Genki; Kwon, Sung Eun; Chevée, Maxime; Brown, Solange P; O'Connor, Daniel H
2018-05-29
Sensory perception depends on interactions among cortical areas. These interactions are mediated by canonical patterns of connectivity in which higher areas send feedback projections to lower areas via neurons in superficial and deep layers. Here, we probed the circuit basis of interactions between two areas critical for touch perception in mice, whisker primary (wS1) and secondary (wS2) somatosensory cortices. Neurons in layer 4 of wS2 (S2 L4) formed a major feedback pathway to wS1. Feedback from wS2 to wS1 was organized somatotopically. Spikes evoked by whisker deflections occurred nearly as rapidly in wS2 as in wS1, including among putative S2 L4 → S1 feedback neurons. Axons from S2 L4 → S1 neurons sent stimulus orientation-specific activity to wS1. Optogenetic excitation of S2 L4 neurons modulated activity across both wS2 and wS1, while inhibition of S2 L4 reduced orientation tuning among wS1 neurons. Thus, a non-canonical feedback circuit, originating in layer 4 of S2, rapidly modulates early tactile processing. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
HUPO BPP pilot study: a proteomics analysis of the mouse brain of different developmental stages.
Wang, Jing; Gu, Yong; Wang, Lihong; Hang, Xingyi; Gao, Yan; Wang, Hangyan; Zhang, Chenggang
2007-11-01
This study is a part of the HUPO Brain Proteome Project (BPP) pilot study, which aims at obtaining a reliable database of mouse brain proteome, at the comparison of techniques, laboratories, and approaches as well as at preparing subsequent proteome studies of neurologic diseases. The C57/Bl6 mouse brains of three developmental stages at embryonic day 16 (E16), postnatal day 7 (P7), and 8 wk (P56) (n = 5 in each group) were provided by the HUPO BPP executive committee. The whole brain proteins of each animal were individually prepared using 2-DE coupled with PDQuest software analysis. The protein spots representing developmentally related or stably expressed proteins were then prepared with in-gel digestion followed by MALDI-TOF/TOF MS/MS and analyzed using the MASCOT search engine to search the Swiss-Prot or NCBInr database. The 2-DE gel maps of the mouse brains of all of the developmental stages were obtained and submitted to the Data Collection Centre (DCC). The proteins alpha-enolase, stathmin, actin, C14orf166 homolog, 28 kDa heat- and acid-stable phosphoprotein, 3-mercaptopyruvate sulfurtransferase and 40 S ribosomal protein S3a were successfully identified. A further Western blotting analysis demonstrated that enolase is a protein up-regulated in the mouse brain from embryonic stage to adult stage. These data are helpful for understanding the proteome changes in the development of the mouse brain.
Mapping Proteome-Wide Interactions of Reactive Chemicals Using Chemoproteomic Platforms
Counihan, Jessica L.; Ford, Breanna; Nomura, Daniel K.
2015-01-01
A large number of pharmaceuticals, endogenous metabolites, and environmental chemicals act through covalent mechanisms with protein targets. Yet, their specific interactions with the proteome still remain poorly defined for most of these reactive chemicals. Deciphering direct protein targets of reactive small-molecules is critical in understanding their biological action, off-target effects, potential toxicological liabilities, and development of safer and more selective agents. Chemoproteomic technologies have arisen as a powerful strategy that enable the assessment of proteome-wide interactions of these irreversible agents directly in complex biological systems. We review here several chemoproteomic strategies that have facilitated our understanding of specific protein interactions of irreversibly-acting pharmaceuticals, endogenous metabolites, and environmental electrophiles to reveal novel pharmacological, biological, and toxicological mechanisms. PMID:26647369
The chordate proteome history database.
Levasseur, Anthony; Paganini, Julien; Dainat, Jacques; Thompson, Julie D; Poch, Olivier; Pontarotti, Pierre; Gouret, Philippe
2012-01-01
The chordate proteome history database (http://ioda.univ-provence.fr) comprises some 20,000 evolutionary analyses of proteins from chordate species. Our main objective was to characterize and study the evolutionary histories of the chordate proteome, and in particular to detect genomic events and automatic functional searches. Firstly, phylogenetic analyses based on high quality multiple sequence alignments and a robust phylogenetic pipeline were performed for the whole protein and for each individual domain. Novel approaches were developed to identify orthologs/paralogs, and predict gene duplication/gain/loss events and the occurrence of new protein architectures (domain gains, losses and shuffling). These important genetic events were localized on the phylogenetic trees and on the genomic sequence. Secondly, the phylogenetic trees were enhanced by the creation of phylogroups, whereby groups of orthologous sequences created using OrthoMCL were corrected based on the phylogenetic trees; gene family size and gene gain/loss in a given lineage could be deduced from the phylogroups. For each ortholog group obtained from the phylogenetic or the phylogroup analysis, functional information and expression data can be retrieved. Database searches can be performed easily using biological objects: protein identifier, keyword or domain, but can also be based on events, eg, domain exchange events can be retrieved. To our knowledge, this is the first database that links group clustering, phylogeny and automatic functional searches along with the detection of important events occurring during genome evolution, such as the appearance of a new domain architecture.
PeroxisomeDB: a database for the peroxisomal proteome, functional genomics and disease
Schlüter, Agatha; Fourcade, Stéphane; Domènech-Estévez, Enric; Gabaldón, Toni; Huerta-Cepas, Jaime; Berthommier, Guillaume; Ripp, Raymond; Wanders, Ronald J. A.; Poch, Olivier; Pujol, Aurora
2007-01-01
Peroxisomes are essential organelles of eukaryotic origin, ubiquitously distributed in cells and organisms, playing key roles in lipid and antioxidant metabolism. Loss or malfunction of peroxisomes causes more than 20 fatal inherited conditions. We have created a peroxisomal database that includes the complete peroxisomal proteome of Homo sapiens and Saccharomyces cerevisiae, by gathering, updating and integrating the available genetic and functional information on peroxisomal genes. PeroxisomeDB is structured in interrelated sections ‘Genes’, ‘Functions’, ‘Metabolic pathways’ and ‘Diseases’, that include hyperlinks to selected features of NCBI, ENSEMBL and UCSC databases. We have designed graphical depictions of the main peroxisomal metabolic routes and have included updated flow charts for diagnosis. Precomputed BLAST, PSI-BLAST, multiple sequence alignment (MUSCLE) and phylogenetic trees are provided to assist in direct multispecies comparison to study evolutionarily conserved functions and pathways. Highlights of the PeroxisomeDB include new tools developed for facilitating (i) identification of novel peroxisomal proteins, by means of identifying proteins carrying peroxisome targeting signal (PTS) motifs, (ii) detection of peroxisomes in silico, particularly useful for screening the deluge of newly sequenced genomes. PeroxisomeDB should contribute to the systematic characterization of the peroxisomal proteome and facilitate systems biology approaches on the organelle. PMID:17135190
Lin, Chia-En; Chang, Wen-Shin; Lee, Jen-Ai; Chang, Ting-Ya; Huang, Yu-Shen; Hirasaki, Yoshiro; Chen, Hung-Shing; Imai, Kazuhiro; Chen, Shih-Ming
2018-03-01
Aristolochic acid (AA) causes interstitial renal fibrosis, called aristolochic acid nephropathy (AAN). There is no specific indicator for diagnosing AAN, so this study aimed to investigate the biomarkers for AAN using a proteomics method. The C3H/He female mice were given ad libitum AA-distilled water (0.5 mg/kg/day) and distilled water for 56 days in the AA and normal groups, respectively. The AA-induced proteins in the kidney were investigated using a proteomics study, including fluorogenic derivatization with 7-chloro-N-[2-(dimethylamino)ethyl]-2,1,3-benzoxadiazole-4-sulfonamide, followed by high-performance liquid chromatography analysis and liquid chromatography tandem mass spectrometry with a MASCOT database searching system. There were two altered proteins, thrombospondin type 1 (TSP1) and G protein-coupled receptor 87 (GPR87), in the kidney of AA-group mice on day 56. GPR87, a tumorigenesis-related protein, is reported for the first time in the current study. Histological examination confirmed that renal interstitial fibrosis was induced in the AA-group mice. Based on the results of histological examination and the proteomics study, this model might be applied to AAN studies in the future. TSP1 might be a novel biomarker for AAN, and the further role of GPR87 leading to AA-induced tumorigenesis should be researched in future studies. Copyright © 2017 John Wiley & Sons, Ltd.
Li, Chien-Feng; Shen, Kun-Hung; Chien, Lan-Hsiang; Huang, Cheng-Hao; Wu, Ting-Feng; He, Hong-Lin
2018-04-19
Among various heterogeneous types of bladder tumors, urothelial carcinoma is the most prevalent lesion. Some of the urinary bladder urothelial carcinomas (UBUCs) develop local recurrence and may cause distal invasion. Galectin-1 de-regulation significantly affects cell transformation, cell proliferation, angiogenesis, and cell invasiveness. In continuation of our previous investigation on the role of galectin-1 in UBUC tumorigenesis, in this study, proteomics strategies were implemented in order to find more galectin-1-associated signaling pathways. The results of this study showed that galectin-1 knockdown could induce 15 down-regulated proteins and two up-regulated proteins in T24 cells. These de-regulated proteins might participate in lipid/amino acid/energy metabolism, cytoskeleton, cell proliferation, cell-cell interaction, cell apoptosis, metastasis, and protein degradation. The aforementioned dys-regulated proteins were confirmed by western immunoblotting. Proteomics results were further translated to prognostic markers by analyses of biopsy samples. Results of cohort studies demonstrated that over-expressions of glutamine synthetase, alcohol dehydrogenase (NADP⁺), fatty acid binding protein 4, and toll interacting protein in clinical specimens were all significantly associated with galectin-1 up-regulation. Univariate analyses showed that de-regulations of glutamine synthetase and fatty acid binding protein 4 in clinical samples were respectively linked to disease-specific survival and metastasis-free survival.
van Herwijnen, Martijn J.C.; Zonneveld, Marijke I.; Goerdayal, Soenita; Nolte – 't Hoen, Esther N.M.; Garssen, Johan; Stahl, Bernd; Maarten Altelaar, A.F.; Redegeld, Frank A.; Wauben, Marca H.M.
2016-01-01
Breast milk contains several macromolecular components with distinctive functions, whereby milk fat globules and casein micelles mainly provide nutrition to the newborn, and whey contains molecules that can stimulate the newborn's developing immune system and gastrointestinal tract. Although extracellular vesicles (EV) have been identified in breast milk, their physiological function and composition have not been addressed in detail. EV are submicron sized vehicles released by cells for intercellular communication via selectively incorporated lipids, nucleic acids, and proteins. Because of the difficulty in separating EV from other milk components, an in-depth analysis of the proteome of human milk-derived EV is lacking. In this study, an extensive LC-MS/MS proteomic analysis was performed of EV that had been purified from breast milk of seven individual donors using a recently established, optimized density-gradient-based EV isolation protocol. A total of 1963 proteins were identified in milk-derived EV, including EV-associated proteins like CD9, Annexin A5, and Flotillin-1, with a remarkable overlap between the different donors. Interestingly, 198 of the identified proteins are not present in the human EV database Vesiclepedia, indicating that milk-derived EV harbor proteins not yet identified in EV of different origin. Similarly, the proteome of milk-derived EV was compared with that of other milk components. For this, data from 38 published milk proteomic studies were combined in order to construct the total milk proteome, which consists of 2698 unique proteins. Remarkably, 633 proteins identified in milk-derived EV have not yet been identified in human milk to date. Interestingly, these novel proteins include proteins involved in regulation of cell growth and controlling inflammatory signaling pathways, suggesting that milk-derived EVs could support the newborn's developing gastrointestinal tract and immune system. 
Overall, this study provides an expansion of the whole milk proteome and illustrates that milk-derived EV are macromolecular components with a unique functional proteome. PMID:27601599
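The comparisons reported above (proteins absent from a reference database, proteins shared by all donors, a combined proteome built from many studies) are set operations on protein identifiers. A minimal sketch follows; the identifiers `P1` to `P6` and the function names are placeholders, not data from the study.

```python
def novel_proteins(identified, reference):
    """Proteins in the identified set that the reference collection lacks."""
    return set(identified) - set(reference)


def shared_by_all(donor_sets):
    """Proteins detected in every donor sample."""
    return set.intersection(*(set(d) for d in donor_sets))


def combined_proteome(study_sets):
    """Union of the proteins reported across several studies."""
    return set().union(*study_sets)


if __name__ == "__main__":
    donors = [{"P1", "P2", "P3"}, {"P1", "P2", "P4"}, {"P1", "P2", "P5"}]
    reference_db = {"P1", "P3", "P6"}  # placeholder reference collection
    identified = combined_proteome(donors)
    print(sorted(shared_by_all(donors)))
    print(sorted(novel_proteins(identified, reference_db)))
```

Such comparisons are only as good as the identifier mapping behind them, so accessions are normally harmonised to one namespace before the sets are compared.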
Xu, Benhong; Gao, Yanpan; Zhan, Shaohua; Ge, Wei
2017-07-01
Lysosomes play vital roles in both innate and adaptive immunity. It is widely accepted that lysosomes do not function exclusively as digestive organelles; they are also involved in the response of immune cells to pathogens. However, the changes in the lysosomal proteome caused by infection with various microbes are still largely unknown, and our understanding of the proteome of the purified lysosome is another obstacle that needs to be resolved. Here, we performed a proteomic study on lysosomes enriched from THP1 cells after infection with Listeria monocytogenes (L.m), Herpes Simplex Virus 1 (HSV-1) and Vesicular Stomatitis Virus (VSV). In combination with the gene ontology (GO) analysis, we identified 284 lysosomal-related proteins from a total of 4560 proteins. We also constructed the protein-protein interaction networks for the differentially expressed proteins and revealed the core lysosomal proteins, including SRC in the L.m-treated group, SRC, GLB1, HEXA and HEXB in the HSV-1-treated group and GLB1, CTSA, CTSB, HEXA and HEXB in the VSV-treated group, which are involved in responding to diverse microbial infections. This study not only reveals variable lysosome responses depending on the bacterial or viral infection, but also provides the evidence based on which we propose a novel approach to proteome research for investigation of the function of the enriched organelles. Copyright © 2017 Elsevier Ltd. All rights reserved.
P-MartCancer-Interactive Online Software to Enable Analysis of Shotgun Cancer Proteomic Datasets.
Webb-Robertson, Bobbie-Jo M; Bramer, Lisa M; Jensen, Jeffrey L; Kobold, Markus A; Stratton, Kelly G; White, Amanda M; Rodland, Karin D
2017-11-01
P-MartCancer is an interactive web-based software environment that enables statistical analyses of peptide or protein data, quantitated from mass spectrometry-based global proteomics experiments, without requiring in-depth knowledge of statistical programming. P-MartCancer offers a series of statistical modules associated with quality assessment, peptide and protein statistics, protein quantification, and exploratory data analyses driven by the user via customized workflows and interactive visualization. Currently, P-MartCancer offers access and the capability to analyze multiple cancer proteomic datasets generated through the Clinical Proteomics Tumor Analysis Consortium at the peptide, gene, and protein levels. P-MartCancer is deployed as a web service (https://pmart.labworks.org/cptac.html), alternatively available via Docker Hub (https://hub.docker.com/r/pnnl/pmart-web/). Cancer Res; 77(21); e47-50. ©2017 American Association for Cancer Research.
Arai, Kazuya; Sakamoto, Ruriko; Kubota, Daisuke; Kondo, Tadashi
2013-08-01
Chemoresistance is one of the most critical prognostic factors in osteosarcoma, and elucidation of the molecular backgrounds of chemoresistance may lead to better clinical outcomes. Spheroid cells resemble in vivo cells and are considered an in vitro model for drug discovery. We found that spheroid cells displayed more chemoresistance than conventional monolayer cells across 11 osteosarcoma cell lines. To investigate the molecular mechanisms underlying the resistance to chemotherapy, we examined the proteomic differences between the monolayer and spheroid cells by 2D-DIGE. Of the 4762 protein species observed, we further investigated 435 species with annotated mass spectra in the public proteome database, Genome Medicine Database of Japan Proteomics. Among the 435 protein species, we found that 17 species exhibited expression level differences when the cells formed spheroids in more than five cell lines, and four species out of these 17 were associated with spheroid-formation associated resistance to doxorubicin. We confirmed the upregulation of cathepsin D in spheroid cells by western blotting. Cathepsin D has been implicated in chemoresistance of various malignancies but has not previously been implicated in osteosarcoma. Our study suggested that the spheroid system may be a useful tool to reveal the molecular backgrounds of chemoresistance in osteosarcoma. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Exploring the dark foldable proteome by considering hydrophobic amino acids topology
Bitard-Feildel, Tristan; Callebaut, Isabelle
2017-01-01
The protein universe corresponds to the set of all proteins found in all organisms. A way to explore it is by taking into account the domain content of the proteins. However, some parts of sequences and many entire sequences remain un-annotated despite a converging number of domain families. The un-annotated part of the protein universe is referred to as the dark proteome and remains poorly characterized. In this study, we quantify the amount of foldable domains within the dark proteome by using the hydrophobic cluster analysis methodology. These un-annotated foldable domains were grouped using a combination of remote homology searches and domain annotations, leading us to define different levels of darkness. The dark foldable domains were analyzed to understand what makes them different from domains stored in databases and thus difficult to annotate. The un-annotated domains of the dark proteome universe display specific features relative to database domains: shorter length, non-canonical content and particular topology in hydrophobic residues, higher propensity for disorder, and a higher energy. These features make them hard to relate to known families. Based on these observations, we emphasize that domain annotation methodologies can still be improved to fully apprehend and decipher the molecular evolution of the protein universe. PMID:28134276
Renard, Bernhard Y.; Xu, Buote; Kirchner, Marc; Zickmann, Franziska; Winter, Dominic; Korten, Simone; Brattig, Norbert W.; Tzur, Amit; Hamprecht, Fred A.; Steen, Hanno
2012-01-01
Currently, the reliable identification of peptides and proteins is only feasible when thoroughly annotated sequence databases are available. Although sequencing capacities continue to grow, many organisms remain without reliable, fully annotated reference genomes required for proteomic analyses. Standard database search algorithms fail to identify peptides that are not exactly contained in a protein database. De novo searches are generally hindered by their restricted reliability, and current error-tolerant search strategies are limited by global, heuristic tradeoffs between database and spectral information. We propose a Bayesian information criterion-driven error-tolerant peptide search (BICEPS) and offer an open source implementation based on this statistical criterion to automatically balance the information of each single spectrum and the database, while limiting the run time. We show that BICEPS performs as well as current database search algorithms when such algorithms are applied to sequenced organisms, whereas BICEPS only uses a remotely related organism database. For instance, we use a chicken instead of a human database corresponding to an evolutionary distance of more than 300 million years (International Chicken Genome Sequencing Consortium (2004) Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature 432, 695–716). We demonstrate the successful application to cross-species proteomics with a 33% increase in the number of identified proteins for a filarial nematode sample of Litomosoides sigmodontis. PMID:22493179
VizieR Online Data Catalog: VIMOS Public Extragalactic Survey (VIPERS) DR1 (Garilli+, 2014)
NASA Astrophysics Data System (ADS)
Garilli, B.; Guzzo, L.; Scodeggio, M.; Bolzonella, M.; Abbas, U.; Adami, C.; Arnouts, S.; Bel, J.; Bottini, D.; Branchini, E.; Cappi, A.; Coupon, J.; Cucciati, O.; Davidzon, I.; de Lucia, G.; de la Torre, S.; Franzetti, P.; Fritz, A.; Fumana, M.; Granett, B. R.; Ilbert, O.; Iovino, A.; Krywult, J.; Le Brun, V.; Le Fevre, O.; Maccagni, D.; Malek, K.; Marulli, F.; McCracken, H. J.; Paioro, L.; Polletta, M.; Pollo, A.; Schlagenhaufer, H.; Tasca, L. A. M.; Tojeiro, R.; Vergani, D.; Zamorani, G.; Zanichelli, A.; Burden, A.; di Porto, C.; Marchetti, A.; Marinoni, C.; Mellier, Y.; Moscardini, L.; Nichol, R. C.; Peacock, J. A.; Percival, W. J.; Phleps, S.; Wolk, M.
2014-09-01
We present the first Public Data Release (PDR-1) of the VIMOS Public Extragalactic Survey (VIPERS). It comprises 57204 spectroscopic measurements together with all additional information necessary for optimal scientific exploitation of the data, in particular the associated photometric measurements and quantification of the photometric and survey completeness. VIPERS is an ESO Large Programme designed to build a spectroscopic sample of ~100000 galaxies with i_AB < 22.5 and 0.5
Investigation of Glandular Trichome Proteins in Artemisia annua L. Using Comparative Proteomics
Wu, Ting; Wang, Yejun; Guo, Dianjing
2012-01-01
Glandular secreting trichomes (GSTs) are called biofactories because they are active in synthesizing, storing and secreting various types of plant secondary metabolites. Artemisinin, a sesquiterpene lactone and the most effective drug against malaria, is derived from GSTs of Artemisia annua. However, low artemisinin content (0.001%∼1.54% of dry weight) has hindered its wide application. We investigate the GST-expressed proteins in Artemisia annua using a comparative proteomics approach, aiming for a better understanding of the trichome proteome and artemisinin metabolism. 2D-electrophoresis was employed to compare the protein profiles of GSTs and leaves. More than 700 spots were resolved for GSTs, of which ∼93 non-redundant proteins were confidently identified by searching NCBI and Artemisia EST databases. Over 70% of these proteins were highly expressed in GSTs. Functional classification of these GST-enriched proteins revealed that many of them participate in major plant metabolic processes such as electron transport, transcription and translation. PMID:22905110
Khorsandi, Shirin Elizabeth; Salehi, Siamak; Cortes, Miriam; Vilca-Melendez, Hector; Menon, Krishna; Srinivasan, Parthi; Prachalias, Andreas; Jassem, Wayel; Heaton, Nigel
2018-02-15
Mitochondria have their own genomic, transcriptomic and proteomic machinery but are unable to be autonomous, needing both nuclear and mitochondrial genomes. The aim of this work was to use computational biology to explore the involvement of mitochondrial microRNAs (MitomiRs) and their interactions with the mitochondrial proteome in a clinical model of primary non-function (PNF) of the donor after cardiac death (DCD) liver. Archival array data on the differential expression of miRNA in DCD PNF were re-analyzed using a number of publicly available computational algorithms. Ten MitomiRs of importance in DCD PNF were identified, 7 with predicted interaction of their seed sequence with the mitochondrial transcriptome that included both coding and non-coding areas of the hypervariability region 1 (HVR1) and control region. Considering miRNA regulation of the nuclear-encoded mitochondrial proteome, 7 hypothetical small proteins were identified with homolog function that ranged from co-factor for formation of ATP synthase and redox balance to an importin/exportin protein. In silico, unconventional seed interactions, both non-canonical and alternative seed sites, appear to be of greater importance in MitomiR regulation of the mitochondrial genome. Additionally, a number of novel small proteins of relevance in transplantation have been identified which need further characterization.
Thiele, H.; Glandorf, J.; Koerting, G.; Reidegeld, K.; Blüggel, M.; Meyer, H.; Stephan, C.
2007-01-01
In today’s proteomics research, various techniques, instrumentation and bioinformatics tools are necessary to manage the large amount of heterogeneous data with automatic quality control to produce reliable and comparable results. Therefore a data-processing pipeline is mandatory for data validation and comparison in a data-warehousing system. The proteome bioinformatics platform ProteinScape has been proven to cover these needs. The reprocessing of HUPO BPP participants’ MS data was done within ProteinScape. The reprocessed information was transferred into the global data repository PRIDE. ProteinScape as a data-warehousing system covers two main aspects: archiving relevant data of the proteomics workflow and information extraction functionality (protein identification, quantification and generation of biological knowledge). As a strategy for automatic data validation, different protein search engines are integrated. Result analysis is performed using a decoy database search strategy, which allows the measurement of the false-positive identification rate. Peptide identifications across different workflows, different MS techniques, and different search engines are merged to obtain a quality-controlled protein list. The proteomics identifications database (PRIDE), as a public data repository, is an archiving system where data are finally stored and no longer changed by further processing steps. Data submission to PRIDE is open to proteomics laboratories generating protein and peptide identifications. An export tool has been developed for transferring all relevant HUPO BPP data from ProteinScape into PRIDE using the PRIDE.xml format. The EU-funded ProDac project will coordinate the development of software tools covering international standards for the representation of proteomics data. 
The implementation of data submission pipelines and systematic data collection in public standards–compliant repositories will cover all aspects, from the generation of MS data in each laboratory to the conversion of all the annotating information and identifications to a standardized format. Such datasets can be used in the course of publishing in scientific journals.
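The decoy database search strategy described in this abstract can be illustrated with a minimal Python sketch. It assumes the simple target-decoy estimate in which the false-positive rate above a score threshold is approximated by the ratio of decoy hits to target hits; the function name, the score scale and the example data are hypothetical and are not taken from ProteinScape or PRIDE.

```python
from typing import List, Tuple

def estimate_fdr(psms: List[Tuple[float, bool]], threshold: float) -> float:
    """Estimate the false-positive rate among matches scoring >= threshold.

    Each entry is (score, is_decoy). The estimate assumes that a wrong match
    is equally likely to hit the target or the decoy database, so the number
    of decoy hits approximates the number of false target hits.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0  # nothing accepted, so nothing can be false
    return decoys / targets

# Hypothetical peptide-spectrum matches: (search engine score, matched a decoy?)
example_psms = [
    (95.0, False), (90.0, False), (88.0, True),
    (70.0, False), (65.0, True), (40.0, True),
]

if __name__ == "__main__":
    for cutoff in (60.0, 89.0):
        print(cutoff, estimate_fdr(example_psms, cutoff))
```

Raising the score cutoff until the estimated rate falls below a chosen level is one common way such an estimate is used to produce a quality-controlled identification list.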
Comparative analysis of genomics and proteomics in Bacillus thuringiensis 4.0718.
Rang, Jie; He, Hao; Wang, Ting; Ding, Xuezhi; Zuo, Mingxing; Quan, Meifang; Sun, Yunjun; Yu, Ziquan; Hu, Shengbiao; Xia, Liqiu
2015-01-01
Bacillus thuringiensis is a widely used biopesticide that produces various insecticidal active substances during its life cycle. Separation and purification of numerous insecticidal active substances have been difficult because of the relatively short half-life of such substances. Moreover, substances can be synthesized at different times during development, so samples at different stages have to be studied, further complicating the analysis. A dual genomic and proteomic approach, particularly one using mass spectrometry-based proteomic methods, would enhance our ability to identify such substances. The comparative analysis of genomic and proteomic data showed that not all of the products deduced from the annotated genome could be identified among the proteomic data. For instance, genome annotation results showed that 39 coding sequences in the whole genome were related to insect pathogenicity, including five cry genes. However, Cry2Ab, Cry1Ia, Cytotoxin K, Bacteriocin, Exoenzyme C3 and Alveolysin could not be detected in the proteomic data obtained. The sporulation-related proteins were also compared, and the results showed that the great majority of sporulation-related proteins could be detected by mass spectrometry. This analysis revealed that Spo0A~P, SigF, SigE(+), SigK(+) and SigG(+), all known to play an important role in the regulatory network of spore formation, were also present in the proteomic data. Through the comparison of the two data sets, it was possible to infer that some genes were silenced or were expressed at very low levels. For instance, we found that cry2Ab seems to lack a functional promoter, while cry1Ia may not be expressed due to the presence of transposons. With this comparative study a relatively complete database can be constructed and used to transform hereditary material, thereby promoting the high expression of toxic proteins. 
A theoretical basis is provided for constructing highly virulent engineered bacteria and for promoting the application of proteogenomics in the life sciences.
Systematic Proteomic Approach to Characterize the Impacts of ...
Chemical interactions have posed a big challenge in toxicity characterization and human health risk assessment of environmental mixtures. To characterize the impacts of chemical interactions on protein and cytotoxicity responses to environmental mixtures, we established a systems biology approach integrating proteomics, bioinformatics, statistics, and computational toxicology to measure expression or phosphorylation levels of 21 critical toxicity pathway regulators and 445 downstream proteins in human BEAS-2B cells treated with 4 concentrations of nickel, 2 concentrations each of cadmium and chromium, as well as 12 defined binary and 8 defined ternary mixtures of these metals in vitro. Multivariate statistical analysis and mathematical modeling of the metal-mediated proteomic response patterns showed a high correlation between changes in protein expression or phosphorylation and cellular toxic responses to both individual metals and metal mixtures. Of the identified correlated proteins, only a small set of proteins including HIF-1a is likely to be responsible for selective cytotoxic responses to different metals and metal mixtures. Furthermore, support vector machine learning was utilized to computationally predict protein responses to uncharacterized metal mixtures using experimentally generated protein response profiles corresponding to known metal mixtures. This study provides a novel proteomic approach for characterization and prediction of toxicities of
A Proteomic Approach to Identify Phosphorylation-Dependent Targets of BRCT Domains
2008-03-01
Author: Zhou Songyang, Ph.D. (E-mail: songyang@bcm.tmc.edu). [Report documentation form fields and a truncated table fragment listing BRCT domains (BARD1, ECT2, PTIP (560-757), TOPBP1, TOPBP1 (1177-1401), XRCC1, BRCTD1) with motif entries such as pSQEY, pSEDE and pSQVF.]
García-Dorival, Isabel; Wu, Weining; Dowall, Stuart; Armstrong, Stuart; Touzelet, Olivier; Wastling, Jonathan; Barr, John N; Matthews, David; Carroll, Miles; Hewson, Roger; Hiscox, Julian A
2014-11-07
Viral pathogenesis in the infected cell is a balance between antiviral responses and subversion of host-cell processes. Many viral proteins specifically interact with host-cell proteins to promote virus biology. Understanding these interactions can lead to knowledge gains about infection and provide potential targets for antiviral therapy. One such virus is Ebola, which has profound consequences for human health and causes viral hemorrhagic fever where case fatality rates can approach 90%. The Ebola virus VP24 protein plays a critical role in the evasion of the host immune response and is likely to interact with multiple cellular proteins. To map these interactions and better understand the potential functions of VP24, label-free quantitative proteomics was used to identify cellular proteins that had a high probability of forming the VP24 cellular interactome. Several known interactions were confirmed, thus placing confidence in the technique, but new interactions were also discovered including one with ATP1A1, which is involved in osmoregulation and cell signaling. Disrupting the activity of ATP1A1 in Ebola-virus-infected cells with a small molecule inhibitor resulted in a decrease in progeny virus, thus illustrating how quantitative proteomics can be used to identify potential therapeutic targets.
Cassidy, Liam; Prasse, Daniela; Linke, Dennis; Schmitz, Ruth A; Tholey, Andreas
2016-10-07
The recent discovery of an increasing number of small open reading frames (sORF) creates the need for suitable analytical technologies for the comprehensive identification of the corresponding gene products. For biological and functional studies, knowledge of the entire set of proteins and sORF gene products is essential. Consequently, in the present study we evaluated analytical approaches that allow simultaneous analysis of the widest possible part of the proteome together with the predicted sORF. We performed a full proteome analysis of the cytosolic proteome of the methane-producing archaeon Methanosarcina mazei strain Gö1 using a high/low pH reversed phase LC-MS bottom-up approach. The second analytical approach was based on a semi-top-down strategy, encompassing a separation at the intact protein level using a GelFree system, followed by digestion and LC-MS analysis. A high overlap in identified proteins was found for both approaches, yielding the most comprehensive coverage of the cytosolic proteome of this organism achieved so far. The application of the second approach in combination with an adjustment of the search criteria for database searches further led to a significant increase in sORF peptide identifications, ultimately allowing the detection and identification of 28 sORF gene products.
Plant proteome analysis: a 2006 update.
Jorrín, Jesús V; Maldonado, Ana M; Castillejo, Ma Angeles
2007-08-01
This 2006 'Plant Proteomics Update' is a continuation of the two previously published in 'Proteomics' in 2004 (Canovas et al., Proteomics 2004, 4, 285-298) and 2006 (Rossignol et al., Proteomics 2006, 6, 5529-5548) and it aims to bring up-to-date the contribution of proteomics to plant biology on the basis of the original research papers published throughout 2006, with references to those appearing last year. According to the published papers and topics addressed, we can conclude that, as observed for the three previous years, there has been a quantitative, but not qualitative leap in plant proteomics. The full potential of proteomics is far from being exploited in plant biology research, especially if compared to other organisms, mainly yeast and humans, and a number of challenges, mainly technological, remain to be tackled. The original papers published last year numbered nearly 100 and deal with the proteome of at least 26 plant species, with a high percentage for Arabidopsis thaliana (28) and rice (11). Scientific objectives ranged from proteomic analysis of organs/tissues/cell suspensions (57) or subcellular fractions (29), to the study of plant development (12), the effect of hormones and signalling molecules (8) and response to symbionts (4) and stresses (27). A small number of contributions have covered PTMs (8) and protein interactions (4). 2-DE (specifically IEF-SDS-PAGE) coupled to MS still constitutes the almost unique platform utilized in plant proteome analysis. The application of gel-free protein separation methods and 'second generation' proteomic techniques such as multidimensional protein identification technology (MudPIT), and those for quantitative proteomics including DIGE, isotope-coded affinity tags (ICAT), iTRAQ and stable isotope labelling by amino acids in cell culture (SILAC) still remains anecdotal. 
This review is divided into seven sections: Introduction, Methodology, Subcellular proteomes, Development, Responses to biotic and abiotic stresses, PTMs and Protein interactions. Section 8 summarizes the major pitfalls and challenges of plant proteomics.
Vanderperre, Benoît; Lucier, Jean-François; Bissonnette, Cyntia; Motard, Julie; Tremblay, Guillaume; Vanderperre, Solène; Wisztorski, Maxence; Salzet, Michel; Boisvert, François-Michel; Roucou, Xavier
2013-01-01
A fully mature mRNA is usually associated with a reference open reading frame encoding a single protein. Yet, mature mRNAs contain unconventional alternative open reading frames (AltORFs) located in untranslated regions (UTRs) or overlapping the reference ORFs (RefORFs) in non-canonical +2 and +3 reading frames. Although recent ribosome profiling and footprinting approaches have suggested the significant use of unconventional translation initiation sites in mammals, direct evidence of large-scale alternative protein expression at the proteome level is still lacking. To determine the contribution of alternative proteins to the human proteome, we generated a database of predicted human AltORFs revealing a new proteome mainly composed of small proteins with a median length of 57 amino acids, compared to 344 amino acids for the reference proteome. We experimentally detected a total of 1,259 alternative proteins by mass spectrometry analyses of human cell lines, tissues and fluids. In plasma and serum, alternative proteins represent up to 55% of the proteome and may be a potential unsuspected new source for biomarkers. We observed constitutive co-expression of RefORFs and AltORFs from endogenous genes and from transfected cDNAs, including tumor suppressor p53, and provide evidence that out-of-frame clones representing AltORFs are mistakenly rejected as false positives in cDNA screening assays. Functional importance of alternative proteins is strongly supported by significant evolutionary conservation in vertebrates, invertebrates, and yeast. Our results imply that coding of multiple proteins in a single gene by the use of AltORFs may be a common feature in eukaryotes, and confirm that translation of unconventional ORFs generates an as yet unexplored proteome. PMID:23950983
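The idea of predicting AltORFs from an mRNA sequence can be sketched in Python as follows. This is only an illustration under simplifying assumptions (forward strand only, ATG start codons, the first ATG in each frame opens the ORF, a hypothetical minimum length); the function names and the toy sequence are invented here and do not reproduce the authors' prediction pipeline.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(mrna: str, min_codons: int = 2) -> List[Tuple[int, int, int]]:
    """Return (frame, start, end) for each ATG...stop ORF on the forward strand.

    frame is the 0-based offset (0, 1 or 2) from the first nucleotide; start and
    end are 0-based, end is exclusive and includes the stop codon. min_codons
    counts the coding codons before the stop, including the ATG.
    """
    seq = mrna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first in-frame ATG opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((frame, start, pos + 3))
                start = None  # look for the next ORF in this frame
    return orfs

def alternative_orfs(mrna: str, ref_start: int, min_codons: int = 2) -> List[Tuple[int, int, int]]:
    """Keep every ORF that does not begin at the annotated reference start."""
    return [orf for orf in find_orfs(mrna, min_codons) if orf[1] != ref_start]

# Toy sequence: the reference ORF starts at 0, a second ORF overlaps it out of frame.
toy_mrna = "ATGAAATGCCCTAGGTAA"

if __name__ == "__main__":
    print(find_orfs(toy_mrna))
    print(alternative_orfs(toy_mrna, ref_start=0))
```

In the toy sequence the second ORF lies entirely inside the reference ORF but in a different reading frame, which is the overlapping situation the abstract describes.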
Di, Guilan; Li, Hui; Zhang, Chao; Zhao, Yanjing; Zhou, Chuanjiang; Naeem, Sajid; Li, Li; Kong, Xianghui
2017-07-01
Outbreaks of infectious diseases in common carp Cyprinus carpio, a major cultured fish in northern regions of China, constantly result in significant economic losses. Until now, proteomic information on immune defence has remained limited. In the present study, a profile of the intestinal mucosa immune response in Cyprinus carpio was investigated at 0, 12, 36 and 84 h after challenging tissues with Aeromonas hydrophila at a concentration of 1.4 × 10^8 CFU/mL. Proteomic profiles in different samples were compared using a label-free quantitative proteomic approach. Based on a MASCOT database search, 1149 proteins were identified in samples after normalisation of proteins. Treated groups 1 (T1) and 2 (T2) were first clustered together and then clustered with control (C group). The distance between C and treated group 3 (T3) was the largest according to hierarchical cluster analysis. Therefore, the comparison between C and T3 was selected for the following analysis. A total of 115 proteins showed conspicuous differences in abundance, with 52 up-regulated proteins and 63 down-regulated proteins detected in T3. Gene ontology analysis showed that the up-regulated differentially expressed proteins in T3 were mainly localised in the hemoglobin complex, and the down-regulated proteins in T3 were mainly localised in the major histocompatibility complex II protein complex. Forty-six proteins of differential abundance (40% of 115) were involved in immune response, with 17 up-regulated and 29 down-regulated proteins detected in T3. This study is the first to report the proteome response of carp intestinal mucosa against A. hydrophila infection; the information obtained contributes to understanding defence mechanisms of carp intestinal mucosa. Copyright © 2017 Elsevier Ltd. All rights reserved.
Kustatscher, Georg; Grabowski, Piotr; Rappsilber, Juri
2016-02-01
Subcellular localization is an important aspect of protein function, but the protein composition of many intracellular compartments is poorly characterized. For example, many nuclear bodies are challenging to isolate biochemically and thus remain inaccessible to proteomics. Here, we explore covariation in proteomics data as an alternative route to subcellular proteomes. Rather than targeting a structure of interest biochemically, we target it by machine learning. This becomes possible by taking data obtained for one organelle and searching it for traces of another organelle. As an extreme example and proof-of-concept we predict mitochondrial proteins based on their covariation in published interphase chromatin data. We detect about ⅓ of the known mitochondrial proteins in our chromatin data, presumably most as contaminants. However, these proteins are not present at random. We show covariation of mitochondrial proteins in chromatin proteomics data. We then exploit this covariation by multiclassifier combinatorial proteomics to define a list of mitochondrial proteins. This list agrees well with different databases on mitochondrial composition. This benchmark test raises the possibility that, in principle, covariation proteomics may also be applicable to structures for which no biochemical isolation procedures are available. © 2015 The Authors. Proteomics Published by WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
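The notion of covariation in proteomics data can be illustrated with a small Python sketch. It is a deliberately simplified stand-in, not the multiclassifier combinatorial approach used in the study: each candidate protein is scored by the Pearson correlation between its abundance profile across experiments and the mean profile of a set of known reference proteins. All names, identifiers and numbers below are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Set

def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)

def covariation_scores(profiles: Dict[str, List[float]], reference_ids: Set[str]) -> Dict[str, float]:
    """Score every non-reference protein by correlation with the reference centroid."""
    refs = [profiles[pid] for pid in reference_ids]
    centroid = [sum(column) / len(refs) for column in zip(*refs)]
    return {
        pid: pearson(values, centroid)
        for pid, values in profiles.items()
        if pid not in reference_ids
    }

# Hypothetical abundance profiles across four experiments.
example_profiles = {
    "mito_a": [1.0, 2.0, 3.0, 4.0],
    "mito_b": [2.0, 4.0, 6.0, 8.0],
    "cand_1": [10.0, 20.0, 30.0, 40.0],  # rises and falls with the references
    "cand_2": [4.0, 3.0, 2.0, 1.0],      # behaves oppositely
}

if __name__ == "__main__":
    print(covariation_scores(example_profiles, {"mito_a", "mito_b"}))
```

A candidate that tracks the reference set across many experiments receives a score near 1, which is the kind of signal a classifier could exploit even when the proteins are present only as contaminants.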
Schmidt, Ulrike G.; Endler, Anne; Schelbert, Silvia; Brunner, Arco; Schnell, Magali; Neuhaus, H. Ekkehard; Marty-Mazars, Daniéle; Marty, Francis; Baginsky, Sacha; Martinoia, Enrico
2007-01-01
Young meristematic plant cells contain a large number of small vacuoles, while the largest part of the vacuome in mature cells is composed of a large central vacuole, occupying 80% to 90% of the cell volume. Thus far, only a limited number of vacuolar membrane proteins have been identified and characterized. The proteomic approach is a powerful tool to identify new vacuolar membrane proteins. To analyze vacuoles from growing tissues we isolated vacuoles from cauliflower (Brassica oleracea) buds, which consist largely of small cells but also contain cells in expansion as well as fully expanded cells. Here we show that, using purified cauliflower vacuoles and different extraction procedures such as saline, NaOH, acetone, and chloroform/methanol, and analyzing the data against the Arabidopsis (Arabidopsis thaliana) database, 102 cauliflower integral proteins and 214 peripheral proteins could be identified. The vacuolar pyrophosphatase was the most prominent protein. Of the 102 identified proteins, 45 had already been described. Nine of these, corresponding to 46% of peptides detected, are known vacuolar proteins. We identified 57 proteins (55.9%) containing at least one membrane-spanning domain with unknown subcellular localization. A comparison of the newly identified proteins with expression profiles from in silico data revealed that most of them are highly expressed in young, developing tissues. To verify whether the newly identified proteins were indeed localized in the vacuole we constructed and expressed green fluorescence protein fusion proteins for five putative vacuolar membrane proteins exhibiting three to 11 transmembrane domains. Four of them, a putative organic cation transporter, a nodulin N21 family protein, a membrane protein of unknown function, and a senescence-related membrane protein, were localized in the vacuolar membrane, while a white-brown ATP-binding cassette transporter homolog was shown to reside in the plasma membrane. 
These results demonstrate that proteomic analysis of highly purified vacuoles from specific tissues allows the identification of new vacuolar proteins and provides an additional view of tonoplastic proteins. PMID:17660356
Woods, Alisa G.; Lazar, Catalin; Radu, Gabriel L.; Darie, Costel C.; Branza-Nichita, Norica
2013-01-01
Hepatitis B virus (HBV) is a human pathogen causing severe liver disease and eventually death. Despite important progress in deciphering HBV internalization, the early virus-cell interactions leading to infection are not known. HepaRG is a human bipotent liver cell line bearing the unique ability to differentiate towards a mixture of hepatocyte- and biliary-like cells. In addition to expressing metabolic functions normally found in liver, differentiated HepaRG cells support HBV infection in vitro, thus resembling cultured primary hepatocytes more than other hepatoma cells. Therefore, extensive characterization of the plasma membrane proteome from HepaRG cells would allow the identification of new cellular factors potentially involved in infection. Here we analyzed the plasma membranes of non-differentiated and differentiated HepaRG cells using nanoliquid chromatography-tandem mass spectrometry to identify the differences between the proteomes and the changes that lead to differentiation of these cells. We followed up on differentially-regulated proteins in hepatocytes- and biliary-like cells, focusing on Cathepsins D and K, Cyclophilin A, Annexin 1/A1, PDI and PDI A4/ERp72. Major differences between the two proteomes were found, including differentially regulated proteins, protein-protein interactions and intracellular localizations following differentiation. The results advance our current understanding of HepaRG differentiation and the unique properties of these cells. PMID:23977166
Zhang, Lina; Boeren, Sjef; Hageman, Jos A; van Hooijdonk, Toon; Vervoort, Jacques; Hettinga, Kasper
2015-01-01
In order to better understand the milk proteome and its changes from colostrum to mature milk, samples taken at seven time points in the first 9 days from 4 individual cows were analyzed using proteomic techniques. Both the similarity in changes from day 0 to day 9 in the quantitative milk proteome, and the differences in specific protein abundance, were observed among the four cows. One third of the quantified proteins showed a significant decrease in concentration over the first 9 days after calving, especially the immune proteins (as much as 40-fold). Three relatively high-abundance enzymes (XDH, LPL, and RNASE1) and a cell division and proliferation protein (CREG1) may be involved in the maturation of the gastro-intestinal tract. In addition, high correlations between proteins involved in complement and blood coagulation cascades illustrate the complex nature of biological interrelationships between milk proteins. The linear decrease of protease inhibitors and proteins involved in the innate and adaptive immune system implies a protective role for protease inhibitors against degradation. In conclusion, the results found in this study not only improve our understanding of the role of colostrum in both host defense and development of the newborn calf but also provide guidance for the improvement of infant formula through better understanding of the complex interactions between milk proteins.
Petrareanu, Catalina; Macovei, Alina; Sokolowska, Izabela; Woods, Alisa G; Lazar, Catalin; Radu, Gabriel L; Darie, Costel C; Branza-Nichita, Norica
2013-01-01
Hepatitis B virus (HBV) is a human pathogen causing severe liver disease and eventually death. Despite important progress in deciphering HBV internalization, the early virus-cell interactions leading to infection are not known. HepaRG is a human bipotent liver cell line bearing the unique ability to differentiate towards a mixture of hepatocyte- and biliary-like cells. In addition to expressing metabolic functions normally found in liver, differentiated HepaRG cells support HBV infection in vitro, thus resembling cultured primary hepatocytes more than other hepatoma cells. Therefore, extensive characterization of the plasma membrane proteome from HepaRG cells would allow the identification of new cellular factors potentially involved in infection. Here we analyzed the plasma membranes of non-differentiated and differentiated HepaRG cells using nanoliquid chromatography-tandem mass spectrometry to identify the differences between the proteomes and the changes that lead to differentiation of these cells. We followed up on differentially-regulated proteins in hepatocytes- and biliary-like cells, focusing on Cathepsins D and K, Cyclophilin A, Annexin 1/A1, PDI and PDI A4/ERp72. Major differences between the two proteomes were found, including differentially regulated proteins, protein-protein interactions and intracellular localizations following differentiation. The results advance our current understanding of HepaRG differentiation and the unique properties of these cells.
Using FlyBase, a Database of Drosophila Genes and Genomes.
Marygold, Steven J; Crosby, Madeline A; Goodman, Joshua L
2016-01-01
For nearly 25 years, FlyBase (flybase.org) has provided a freely available online database of biological information about Drosophila species, focusing on the model organism D. melanogaster. The need for a centralized, integrated view of Drosophila research has never been greater as advances in genomic, proteomic, and high-throughput technologies add to the quantity and diversity of available data and resources. FlyBase has taken several approaches to respond to these changes in the research landscape. Novel report pages have been generated for new reagent types and physical interaction data; Drosophila models of human disease are now represented and showcased in dedicated Human Disease Model Reports; other integrated reports have been established that bring together related genes, datasets, or reagents; Gene Reports have been revised to improve access to new data types and to highlight functional data; links to external sites have been organized and expanded; and new tools have been developed to display and interrogate all these data, including improved batch processing and bulk file availability. In addition, several new community initiatives have served to enhance interactions between researchers and FlyBase, resulting in direct user contributions and improved feedback. This chapter provides an overview of the data content, organization, and available tools within FlyBase, focusing on recent improvements. We hope it serves as a guide for our diverse user base, enabling efficient and effective exploration of the database and thereby accelerating research discoveries.
Proteomic characterization of the nucleolar linker histone H1 interaction network
Szerlong, Heather J.; Herman, Jacob A.; Krause, Christine M.; DeLuca, Jennifer G.; Skoultchi, Arthur; Winger, Quinton A.; Prenni, Jessica E.; Hansen, Jeffrey C.
2015-01-01
To investigate the relationship between linker histone H1 and protein-protein interactions in the nucleolus, biochemical and proteomics approaches were used to characterize nucleoli purified from cultured human and mouse cells. Mass spectrometry identified 175 proteins in human T-cell nucleolar extracts that bound to sepharose-immobilized H1 in vitro. Gene ontology analysis found significant enrichment for H1 binding proteins with functions related to nucleolar chromatin structure and RNA polymerase I transcription regulation, rRNA processing, and mRNA splicing. Consistent with the affinity binding results, H1 existed in large (400 to >650 kDa) macromolecular complexes in human T cell nucleolar extracts. To complement the biochemical experiments, the effects of in vivo H1 depletion on protein content and structural integrity of the nucleolus were investigated using the H1 triple isoform knock out (H1ΔTKO) mouse embryonic stem cell (mESC) model system. Proteomic profiling of purified wild type mESC nucleoli identified a total of 613 proteins, only ~60% of which were detected in the H1 mutant nucleoli. Within the affected group, spectral counting analysis quantitated 135 specific nucleolar proteins whose levels were significantly altered in H1ΔTKO mESC. Importantly, the functions of the affected proteins in mESC closely overlapped with those of the human T cell nucleolar H1 binding proteins. Immunofluorescence microscopy of intact H1ΔTKO mESC demonstrated both a loss of nucleolar RNA content and altered nucleolar morphology resulting from in vivo H1 depletion. We conclude that H1 organizes and maintains an extensive protein-protein interaction network in the nucleolus required for nucleolar structure and integrity. PMID:25584861
Razban, Rostam M; Gilson, Amy I; Durfee, Niamh; Strobelt, Hendrik; Dinkla, Kasper; Choi, Jeong-Mo; Pfister, Hanspeter; Shakhnovich, Eugene I
2018-05-08
Protein evolution spans time scales and its effects span the length of an organism. A web app named ProteomeVis is developed to provide a comprehensive view of protein evolution in the S. cerevisiae and E. coli proteomes. ProteomeVis interactively creates protein chain graphs, where edges between nodes represent structure and sequence similarities within user-defined ranges, to study the long time scale effects of protein structure evolution. The short time scale effects of protein sequence evolution are studied by sequence evolutionary rate (ER) correlation analyses with protein properties that span from the molecular to the organismal level. We demonstrate the utility and versatility of ProteomeVis by investigating the distribution of edges per node in organismal protein chain universe graphs (oPCUGs) and putative ER determinants. S. cerevisiae and E. coli oPCUGs are scale-free with scaling constants of 1.79 and 1.56, respectively. Both scaling constants can be explained by a previously reported theoretical model describing protein structure evolution (Dokholyan et al., 2002). Protein abundance most strongly correlates with ER among properties in ProteomeVis, with Spearman correlations of -0.49 (p-value < 10^-10) and -0.46 (p-value < 10^-10) for S. cerevisiae and E. coli, respectively. This result is consistent with previous reports that found protein expression to be the most important ER determinant (Zhang and Yang, 2015). ProteomeVis is freely accessible at http://proteomevis.chem.harvard.edu. Supplementary data are available at Bioinformatics. Contact: shakhnovich@chemistry.harvard.edu.
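The Spearman correlation quoted in this abstract is the Pearson correlation computed on ranks. The following is a minimal sketch of that calculation using only the Python standard library; the function names and the toy numbers are illustrative assumptions and are not taken from ProteomeVis.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data: abundance rises while evolutionary rate falls.
abundance = [10, 200, 3000, 45000]
evolutionary_rate = [0.9, 0.5, 0.4, 0.1]
rho = spearman(abundance, evolutionary_rate)
```

A negative value of `rho` here corresponds to the direction reported in the abstract, where more abundant proteins tend to evolve more slowly.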
A tutorial for software development in quantitative proteomics using PSI standard formats
Gonzalez-Galarza, Faviel F.; Qi, Da; Fan, Jun; Bessant, Conrad; Jones, Andrew R.
2014-01-01
The Human Proteome Organisation — Proteomics Standards Initiative (HUPO-PSI) has been working for ten years on the development of standardised formats that facilitate data sharing and public database deposition. In this article, we review three HUPO-PSI data standards — mzML, mzIdentML and mzQuantML, which can be used to design a complete quantitative analysis pipeline in mass spectrometry (MS)-based proteomics. In this tutorial, we briefly describe the content of each data model, sufficient for bioinformaticians to devise proteomics software. We also provide guidance on the use of recently released application programming interfaces (APIs) developed in Java for each of these standards, which makes it straightforward to read and write files of any size. We have produced a set of example Java classes and a basic graphical user interface to demonstrate how to use the most important parts of the PSI standards, available from http://code.google.com/p/psi-standard-formats-tutorial. This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan. PMID:23584085
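To give a feel for what reading one of these XML-based formats involves, here is a minimal sketch that pulls the MS level of each spectrum out of an mzML-like fragment. It uses Python's standard XML parser rather than the Java APIs the tutorial describes, and the embedded fragment is a stripped-down illustration that is not schema-valid mzML; the function name `ms_levels` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by mzML documents.
NS = {"mz": "http://psi.hupo.org/ms/mzml"}

# Stripped-down, illustrative fragment (real files carry many more
# required elements and attributes).
EXAMPLE = """<mzML xmlns="http://psi.hupo.org/ms/mzml">
  <run id="run1">
    <spectrumList count="2">
      <spectrum id="scan=1" index="0">
        <cvParam accession="MS:1000511" name="ms level" value="1"/>
      </spectrum>
      <spectrum id="scan=2" index="1">
        <cvParam accession="MS:1000511" name="ms level" value="2"/>
      </spectrum>
    </spectrumList>
  </run>
</mzML>"""


def ms_levels(xml_text):
    """Map each spectrum id to its MS level (cvParam MS:1000511)."""
    root = ET.fromstring(xml_text)
    levels = {}
    for spectrum in root.iterfind(".//mz:spectrum", NS):
        for param in spectrum.iterfind("mz:cvParam", NS):
            if param.get("accession") == "MS:1000511":
                levels[spectrum.get("id")] = int(param.get("value"))
    return levels


levels = ms_levels(EXAMPLE)
```

Loading a whole document into memory like this does not scale to large files, which is one reason dedicated APIs such as those reviewed in the tutorial are preferable for real data.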
SALAD database: a motif-based database of protein annotations for plant comparative genomics
Mihara, Motohiro; Itoh, Takeshi; Izawa, Takeshi
2010-01-01
Proteins often have several motifs with distinct evolutionary histories. Proteins with similar motifs have similar biochemical properties and thus related biological functions. We constructed a unique comparative genomics database termed the SALAD database (http://salad.dna.affrc.go.jp/salad/) from plant-genome-based proteome data sets. We extracted evolutionarily conserved motifs by MEME software from 209 529 protein-sequence annotation groups selected by BLASTP from the proteome data sets of 10 species: rice, sorghum, Arabidopsis thaliana, grape, a lycophyte, a moss, 3 algae, and yeast. Similarity clustering of each protein group was performed by pairwise scoring of the motif patterns of the sequences. The SALAD database provides a user-friendly graphical viewer that displays a motif pattern diagram linked to the resulting bootstrapped dendrogram for each protein group. Amino-acid-sequence-based and nucleotide-sequence-based phylogenetic trees for motif combination alignment, a logo comparison diagram for each clade in the tree, and a Pfam-domain pattern diagram are also available. We also developed a viewer named ‘SALAD on ARRAYs’ to view arbitrary microarray data sets of paralogous genes linked to the same dendrogram in a window. The SALAD database is a powerful tool for comparing protein sequences and can provide valuable hints for biological analysis. PMID:19854933
Post-genomics of microsporidia, with emphasis on a model of minimal eukaryotic proteome: a review.
Texier, Catherine; Brosson, Damien; El Alaoui, Hicham; Méténier, Guy; Vivarès, Christian P
2005-05-01
The genome sequence of the microsporidian parasite Encephalitozoon cuniculi Levaditi, Nicolau et Schoen, 1923 contains about 2,000 genes that are representative of a non-redundant potential proteome composed of 1,909 protein chains. The purpose of this review is to relate some advances in the characterisation of this proteome through bioinformatics and experimental approaches. The reduced diversity of the set of E. cuniculi proteins is perceptible in all the compilations of predicted domains, orthologs, families and superfamilies, available in several public databases. The phyletic patterns of orthologs for seven eukaryotic organisms support an extensive gene loss in the fungal clade, with additional deletions in E. cuniculi. Most microsporidial orthologs are the smallest ones among eukaryotes, justifying an interest in the use of these compacted proteins to better discriminate between essential and non-essential regions. The three components of the E. cuniculi mRNA capping apparatus have been especially well characterized and the three-dimensional structure of the cap methyltransferase has been elucidated following the crystallisation of the microsporidial enzyme Ecm1. So far, our mass spectrometry-based analyses of the E. cuniculi spore proteome have led to the identification of about 170 proteins, one-quarter of these having no clearly predicted function. Immunocytochemical studies are in progress to determine the subcellular localisation of microsporidia-specific proteins. Post-translational modifications such as phosphorylation and glycosylation are expected to be explored soon.
Database for Parkinson Disease Mutations and Rare Variants
2016-09-01
AWARD NUMBER: W81XWH-14-1-0097 TITLE: “ Database for Parkinson Disease Mutations and Rare Variants” PRINCIPAL INVESTIGATOR: JEFFERY M. VANCE...TO THE ABOVE ADDRESS. 1. REPORT DATE September 2016 2. REPORT TYPE FINAL 3. DATES COVERED 1 Jul 2014 – 30 Jun 2016 4. TITLE AND SUBTITLE Database ...For Parkinson Disease (PD) specifically, the variant databases currently available are incomplete, don’t assess impact and/or are not equipped to
Integrated inference and evaluation of host–fungi interaction networks
Remmele, Christian W.; Luther, Christian H.; Balkenhol, Johannes; Dandekar, Thomas; Müller, Tobias; Dittrich, Marcus T.
2015-01-01
Fungal microorganisms frequently lead to life-threatening infections. Within this group of pathogens, the commensal Candida albicans and the filamentous fungus Aspergillus fumigatus are by far the most important causes of invasive mycoses in Europe. A key capability for host invasion and immune response evasion are specific molecular interactions between the fungal pathogen and its human host. Experimentally validated knowledge about these crucial interactions is rare in literature and even specialized host–pathogen databases mainly focus on bacterial and viral interactions whereas information on fungi is still sparse. To establish large-scale host–fungi interaction networks on a systems biology scale, we develop an extended inference approach based on protein orthology and data on gene functions. Using human and yeast intraspecies networks as template, we derive a large network of pathogen–host interactions (PHI). Rigorous filtering and refinement steps based on cellular localization and pathogenicity information of predicted interactors yield a primary scaffold of fungi–human and fungi–mouse interaction networks. Specific enrichment of known pathogenicity-relevant genes indicates the biological relevance of the predicted PHI. A detailed inspection of functionally relevant subnetworks reveals novel host–fungal interaction candidates such as the Candida virulence factor PLB1 and the anti-fungal host protein APP. Our results demonstrate the applicability of interolog-based prediction methods for host–fungi interactions and underline the importance of filtering and refinement steps to attain biologically more relevant interactions. This integrated network framework can serve as a basis for future analyses of high-throughput host–fungi transcriptome and proteome data. PMID:26300851
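The core idea of interolog-based inference is that an interaction observed between two proteins in a template network is transferred to their orthologs in the host and the pathogen. The sketch below illustrates only that transfer step under simplifying assumptions; the function name, the dictionary layout and the identifiers are hypothetical, and the filtering and refinement steps described in the abstract are not reproduced.

```python
def infer_interologs(template_edges, host_orthologs, pathogen_orthologs):
    """Transfer template interactions to predicted host-pathogen pairs.

    template_edges: iterable of (a, b) interactions in a template network.
    host_orthologs / pathogen_orthologs: dict mapping a template protein
    to a list of its orthologs in the host / pathogen.
    Returns a set of (host_protein, pathogen_protein) pairs.
    """
    predicted = set()
    for a, b in template_edges:
        # Template interactions are undirected, so try both orientations.
        for x, y in ((a, b), (b, a)):
            for host_protein in host_orthologs.get(x, ()):
                for pathogen_protein in pathogen_orthologs.get(y, ()):
                    predicted.add((host_protein, pathogen_protein))
    return predicted


# Hypothetical toy identifiers, for illustration only.
template = [("T1", "T2"), ("T3", "T4")]
host = {"T1": ["H1"], "T4": ["H2"]}
pathogen = {"T2": ["P1", "P2"], "T3": ["P3"]}
pairs = infer_interologs(template, host, pathogen)
```

In practice the raw output of such a transfer is large and noisy, which is why the abstract stresses subsequent filtering by localization and pathogenicity information.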
Jones, K.; Kim, K.; Patel, B.; Kelsen, S.; Braverman, A.; Swinton, D.; Gafken, P.; Jones, L.; Lane, W.; Neveu, J.; Leung, H.; Shaffer, S.; Leszyk, J.; Stanley, B.; Fox, T.; Stanley, A.; Yeung, Anthony
2013-01-01
Proteomic research can benefit from simultaneous access to multiple cutting-edge mass spectrometers. 18 core facilities responded to our investigators seeking service through the ABRF Discussion Forum. Five of the facilities selected completed four plasma proteomics experiments as routine fee-for-service. Each biological experiment entailed an iTRAQ 4-plex proteome comparison of immunodepleted plasma provided as 30 labeled-peptide fractions. Identical samples were analyzed by two AB SCIEX TripleTOF 5600 and three Thermo Orbitrap (Elite/Velos Pro/Q Exactive) instruments. 480 LC-MS/MS runs delivered >250 GB of data over two months. We compare herein routine service analyses of three peptide fractions of different peptide abundance. Data files from each instrument were studied to develop optimal analysis parameters to compare with default parameters in Mascot Distiller 2.4, ProteinPilot 4.5 beta, AB Sciex MS Data Converter 1.3 beta, and Proteome Discoverer 1.3. Peak-picking for TripleTOFs was best with ProteinPilot 4.5 beta, while Mascot Distiller and Proteome Discoverer were comparable for the Orbitraps. We compared protein identification and quantitation in the SwissProt 2012_07 database by Mascot Server 2.4.01 versus ProteinPilot. By all search methods, more proteins, up to twofold, were identified using the Q Exactive than with the other instruments. The Q Exactive also excelled in the number of unique significant peptide ion sequences. However, software-dependent impact on subsequent interpretation, due to peptide modifications, can be critical. These findings may have special implications for iTRAQ plasma proteomics. For the low abundance peptide ions, the slope of the dynamic range drop-off in the plasma proteome is uniquely sharp compared with cell lysates. Our study provides data for testable improvements in the operation of these mass spectrometers.
More importantly, we have demonstrated a new affordable expedient workflow for investigators to perform proteomic experiments through the ABRF infrastructure. (We acknowledge John Cottrell for optimizing the peak-picking parameters for Mascot Distiller).
Symbolic and Interactional Perspectives on Leadership: An Integrative Framework.
1985-05-01
RD-RI55 24? SYMBOLIC AND INTERACTIONAL PERSPECTIVES ON LEADERSHIP: AN INTEGRATIVE FRA..(U) TEXAS A AND M UNIV COLLEGE STATION DEPT OF MANAGEMENT...Processing Systems Office of Naval Research Technical Report Series Symbolic and Interactional Perspectives on Leadership: An Integrative Framework...Richard Daft and Ricky Griffin, Principal Investigators...Symbolic and Interactional Perspectives
SPIKE – a database, visualization and analysis tool of cellular signaling pathways
Elkon, Ran; Vesterman, Rita; Amit, Nira; Ulitsky, Igor; Zohar, Idan; Weisz, Mali; Mass, Gilad; Orlev, Nir; Sternberg, Giora; Blekhman, Ran; Assa, Jackie; Shiloh, Yosef; Shamir, Ron
2008-01-01
Background Biological signaling pathways that govern cellular physiology form an intricate web of tightly regulated interlocking processes. Data on these regulatory networks are accumulating at an unprecedented pace. The assimilation, visualization and interpretation of these data have become a major challenge in biological research, and once met, will greatly boost our ability to understand cell functioning on a systems level. Results To cope with this challenge, we are developing the SPIKE knowledge-base of signaling pathways. SPIKE contains three main software components: 1) A database (DB) of biological signaling pathways. Carefully curated information from the literature and data from large public sources constitute distinct tiers of the DB. 2) A visualization package that allows interactive graphic representations of regulatory interactions stored in the DB and superposition of functional genomic and proteomic data on the maps. 3) An algorithmic inference engine that analyzes the networks for novel functional interplays between network components. SPIKE is designed and implemented as a community tool and therefore provides a user-friendly interface that allows registered users to upload data to SPIKE DB. Our vision is that the DB will be populated by a distributed and highly collaborative effort undertaken by multiple groups in the research community, where each group contributes data in its field of expertise. Conclusion The integrated capabilities of SPIKE make it a powerful platform for the analysis of signaling networks and the integration of knowledge on such networks with omics data. PMID:18289391
Characterization of the Proteome of Theobroma cacao Beans by Nano-UHPLC-ESI MS/MS.
Scollo, Emanuele; Neville, David; Oruna-Concha, M Jose; Trotin, Martine; Cramer, Rainer
2018-02-01
Cocoa seed storage proteins play an important role in flavour development as aroma precursors are formed from their degradation during fermentation. Major proteins in the beans of Theobroma cacao are the storage proteins belonging to the vicilin and albumin classes. Although both these classes of proteins have been extensively characterized, there is still limited information on the expression and abundance of other proteins present in cocoa beans. This work is the first attempt to characterize the whole cocoa bean proteome by nano-UHPLC-ESI MS/MS analysis using tryptic digests of cocoa bean protein extracts. The results of this analysis show that >1000 proteins could be identified using a species-specific Theobroma cacao database. The majority of the identified proteins were involved with metabolism and energy. Additionally, a significant number of the identified proteins were linked to protein synthesis and processing. Several proteins were also involved with plant response to stress conditions and defence. Albumin and vicilin storage proteins showed the highest intensity values among all detected proteins, although only seven entries were identified as storage proteins. A comparison of MS/MS data searches carried out against larger non-specific databases confirmed that using a species-specific database can increase the number of identified proteins, and at the same time reduce the number of false positives. The results of this work will be useful in developing tools that can allow the comparison of the proteomic profile of cocoa beans from different genotypes and geographic origins. Data are available via ProteomeXchange with identifier PXD005586. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Comprehensive prediction of drug-protein interactions and side effects for the human proteome
Zhou, Hongyi; Gao, Mu; Skolnick, Jeffrey
2015-01-01
Identifying unexpected drug-protein interactions is crucial for drug repurposing. We develop a comprehensive proteome scale approach that predicts human protein targets and side effects of drugs. For drug-protein interaction prediction, FINDSITEcomb, whose average precision is ~30% and recall ~27%, is employed. For side effect prediction, a new method is developed with a precision of ~57% and a recall of ~24%. Our predictions show that drugs are quite promiscuous, with the average (median) number of human targets per drug of 329 (38), while a given protein interacts with 57 drugs. The result implies that drug side effects are inevitable and existing drugs may be useful for repurposing, with only ~1,000 human proteins likely causing serious side effects. A killing index derived from serious side effects has a strong correlation with FDA approved drugs being withdrawn. Therefore, it provides a pre-filter for new drug development. The methodology is free to the academic community on the DR. PRODIS (DRugome, PROteome, and DISeasome) webserver at http://cssb.biology.gatech.edu/dr.prodis/. DR. PRODIS provides protein targets of drugs, drugs for a given protein target, associated diseases and side effects of drugs, as well as an interface for the virtual target screening of new compounds. PMID:26057345
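The precision and recall figures quoted in this abstract follow the usual definitions: precision is the fraction of predicted targets that are correct, and recall is the fraction of known targets that were predicted. A minimal sketch with made-up identifiers follows; the function name and the data are illustrative assumptions and are unrelated to the DR. PRODIS implementation.

```python
def precision_recall(predicted, known):
    """Return (precision, recall) for a set of predictions against known truths."""
    predicted, known = set(predicted), set(known)
    true_positives = len(predicted & known)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(known) if known else 0.0
    return precision, recall


# Hypothetical example: four predicted targets, two known targets, one shared.
predicted_targets = {"P1", "P2", "P3", "P4"}
known_targets = {"P1", "P5"}
precision, recall = precision_recall(predicted_targets, known_targets)
```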
Effect of Genetic Database Comprehensiveness on Fractional Proteomics of Escherichia coli O157:H7
2014-01-01
proteins would be observed in the extracellular fraction. 15. SUBJECT TERMS Escherichia coli O157:H7 Liquid chromatography Mass spectrometry...Preparation... 2.2 Liquid Chromatography/Mass Spectrometry Sample Preparation... 2.3 Liquid Chromatography/Mass... Chromatography/Mass Spectrometry Sample Preparation. Samples were prepared for liquid chromatography tandem mass spectrometry (LC-MS/MS) in a similar
Mazandu, Gaston K; Mulder, Nicola J
2012-07-01
Despite ever-increasing amounts of sequence and functional genomics data, there is still a deficiency of functional annotation for many newly sequenced proteins. For Mycobacterium tuberculosis (MTB), more than half of its genome is still uncharacterized, which hampers the search for new drug targets within the bacterial pathogen and limits our understanding of its pathogenicity. As for many other genomes, the annotations of proteins in the MTB proteome were generally inferred from sequence homology, which is effective but has limited applicability. We have carried out large-scale biological data integration to produce an MTB protein functional interaction network. Protein functional relationships were extracted from the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database, and additional functional interactions from microarray, sequence and protein signature data. The confidence level of protein relationships in the additional functional interaction data was evaluated using a dynamic data-driven scoring system. This functional network has been used to predict functions of uncharacterized proteins using Gene Ontology (GO) terms, and the semantic similarity between these terms measured using a state-of-the-art GO similarity metric. To achieve a better trade-off between improvement of quality, genomic coverage and scalability, this prediction is done by observing the key principles driving the biological organization of the functional network. This study yields a new functionally characterized MTB strain CDC1551 proteome, consisting of 3804 and 3698 proteins out of 4195 with annotations in terms of the biological process and molecular function ontologies, respectively. These data can contribute to research into the development of effective anti-tubercular drugs with novel biological mechanisms of action. Copyright © 2011 Elsevier B.V. All rights reserved.
Tsai, Meng-Tsz; Chen, Yu-Jen; Chen, Ching-Yi; Tsai, Mong-Hsun; Han, Chia-Li; Chen, Yu-Ju; Mersmann, Harry J; Ding, Shih-Torng
2017-03-01
Background: Prevalent worldwide obesity is associated with increased incidence of nonalcoholic fatty liver disease (NAFLD) and metabolic syndrome. The identification of noninvasive biomarkers for NAFLD is of recent interest. Because primary de novo lipogenesis occurs in chicken liver as in human liver, adult chickens with age-associated steatosis resembling human NAFLD are an appealing animal model. Objective: The objective of this study was to screen potential biomarkers in the chicken model for NAFLD by transcriptomic and proteomic analysis. Methods: Hy-Line W-36 laying hens were fed standard feed from 25 to 45 wk of age to induce fatty liver. They were killed every 4 wk, and liver and plasma were collected at each time point to assess fatty liver development and for transcriptomic and proteomic analysis. Next, selected biomarkers were confirmed in additional experiments by providing supplements of the hepatoprotective nutrients betaine [300, 600, or 900 parts per million (ppm) in vivo; 2 mM in vitro] or docosahexaenoic acid (DHA; 1% in vivo; 100 μM in vitro) to 30-wk-old Hy-Line W-36 laying hens for 4 mo and to Hy-Line W-36 chicken primary hepatocytes with oleic acid-induced steatosis. Liver or hepatocyte lipid contents and the expression of biomarkers were then examined. Results: Plasma acetoacetyl-CoA synthetase (AACS), dipeptidyl-peptidase 4 (DPP4), glutamine synthetase (GLUL), and glutathione S-transferase (GST) concentrations are well-established biomarkers for NAFLD. Selected biomarkers had significant positive associations with hepatic lipid deposition (P < 0.001). Betaine (900 ppm in vivo; 2 mM in vitro) and DHA (1% in vivo; 100 μM in vitro) supplementation both resulted in lower steatosis accompanied by the reduced expression of selected biomarkers in vivo and in vitro (P < 0.05).
Conclusion: This study used adult laying hens to identify biomarkers for NAFLD and indicated that AACS, DPP4, GLUL, and GST could be considered to be potential diagnostic indicators for NAFLD in the future. © 2017 American Society for Nutrition.
Delcourt, Vivian; Franck, Julien; Leblanc, Eric; Narducci, Fabrice; Robin, Yves-Marie; Gimeno, Jean-Pascal; Quanico, Jusal; Wisztorski, Maxence; Kobeissy, Firas; Jacques, Jean-François; Roucou, Xavier; Salzet, Michel; Fournier, Isabelle
2017-07-01
Recently, it was demonstrated that proteins can be translated from alternative open reading frames (altORFs), increasing the size of the actual proteome. Top-down mass spectrometry-based proteomics allows the identification of intact proteins containing post-translational modifications (PTMs) as well as truncated forms translated from reference ORFs or altORFs. Top-down tissue microproteomics was applied on benign, tumor and necrotic-fibrotic regions of serous ovarian cancer biopsies, identifying proteins exhibiting region-specific cellular localization and PTMs. The regions of interest (ROIs) were determined by MALDI mass spectrometry imaging and spatial segmentation. Analysis with a customized protein sequence database containing reference and alternative proteins (altprots) identified 15 altprots, including alternative G protein nucleolar 1 (AltGNL1) found in the tumor, and translated from an altORF nested within the GNL1 canonical coding sequence. Co-expression of GNL1 and altGNL1 was validated by transfection in HEK293 and HeLa cells with an expression plasmid containing a GNL1-FLAG (V5) construct. Western blot and immunofluorescence experiments confirmed constitutive co-expression of altGNL1-V5 with GNL1-FLAG. Taken together, our approach provides means to evaluate protein changes in the case of serous ovarian cancer, allowing the detection of potential markers that have never been considered. Copyright © 2017 The Author(s). Published by Elsevier B.V. All rights reserved.
Brunet, Marie A; Levesque, Sébastien A; Hunting, Darel J; Cohen, Alan A; Roucou, Xavier
2018-05-01
Technological advances promise unprecedented opportunities for whole exome sequencing and proteomic analyses of populations. Currently, data from genome and exome sequencing or proteomic studies are searched against reference genome annotations. This provides the foundation for research and clinical screening for genetic causes of pathologies. However, current genome annotations substantially underestimate the proteomic information encoded within a gene. Numerous studies have now demonstrated the expression and function of alternative (mainly small, sometimes overlapping) ORFs within mature gene transcripts. This has important consequences for the correlation of phenotypes and genotypes. Most alternative ORFs are not yet annotated because of a lack of evidence, and this absence from databases precludes their detection by standard proteomic methods, such as mass spectrometry. Here, we demonstrate how current approaches tend to overlook alternative ORFs, hindering the discovery of new genetic drivers and fundamental research. We discuss available tools and techniques to improve identification of proteins from alternative ORFs and finally suggest a novel annotation system to permit a more complete representation of the transcriptomic and proteomic information contained within a gene. Given the crucial challenge of distinguishing functional ORFs from random ones, the suggested pipeline emphasizes both experimental data and conservation signatures. The addition of alternative ORFs in databases will render identification less serendipitous and advance the pace of research and genomic knowledge. This review highlights the urgent medical and research need to incorporate alternative ORFs in current genome annotations and thus permit their inclusion in hypotheses and models, which relate phenotypes and genotypes. © 2018 Brunet et al.; Published by Cold Spring Harbor Laboratory Press.
Wen, Bo; Xu, Shaohang; Sheynkman, Gloria M; Feng, Qiang; Lin, Liang; Wang, Quanhui; Xu, Xun; Wang, Jun; Liu, Siqi
2014-11-01
Single nucleotide variations (SNVs) located within a reading frame can result in single amino acid polymorphisms (SAPs), leading to alteration of the corresponding amino acid sequence as well as the function of a protein. Accurate detection of SAPs is an important issue in proteomic analysis at the experimental and bioinformatic level. Herein, we present sapFinder, an R software package for detection of variant peptides from tandem mass spectrometry (MS/MS)-based proteomics data. This package automates the construction of variation-associated databases from public SNV repositories or sample-specific next-generation sequencing (NGS) data and the identification of SAPs through database searching, post-processing and generation of an HTML-based report with a visualized interface. sapFinder is implemented as a Bioconductor package in R. The package and the vignette can be downloaded at http://bioconductor.org/packages/devel/bioc/html/sapFinder.html and are provided under a GPL-2 license. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
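The idea behind a variation-associated database can be illustrated in a few lines: apply the amino acid substitution to the reference sequence, digest both versions in silico, and keep the peptides that exist only in the variant. This is a conceptual Python sketch and not sapFinder's R interface; the function names, the toy sequence and the simplified trypsin rule (cleave after K or R unless followed by P) are assumptions made for illustration.

```python
import re


def apply_sap(sequence, position, ref, alt):
    """Substitute `ref` with `alt` at a 1-based position in a protein sequence."""
    if sequence[position - 1] != ref:
        raise ValueError("reference residue does not match the sequence")
    return sequence[:position - 1] + alt + sequence[position:]


def tryptic_peptides(sequence):
    """Fully cleaved tryptic peptides: cut after K or R, but not before P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def variant_peptides(sequence, position, ref, alt):
    """Peptides produced by the variant protein but not by the reference."""
    reference = set(tryptic_peptides(sequence))
    variant = tryptic_peptides(apply_sap(sequence, position, ref, alt))
    return [p for p in variant if p not in reference]


# Hypothetical toy protein with an A-to-V substitution at position 4.
new_peptides = variant_peptides("MKTAYIAKQR", 4, "A", "V")
```

Only the variant-specific peptides need to be appended to the search database, which keeps the added search space small.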
Andromeda: a peptide search engine integrated into the MaxQuant environment.
Cox, Jürgen; Neuhauser, Nadin; Michalski, Annette; Scheltema, Richard A; Olsen, Jesper V; Mann, Matthias
2011-04-01
A key step in mass spectrometry (MS)-based proteomics is the identification of peptides in sequence databases by their fragmentation spectra. Here we describe Andromeda, a novel peptide search engine using a probabilistic scoring model. On proteome data, Andromeda performs as well as Mascot, a widely used commercial search engine, as judged by sensitivity and specificity analysis based on target decoy searches. Furthermore, it can handle data with arbitrarily high fragment mass accuracy, is able to assign and score complex patterns of post-translational modifications, such as highly phosphorylated peptides, and accommodates extremely large databases. The algorithms of Andromeda are provided. Andromeda can function independently or as an integrated search engine of the widely used MaxQuant computational proteomics platform and both are freely available at www.maxquant.org. The combination enables analysis of large data sets in a simple analysis workflow on a desktop computer. For searching individual spectra Andromeda is also accessible via a web server. We demonstrate the flexibility of the system by implementing the capability to identify cofragmented peptides, significantly improving the total number of identified peptides.
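The target-decoy searches mentioned in this abstract estimate the error rate by searching real (target) and reversed or shuffled (decoy) sequences together and counting how many decoys pass a score cutoff. The sketch below uses the simple decoys-over-targets estimator, which is one common convention and is not confirmed by the abstract as Andromeda's exact procedure; the function name and the scores are hypothetical.

```python
def fdr_at_threshold(psms, threshold):
    """Estimate the false discovery rate among matches scoring >= threshold.

    psms: list of (score, is_decoy) tuples for peptide-spectrum matches.
    Uses the simple estimator (#decoys / #targets) above the threshold.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


# Hypothetical scored matches: (score, is_decoy).
matches = [(90, False), (80, False), (70, True), (60, False), (50, True)]
strict_fdr = fdr_at_threshold(matches, 80)
loose_fdr = fdr_at_threshold(matches, 60)
```

Lowering the threshold admits more target matches but also more decoys, so the estimated error rate rises; comparing search engines at a fixed estimated rate is what makes sensitivity comparisons meaningful.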
2017-01-01
Mass-spectrometry-based, high-throughput proteomics experiments produce large amounts of data. While typically acquired to answer specific biological questions, these data can also be reused in orthogonal ways to reveal new biological knowledge. We here present a novel method for such orthogonal data reuse of public proteomics data. Our method elucidates biological relationships between proteins based on the co-occurrence of these proteins across human experiments in the PRIDE database. The majority of the significantly co-occurring protein pairs that were detected by our method have been successfully mapped to existing biological knowledge. The validity of our novel method is substantiated by the extremely few pairs that can be mapped to existing knowledge based on random associations between the same set of proteins. Moreover, using literature searches and the STRING database, we were able to derive meaningful biological associations for unannotated protein pairs that were detected using our method, further illustrating that as-yet unknown associations present highly interesting targets for follow-up analysis. PMID:28480704
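A co-occurrence analysis of this kind starts from counting how often two proteins are identified in the same experiment. The abstract does not state which statistic the authors used, so the sketch below assumes the Jaccard index as a stand-in; the function name and the toy experiments are hypothetical.

```python
from itertools import combinations


def cooccurrence_jaccard(experiments):
    """Jaccard index for every protein pair seen together at least once.

    experiments: iterable of protein identifier lists, one per experiment.
    Returns {(protein_a, protein_b): jaccard}, with each pair sorted.
    """
    protein_counts = {}
    pair_counts = {}
    for proteins in experiments:
        unique = sorted(set(proteins))
        for protein in unique:
            protein_counts[protein] = protein_counts.get(protein, 0) + 1
        for pair in combinations(unique, 2):
            pair_counts[pair] = pair_counts.get(pair, 0) + 1
    # Jaccard = shared experiments / experiments containing either protein.
    return {
        (a, b): shared / (protein_counts[a] + protein_counts[b] - shared)
        for (a, b), shared in pair_counts.items()
    }


# Hypothetical toy data: three experiments and three proteins.
scores = cooccurrence_jaccard([["A", "B"], ["A", "B", "C"], ["A"]])
```

A real analysis would also need a significance test against random association, as the abstract's comparison with random pairings indicates.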
Bennuru, Sasisekhar; Cotton, James A.; Ribeiro, Jose M. C.; Grote, Alexandra; Harsha, Bhavana; Holroyd, Nancy; Mhashilkar, Amruta; Molina, Douglas M.; Randall, Arlo Z.; Shandling, Adam D.; Unnasch, Thomas R.; Ghedin, Elodie; Berriman, Matthew
2016-01-01
Onchocerciasis (river blindness) is a neglected tropical disease that has been successfully targeted by mass drug treatment programs in the Americas and small parts of Africa. Achieving the long-term goal of elimination of onchocerciasis, however, requires additional tools, including drugs, vaccines, and biomarkers of infection. Here, we describe the transcriptome and proteome profiles of the major vector and the human host stages (L1, L2, L3, molting L3, L4, adult male, and adult female) of Onchocerca volvulus along with the proteome of each parasitic stage and of its Wolbachia endosymbiont (wOv). In so doing, we have identified stage-specific pathways important to the parasite’s adaptation to its human host during its early development. Further, we generated a protein array that, when screened with well-characterized human samples, identified novel diagnostic biomarkers of O. volvulus infection and new potential vaccine candidates. This immunomic approach not only demonstrates the power of this postgenomic discovery platform but also provides additional tools for onchocerciasis control programs. PMID:27881553
Carpp, Lindsay N.; Rogers, Richard S.; Moritz, Robert L.; Aitchison, John D.
2014-01-01
Dengue virus is considered to be the most important mosquito-borne virus worldwide and poses formidable economic and health care burdens on many tropical and subtropical countries. Dengue infection induces drastic rearrangement of host endoplasmic reticulum membranes into complex membranous structures housing replication complexes; the contribution(s) of host proteins and pathways to this process is poorly understood but is likely to be mediated by protein-protein interactions. We have developed an approach for obtaining high confidence protein-protein interaction data by employing affinity tags and quantitative proteomics, in the context of viral infection, followed by robust statistical analysis. Using this approach, we identified high confidence interactors of NS5, the viral polymerase, and NS3, the helicase/protease. Quantitative proteomics allowed us to exclude a large number of presumably nonspecific interactors from our data sets and imparted a high level of confidence to our resulting data sets. We identified 53 host proteins reproducibly associated with NS5 and 41 with NS3, with 13 of these candidates present in both data sets. The host factors identified have diverse functions, including retrograde Golgi-to-endoplasmic reticulum transport, biosynthesis of long-chain fatty-acyl-coenzyme As, and the unfolded protein response. We selected GBF1, a guanine nucleotide exchange factor responsible for ARF activation, from the NS5 data set for follow-up and functional validation. We show that GBF1 plays a critical role early in dengue infection that is independent of its role in the maintenance of Golgi structure. Importantly, the approach described here can be applied to virtually any organism/system as a tool for better understanding its molecular interactions. PMID:24855065
ArrayNinja: An Open Source Platform for Unified Planning and Analysis of Microarray Experiments.
Dickson, B M; Cornett, E M; Ramjan, Z; Rothbart, S B
2016-01-01
Microarray-based proteomic platforms have emerged as valuable tools for studying various aspects of protein function, particularly in the field of chromatin biochemistry. Microarray technology itself is largely unrestricted in regard to printable material and platform design, and efficient multidimensional optimization of assay parameters requires fluidity in the design and analysis of custom print layouts. This motivates the need for streamlined software infrastructure that facilitates the combined planning and analysis of custom microarray experiments. To this end, we have developed ArrayNinja as a portable, open source, and interactive application that unifies the planning and visualization of microarray experiments and provides maximum flexibility to end users. Array experiments can be planned, stored to a private database, and merged with the imaged results for a level of data interaction and centralization that is not currently attainable with available microarray informatics tools. © 2016 Elsevier Inc. All rights reserved.
Role of Proteomics in the Development of Personalized Medicine.
Jain, Kewal K
2016-01-01
Advances in proteomic technologies have made important contributions to the development of personalized medicine by facilitating detection of protein biomarkers, proteomics-based molecular diagnostics, as well as protein biochips and pharmacoproteomics. Application of nanobiotechnology in proteomics, nanoproteomics, has further enhanced applications in personalized medicine. Proteomics-based molecular diagnostics will have an important role in the diagnosis of certain conditions and in understanding the pathomechanism of disease. Proteomics will be a good bridge between diagnostics and therapeutics; the integration of these will be important for advancing personalized medicine. Use of proteomic biomarkers and combination of pharmacoproteomics with pharmacogenomics will enable stratification of clinical trials and improve monitoring of patients for development of personalized therapies. Proteomics is an important component of several interacting technologies used for development of personalized medicine, which is depicted graphically. Finally, cancer is a good example of the application of proteomic technologies to personalized management. © 2016 Elsevier Inc. All rights reserved.
Impact of nanoscale topography on genomics and proteomics of adherent bacteria.
Rizzello, Loris; Sorce, Barbara; Sabella, Stefania; Vecchio, Giuseppe; Galeone, Antonio; Brunetti, Virgilio; Cingolani, Roberto; Pompa, Pier Paolo
2011-03-22
Bacterial adhesion onto inorganic/nanoengineered surfaces is a key issue in biotechnology and medicine, because it is one of the first necessary steps to determine a general pathogenic event. Understanding the molecular mechanisms of bacteria-surface interaction represents a milestone for planning a new generation of devices with unanimously certified antibacterial characteristics. Here, we show how highly controlled nanostructured substrates impact the bacterial behavior in terms of morphological, genomic, and proteomic response. We observed by atomic force microscopy (AFM) and scanning electron microscopy (SEM) that type-1 fimbriae typically disappear in Escherichia coli adherent onto nanostructured substrates, as opposed to bacteria onto reference glass or flat gold surfaces. A genetic variation of the fimbrial operon regulation was consistently identified by real time qPCR in bacteria interacting with the nanorough substrates. To gain a deeper insight into the molecular basis of the interaction mechanisms, we explored the entire proteomic profile of E. coli by 2D-DIGE, finding significant changes in the bacteria adherent onto the nanorough substrates, such as regulations of proteins involved in stress processes and defense mechanisms. We thus demonstrated that a pure physical stimulus, that is, a nanoscale variation of surface topography, may play per se a significant role in determining the morphological, genetic, and proteomic profile of bacteria. These data suggest that in depth investigations of the molecular processes of microorganisms adhering to surfaces are of great importance for the design of innovative biomaterials with active biological functionalities.
Deswal, Renu; Abat, Jasmeet Kaur; Sehrawat, Ankita; Gupta, Ravi; Kashyap, Prakriti; Sharma, Shruti; Sharma, Bhavana; Chaurasia, Satya Prakash; Chanu, Sougrakpam Yaiphabi; Masi, Antonio; Agrawal, Ganesh Kumar; Sarkar, Abhijit; Agrawal, Raj; Dunn, Michael J; Renaut, Jenny; Rakwal, Randeep
2014-07-01
The International Plant Proteomics Organization (INPPO) outlined ten initiatives to promote plant proteomics in each and every country. With greater emphasis on developing countries, one of those was to "organize workshops at national and international levels to train manpower and exchange information". This third INPPO highlights covers the workshop organized for the very first time in a developing country, India, at the Department of Botany of the University of Delhi on December 26-30, 2013, titled "1(st) Plant Proteomics Workshop / Training Program", under the umbrella of the INPPO India-Nepal chapter. Twenty selected participants received hands-on training, mainly in the gel-based proteomics approach, along with a manual booklet and parallel lectures on this and associated topics. In-house as well as invited experts drawn from other universities and institutes (national and international) delivered talks on different aspects of gel-based and gel-free proteomics. The importance of the gel-free proteomics approach, translational proteomics, and INPPO roles were presented and interactively discussed by a group of three invited speakers, Drs. Ganesh Kumar Agrawal (Nepal), Randeep Rakwal (Japan), and Antonio Masi (Italy). Given the output of this systematic workshop, it was proposed and thereafter decided that it be organized every alternate year; the next workshop will be held in 2015. Furthermore, possibilities for providing advanced training to students, researchers, and teachers with basic knowledge of proteomics theory and experiments at national and international levels were discussed. INPPO is committed to generating next-generation trained manpower in proteomics, and this will happen only through the firm determination of scientists to come forward and do it. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Games, Patrícia Dias; daSilva, Elói Quintas Gonçalves; Barbosa, Meire de Oliveira; Almeida-Souza, Hebréia Oliveira; Fontes, Patrícia Pereira; deMagalhães, Marcos Jorge; Pereira, Paulo Roberto Gomes; Prates, Maura Vianna; Franco, Gloria Regina; Faria-Campos, Alessandra; Campos, Sérgio Vale Aguiar; Baracat-Pereira, Maria Cristina
2016-12-15
Antimicrobial peptides from plants present mechanisms of action that are different from those of conventional defense agents. They are under-explored but have potential as commercial antimicrobials. Bell pepper leaves ('Magali R') are discarded after harvesting the fruit and are sources of bioactive peptides. This work reports the isolation by peptidomics tools, and the identification and partial characterization by computational tools, of an antimicrobial peptide from bell pepper leaves, and demonstrates the usefulness of records and in silico analysis for the study of plant peptides aimed at biotechnological uses. Aqueous extracts from leaves were enriched in peptides by salt fractionation and ultrafiltration. An antimicrobial peptide was isolated by tandem chromatographic procedures. Mass spectrometry, automated peptide sequencing and bioinformatics tools were used alternately for identification and partial characterization of the Hevein-like peptide, named HEV-CANN. The computational tools that assisted in the identification of the peptide included BlastP, PSI-Blast, ClustalOmega, PeptideCutter, and ProtParam; conventional protein databases (DB) such as Mascot, Protein-DB, GenBank-DB, RefSeq, Swiss-Prot, and UniProtKB; peptide-specific DB such as Amper, APD2, CAMP, LAMPs, and PhytAMP; other tools included in ExPASy for Proteomics; The Bioactive Peptide Databases; and The Pepper Genome Database. The HEV-CANN sequence presented 40 amino acid residues, 4258.8 Da, a theoretical pI value of 8.78, and four disulfide bonds. It was stable, and it inhibited the growth of phytopathogenic bacteria and a fungus. HEV-CANN presented a chitin-binding domain in its sequence. There was a high identity and a positive alignment of the HEV-CANN sequence in various databases, but there was not a complete identity, suggesting that HEV-CANN may be produced by ribosomal synthesis, which is in accordance with its constitutive nature.
Computational tools for proteomics and databases are not adjusted for short sequences, which hampered HEV-CANN identification. The adjustment of statistical tests in large databases for proteins is an alternative to promote the significant identification of peptides. The development of specific DB for plant antimicrobial peptides, with information about peptide sequences, functional genomic data, structural motifs and domains of molecules, functional domains, and peptide-biomolecule interactions, is valuable and necessary.
Plant subcellular proteomics: Application for exploring optimal cell function in soybean.
Wang, Xin; Komatsu, Setsuko
2016-06-30
Plants have evolved complicated responses to developmental changes and stressful environmental conditions. Subcellular proteomics has the potential to elucidate localized cellular responses and investigate communications among subcellular compartments during plant development and in response to biotic and abiotic stresses. Soybean, which is a valuable legume crop rich in protein and vegetable oil, can grow in several climatic zones; however, the growth and yield of soybean are markedly decreased under stresses. To date, numerous proteomic studies have been performed in soybean to examine the specific protein profiles of cell wall, plasma membrane, nucleus, mitochondrion, chloroplast, and endoplasmic reticulum. In this review, methods for the purification and purity assessment of subcellular organelles from soybean are summarized. In addition, the findings from subcellular proteomic analyses of soybean during development and under stresses, particularly flooding stress, are presented and the proteins regulated among subcellular compartments are discussed. Continued advances in subcellular proteomics are expected to greatly contribute to the understanding of the responses and interactions that occur within and among subcellular compartments during development and under stressful environmental conditions. Subcellular proteomics has the potential to investigate the cellular events and interactions among subcellular compartments in response to development and stresses in plants. Soybean can grow in several climatic zones; however, the growth and yield of soybean are markedly decreased under stresses. Numerous proteomic studies of the cell wall, plasma membrane, nucleus, mitochondrion, chloroplast, and endoplasmic reticulum have been carried out to investigate the respective proteins and their functions in soybean during development or under stresses. In this review, methods of subcellular-organelle enrichment and purity assessment are summarized.
In addition, previous findings of subcellular proteomics are presented, and functional proteins regulated among different subcellular compartments are discussed. Subcellular proteomics contributes greatly to uncovering responses and interactions among subcellular compartments during development and under stressful environmental conditions in soybean. Copyright © 2016 Elsevier B.V. All rights reserved.
Computational prediction of protein-protein interactions in Leishmania predicted proteomes.
Rezende, Antonio M; Folador, Edson L; Resende, Daniela de M; Ruiz, Jeronimo C
2012-01-01
The trypanosomatid parasites Leishmania braziliensis, Leishmania major and Leishmania infantum are important human pathogens. Despite years of study and genome availability, an effective vaccine has not yet been developed, and the chemotherapy is highly toxic. Therefore, it is clear that only interdisciplinary, integrated studies will succeed in finding new targets for the development of vaccines and drugs. An essential part of this rationale is the study of protein-protein interaction (PPI) networks, which can provide a better understanding of complex protein interactions in biological systems. Thus, we modeled PPIs for trypanosomatids through computational methods, using sequence comparison against public databases of protein or domain interactions for interaction prediction (Interolog Mapping), and developed a dedicated combined system score to address the robustness of the predictions. The confidence of the network prediction approach was evaluated using gold standard positive and negative datasets, and the AUC value obtained was 0.94. As a result, 39,420, 43,531 and 45,235 interactions were predicted for L. braziliensis, L. major and L. infantum, respectively. For each predicted network, the top 20 proteins were ranked by the MCC topological index. In addition, information related to immunological potential, degree of protein sequence conservation among orthologs and degree of identity compared to proteins of potential parasite hosts was integrated. This information integration provides a better understanding and usefulness of the predicted networks, which can be valuable for selecting new potential biological targets for drug and vaccine development. Analysis of network modularity, which is key when one is interested in destabilizing the PPIs for drug or vaccine purposes, along with multiple alignments of the predicted PPIs, was performed, revealing patterns associated with protein turnover.
In addition, around 50% of the hypothetical proteins present in the networks received some degree of functional annotation, which represents an important contribution since approximately 60% of the Leishmania predicted proteomes have no predicted function.
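The interolog mapping idea used above (transferring a known interaction between two proteins to their orthologs in another organism) can be sketched as follows. This is a minimal illustration under stated assumptions: the ortholog table is taken as given, the identifiers are hypothetical, and the authors' combined scoring system is not reproduced.

```python
def map_interologs(known_interactions, orthologs):
    """Transfer reference interactions to a target proteome.

    known_interactions: iterable of (protein_a, protein_b) pairs observed
        in a reference organism.
    orthologs: dict mapping a reference protein to the set of its
        orthologs in the target organism (assumed precomputed, e.g. from
        sequence comparison).
    Returns a set of unordered predicted pairs (frozensets).
    """
    predicted = set()
    for ref_a, ref_b in known_interactions:
        for target_a in orthologs.get(ref_a, ()):
            for target_b in orthologs.get(ref_b, ()):
                predicted.add(frozenset((target_a, target_b)))
    return predicted


# Hypothetical identifiers for illustration only.
known = [("P1", "P2"), ("P2", "P3")]
orthologs = {"P1": {"LmA"}, "P2": {"LmB", "LmC"}}
predicted = map_interologs(known, orthologs)
```

Here the P2-P3 interaction yields no prediction because P3 has no ortholog in the table; in a real pipeline each predicted pair would also carry a confidence score.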
Proteomics and plant disease: advances in combating a major threat to the global food supply.
Rampitsch, Christof; Bykova, Natalia V
2012-02-01
The study of plant disease and immunity is benefiting tremendously from proteomics. Parallel streams of research from model systems, from pathogens in vitro and from the relevant pathogen-crop interactions themselves have begun to reveal a model of how plants succumb to invading pathogens and how they defend themselves without the benefit of a circulating immune system. In this review, we discuss the contribution of proteomics to these advances, drawing mainly on examples from crop-fungus interactions, from Arabidopsis-bacteria interactions, from elicitor-based model systems and from pathogen studies, to highlight also the important contribution of non-crop systems to advancing crop protection. Copyright © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Introducing the PRIDE Archive RESTful web services.
Reisinger, Florian; del-Toro, Noemi; Ternent, Tobias; Hermjakob, Henning; Vizcaíno, Juan Antonio
2015-07-01
The PRIDE (PRoteomics IDEntifications) database is one of the world-leading public repositories of mass spectrometry (MS)-based proteomics data and it is a founding member of the ProteomeXchange Consortium of proteomics resources. In the original PRIDE database system, users could access data programmatically by accessing the web services provided by the PRIDE BioMart interface. New REST (REpresentational State Transfer) web services have been developed to serve the most popular functionality provided by BioMart (now discontinued due to data scalability issues) and address the data access requirements of the newly developed PRIDE Archive. Using the API (Application Programming Interface) it is now possible to programmatically query for and retrieve peptide and protein identifications, project and assay metadata and the originally submitted files. Searching and filtering is also possible by metadata information, such as sample details (e.g. species and tissues), instrumentation (mass spectrometer), keywords and other provided annotations. The PRIDE Archive web services were first made available in April 2014. The API has already been adopted by a few applications and standalone tools such as PeptideShaker, PRIDE Inspector, the Unipept web application and the Python-based BioServices package. This application is free and open to all users with no login requirement and can be accessed at http://www.ebi.ac.uk/pride/ws/archive/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
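As a rough sketch of how a client might compose a metadata-filtered query against the base URL quoted above, the snippet below only builds the request URL and performs no network call. The resource path and the parameter names (`project/list`, `query`, `speciesFilter`, `tissueFilter`) are assumptions made for illustration and should be checked against the service documentation, which may have changed since 2015.

```python
from urllib.parse import urlencode

# Base URL as given in the abstract.
BASE_URL = "http://www.ebi.ac.uk/pride/ws/archive/"


def build_project_query(keyword=None, species=None, tissue=None,
                        path="project/list"):
    """Compose a project search URL.

    NOTE: `path` and the parameter names below are assumed, not confirmed
    by the abstract.
    """
    params = {}
    if keyword:
        params["query"] = keyword
    if species:
        params["speciesFilter"] = species
    if tissue:
        params["tissueFilter"] = tissue
    url = BASE_URL + path
    if params:
        url += "?" + urlencode(params)
    return url


url = build_project_query(keyword="phospho", species="9606")
```

The resulting string could then be passed to any HTTP client; the abstract states that no login is required.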
The Functional Human C-Terminome
Hedden, Michael; Lyon, Kenneth F.; Brooks, Steven B.; David, Roxanne P.; Limtong, Justin; Newsome, Jacklyn M.; Novakovic, Nemanja; Rajasekaran, Sanguthevar; Thapar, Vishal; Williams, Sean R.; Schiller, Martin R.
2016-01-01
All translated proteins end with a carboxylic acid commonly called the C-terminus. Many short functional sequences (minimotifs) are located on or immediately proximal to the C-terminus. However, information about the function of protein C-termini has not been consolidated into a single source. Here, we built a new “C-terminome” database and web system focused on human proteins. Approximately 3,600 C-termini in the human proteome have a minimotif with an established molecular function. To help evaluate the function of the remaining C-termini in the human proteome, we inferred minimotifs identified by experimentation in rodent cells, predicted minimotifs based upon consensus sequence matches, and predicted novel highly repetitive sequences in C-termini. Predictions can be ranked by enrichment scores or Gene Evolutionary Rate Profiling (GERP) scores, a measurement of evolutionary constraint. By searching for new anchored sequences on the last 10 amino acids of proteins in the human proteome with lengths between 3–10 residues and up to 5 degenerate positions in the consensus sequences, we have identified new consensus sequences that predict instances in the majority of human genes. All of this information is consolidated into a database that can be accessed through a C-terminome web system with search and browse functions for minimotifs and human proteins. A known consensus sequence-based predicted function is assigned to nearly half the proteins in the human proteome. Weblink: http://cterminome.bio-toolkit.com. PMID:27050421
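The search for consensus sequences anchored on the last residues of a protein can be illustrated with a small regular-expression sketch. The consensus notation (lowercase `x` for a degenerate position, bracketed alternatives), the example motif and the sequences are assumptions for illustration and are not taken from the C-terminome database.

```python
import re


def consensus_to_regex(consensus):
    """Turn a consensus such as 'x[ST]x[VIL]' into a C-terminally anchored regex.

    Lowercase 'x' marks a degenerate position (any residue); sequences are
    assumed to use uppercase one-letter codes.
    """
    pattern = consensus.replace("x", "[A-Z]")
    return re.compile(pattern + "$")


def find_cterminal_matches(proteins, consensus, window=10):
    """Return {protein name: matched C-terminal segment} for each hit."""
    regex = consensus_to_regex(consensus)
    hits = {}
    for name, sequence in proteins.items():
        tail = sequence[-window:]  # only the last `window` residues
        match = regex.search(tail)
        if match:
            hits[name] = match.group(0)
    return hits


# Made-up sequences for illustration only.
proteins = {"protA": "MKTAYIAKQRETSV", "protB": "MKTAYIAKQRGGGG"}
hits = find_cterminal_matches(proteins, "x[ST]x[VIL]")
```

Anchoring with `$` restricts matches to motifs that end exactly at the C-terminus, which is the case described in the abstract; motifs merely close to the terminus would need a looser pattern.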
Proteomic profiling of mature leaves from oil palm (Elaeis guineensis Jacq.).
Tan, Hooi Sin; Jacoby, Richard P; Ong-Abdullah, Meilina; Taylor, Nicolas L; Liddell, Susan; Chee, Wong Wei; Chin, Chiew Foan
2017-04-01
Oil palm is one of the most productive oil bearing crops grown in Southeast Asia. Due to the dwindling availability of agricultural land and increasing demand for high yielding oil palm seedlings, clonal propagation is vital to the oil palm industry. Most commonly, leaf explants are used for in vitro micropropagation of oil palm and to optimize this process it is important to unravel the physiological and molecular mechanisms underlying somatic embryo production from leaves. In this study, a proteomic approach was used to determine protein abundance of mature oil palm leaves. To do this, leaf proteins were extracted using a TCA/acetone precipitation protocol and separated by 2DE. A total of 191 protein spots were observed on the 2D gels and 67 of the most abundant protein spots that were consistently observed were selected for further analysis, with 35 successfully identified using MALDI TOF/TOF MS. The majority of proteins were classified as being involved in photosynthesis, metabolism, cellular biogenesis, stress response, and transport. This study provides the first proteomic assessment of oil palm leaves in this important oil crop and demonstrates the successful identification of selected protein spots using the Malaysian Palm Oil Board (MPOB) Elaeis guineensis EST and NCBI-protein databases. The MS data have been deposited in the ProteomeXchange Consortium database with the data set identifier PXD001307. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
ERIC Educational Resources Information Center
Albright, Jessica C.; Dassenko, David J.; Mohamed, Essa A.; Beussman, Douglas J.
2009-01-01
Matrix-assisted laser desorption/ionization (MALDI) mass spectrometry is an important bioanalytical technique in drug discovery, proteomics, and research at the biology-chemistry interface. This is an especially powerful tool when combined with gel separation of proteins and database mining using the mass spectral data. Currently, few hands-on…
Lavallée-Adam, Mathieu; Yates, John R
2016-03-24
PSEA-Quant analyzes quantitative mass spectrometry-based proteomics datasets to identify enrichments of annotations contained in repositories such as the Gene Ontology and Molecular Signature databases. It allows users to identify the annotations that are significantly enriched for reproducibly quantified high abundance proteins. PSEA-Quant is available on the Web and as a command-line tool. It is compatible with all label-free and isotopic labeling-based quantitative proteomics methods. This protocol describes how to use PSEA-Quant and interpret its output. The importance of each parameter as well as troubleshooting approaches are also discussed. © 2016 by John Wiley & Sons, Inc.
Pineda, M; Sajnani, C; Barón, M
2010-01-01
We have analyzed the chloroplast proteome of Nicotiana benthamiana using two-dimensional gel electrophoresis and mass spectrometry followed by a database search. In order to improve the resolution of the two-dimensional electrophoresis gels, we have made separate maps for the low and the high pH range. At least 200 spots were detected. We identified 72 polypeptides, some being isoforms of different multiprotein families. In addition, changes in this chloroplast proteome induced by the infection with the Spanish strain of the Pepper mild mottle virus were investigated. Viral infection induced the down-regulation of several chloroplastidic proteins involved in both the photosynthetic electron-transport chain and the Benson-Calvin cycle.
Hydra: a scalable proteomic search engine which utilizes the Hadoop distributed computing framework
2012-01-01
Background For shotgun mass spectrometry based proteomics the most computationally expensive step is in matching the spectra against an increasingly large database of sequences and their post-translational modifications with known masses. Each mass spectrometer can generate data at an astonishingly high rate, and the scope of what is searched for is continually increasing. Therefore solutions for improving our ability to perform these searches are needed. Results We present a sequence database search engine that is specifically designed to run efficiently on the Hadoop MapReduce distributed computing framework. The search engine implements the K-score algorithm, generating comparable output for the same input files as the original implementation. The scalability of the system is shown, and the architecture required for the development of such distributed processing is discussed. Conclusion The software is scalable in its ability to handle a large peptide database, numerous modifications and large numbers of spectra. Performance scales with the number of processors in the cluster, allowing throughput to expand with the available resources. PMID:23216909
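The map/reduce decomposition behind such a distributed search can be sketched in plain Python. This is not Hydra's code and it does not implement the K-score algorithm; it only illustrates, under assumed data layouts and a made-up bin width, how spectra and candidate peptides might be keyed by precursor-mass bin so that each reducer sees only the candidates relevant to its spectra.

```python
from collections import defaultdict

BIN_WIDTH = 1.0  # Da; illustrative value, not taken from the paper


def mass_bin(mass):
    """Key used to route spectra and peptides to the same reducer."""
    return int(mass // BIN_WIDTH)


def map_phase(spectra, peptides):
    """Emit (bin, record) pairs for spectra and for database peptides."""
    for spectrum_id, precursor_mass in spectra:
        yield mass_bin(precursor_mass), ("spectrum", spectrum_id)
    for peptide, mass in peptides:
        yield mass_bin(mass), ("peptide", peptide)


def shuffle(pairs):
    """Group values by key, as the framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Pair every spectrum with the candidate peptides in its mass bin.

    A real engine would score each pair here and would also consider
    neighbouring bins to honour the mass tolerance.
    """
    candidates = {}
    for values in groups.values():
        peptides = sorted(v for kind, v in values if kind == "peptide")
        for kind, value in values:
            if kind == "spectrum":
                candidates[value] = peptides
    return candidates


# Made-up identifiers and masses for illustration only.
spectra = [("s1", 1000.4), ("s2", 1500.2)]
peptides = [("pepA", 1000.7), ("pepB", 1500.9), ("pepC", 1200.1)]
candidates = reduce_phase(shuffle(map_phase(spectra, peptides)))
```

Because each bin is processed independently, adding reducers increases throughput, which matches the scaling behaviour the abstract describes.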
dbHiMo: a web-based epigenomics platform for histone-modifying enzymes.
Choi, Jaeyoung; Kim, Ki-Tae; Huh, Aram; Kwon, Seomun; Hong, Changyoung; Asiegbu, Fred O; Jeon, Junhyun; Lee, Yong-Hwan
2015-01-01
Over the past two decades, epigenetics has evolved into a key concept for understanding regulation of gene expression. Among many epigenetic mechanisms, covalent modifications such as acetylation and methylation of lysine residues on core histones emerged as a major mechanism in epigenetic regulation. Here, we present the database for histone-modifying enzymes (dbHiMo; http://hme.riceblast.snu.ac.kr/) aimed at facilitating functional and comparative analysis of histone-modifying enzymes (HMEs). HMEs were identified by applying a search pipeline built upon profile hidden Markov model (HMM) to proteomes. The database incorporates 11,576 HMEs identified from 603 proteomes including 483 fungal, 32 plants and 51 metazoan species. The dbHiMo provides users with web-based personalized data browsing and analysis tools, supporting comparative and evolutionary genomics. With comprehensive data entries and associated web-based tools, our database will be a valuable resource for future epigenetics/epigenomics studies. © The Author(s) 2015. Published by Oxford University Press.
Hydra: a scalable proteomic search engine which utilizes the Hadoop distributed computing framework.
Lewis, Steven; Csordas, Attila; Killcoyne, Sarah; Hermjakob, Henning; Hoopmann, Michael R; Moritz, Robert L; Deutsch, Eric W; Boyle, John
2012-12-05
For shotgun mass spectrometry based proteomics the most computationally expensive step is in matching the spectra against an increasingly large database of sequences and their post-translational modifications with known masses. Each mass spectrometer can generate data at an astonishingly high rate, and the scope of what is searched for is continually increasing. Therefore solutions for improving our ability to perform these searches are needed. We present a sequence database search engine that is specifically designed to run efficiently on the Hadoop MapReduce distributed computing framework. The search engine implements the K-score algorithm, generating comparable output for the same input files as the original implementation. The scalability of the system is shown, and the architecture required for the development of such distributed processing is discussed. The software is scalable in its ability to handle a large peptide database, numerous modifications and large numbers of spectra. Performance scales with the number of processors in the cluster, allowing throughput to expand with the available resources.
Wang, Chen; Zhou, Jiangrui; Wang, Shuowen; Ye, Mingliang; Jiang, Chunlei; Fan, Guorong; Zou, Hanfa
2010-06-04
This study investigated the mechanisms involved in the antinociceptive action induced by levo-tetrahydropalmatine (l-THP) in the formalin test by combined comparative and chemical proteomics. Rats were pretreated with l-THP by the oral route (40 mg/kg) 1 h before formalin injection. The antinociceptive effect of l-THP was shown in the first and second phases of the formalin test. To address the mechanisms by which l-THP inhibits formalin-induced nociception in rats, combined comparative and chemical proteomics were applied. A novel high-throughput comparative proteomic approach based on 2D-nano-LC-MS/MS was applied to simultaneously evaluate the deregulated proteins involved in the response to l-THP treatment in formalin-induced pain rats. Thousands of proteins were identified, among which 17 proteins survived the stringent filter criteria and were further included for functional discussion. Two proteins (Neurabin-1 and Calcium-dependent secretion activator 1) were randomly selected, and their expression levels were further confirmed by Western blots. The results matched well with those of proteomics. In the present study, we also described the development and application of l-THP immobilized beads to bind the targets. Following incubation with cellular lysates, the proteome interacting with the fixed l-THP was identified. The results of comparative and chemical proteomics were quite complementary. Although the precise roles of these identified molecules in l-THP-induced antinociception need further study, the combined results indicated that proteins associated with signal transduction, vesicular trafficking and neurotransmitter release, energy metabolism, and ion transport play important roles in l-THP-induced antinociception in the formalin test.
Bergerat, Agnes; Decano, Julius; Wu, Chang-Jiun; Choi, Hyungwon; Nesvizhskii, Alexey I; Moran, Ann Marie; Ruiz-Opazo, Nelson; Steffen, Martin; Herrera, Victoria LM
2011-01-01
Stroke is the third leading cause of death in the United States with high rates of morbidity among survivors. The search to fill the unequivocal need for new therapeutic approaches would benefit from unbiased proteomic analyses of animal models of spontaneous stroke in the prestroke stage. Since brain microvessels play key roles in neurovascular coupling, we investigated prestroke microvascular proteome changes. Proteomic analysis of cerebral cortical microvessels (cMVs) was done by tandem mass spectrometry comparing two prestroke time points. Metaprotein-pathway analyses of proteomic spectral count data were done to identify risk factor–induced changes, followed by QSPEC-analyses of individual protein changes associated with increased stroke susceptibility. We report 26 cMV proteome profiles from male and female stroke-prone and non–stroke-prone rats at 2 months and 4.5 months of age prior to overt stroke events. We identified 1,934 proteins by two or more peptides. Metaprotein pathway analysis detected age-associated changes in energy metabolism and cell-to-microenvironment interactions, as well as sex-specific changes in energy metabolism and endothelial leukocyte transmigration pathways. Stroke susceptibility was associated independently with multiple protein changes associated with ischemia, angiogenesis or involved in blood brain barrier (BBB) integrity. Immunohistochemical analysis confirmed aquaporin-4 and laminin-α1 induction in cMVs, representative of proteomic changes with >65 Bayes factor (BF), associated with stroke susceptibility. Altogether, proteomic analysis demonstrates significant molecular changes in ischemic cerebral microvasculature in the prestroke stage, which could contribute to the observed model phenotype of microhemorrhages and postischemic hemorrhagic transformation. These pathways comprise putative targets for translational research of much needed novel diagnostic and therapeutic approaches for stroke. PMID:21519634
The Knowledge-Integrated Network Biomarkers Discovery for Major Adverse Cardiac Events
Jin, Guangxu; Zhou, Xiaobo; Wang, Honghui; Zhao, Hong; Cui, Kemi; Zhang, Xiang-Sun; Chen, Luonan; Hazen, Stanley L.; Li, King; Wong, Stephen T. C.
2010-01-01
Mass spectrometry (MS) technology in clinical proteomics is very promising for the discovery of new biomarkers for disease management. To overcome the obstacle of noise in MS data, we proposed a new approach to knowledge-integrated biomarker discovery using data from Major Adverse Cardiac Events (MACE) patients. We first built a cardiovascular-related network based on protein annotations in UniProt, protein–protein interaction (PPI) data, and a signal transduction database. In contrast to previous machine learning methods for MS data processing, we then used statistical methods to discover biomarkers in the cardiovascular-related network. By trading off known protein information against noise in the mass spectrometry data, we could identify high-confidence biomarkers. Most importantly, aided by the protein–protein interaction network, that is, the cardiovascular-related network, we proposed a new type of biomarker, the network biomarker, composed of a set of proteins and the interactions among them. The candidate network biomarkers can classify the two groups of patients more accurately than current single biomarkers, which do not take biological molecular interactions into account. PMID:18665624
EuPathDB: the eukaryotic pathogen genomics database resource
Aurrecoechea, Cristina; Barreto, Ana; Basenko, Evelina Y.; Brestelli, John; Brunk, Brian P.; Cade, Shon; Crouch, Kathryn; Doherty, Ryan; Falke, Dave; Fischer, Steve; Gajria, Bindu; Harb, Omar S.; Heiges, Mark; Hertz-Fowler, Christiane; Hu, Sufen; Iodice, John; Kissinger, Jessica C.; Lawrence, Cris; Li, Wei; Pinney, Deborah F.; Pulman, Jane A.; Roos, David S.; Shanmugasundram, Achchuthan; Silva-Franco, Fatima; Steinbiss, Sascha; Stoeckert, Christian J.; Spruill, Drew; Wang, Haiming; Warrenfeltz, Susanne; Zheng, Jie
2017-01-01
The Eukaryotic Pathogen Genomics Database Resource (EuPathDB, http://eupathdb.org) is a collection of databases covering 170+ eukaryotic pathogens (protists & fungi), along with relevant free-living and non-pathogenic species, and select pathogen hosts. To facilitate the discovery of meaningful biological relationships, the databases couple preconfigured searches with visualization and analysis tools for comprehensive data mining via intuitive graphical interfaces and APIs. All data are analyzed with the same workflows, including creation of gene orthology profiles, so data are easily compared across data sets, data types and organisms. EuPathDB is updated with numerous new analysis tools, features, data sets and data types. New tools include GO, metabolic pathway and word enrichment analyses plus an online workspace for analysis of personal, non-public, large-scale data. Expanded data content is mostly genomic and functional genomic data while new data types include protein microarray, metabolic pathways, compounds, quantitative proteomics, copy number variation, and polysomal transcriptomics. New features include consistent categorization of searches, data sets and genome browser tracks; redesigned gene pages; effective integration of alternative transcripts; and a EuPathDB Galaxy instance for private analyses of a user's data. Forthcoming upgrades include user workspaces for private integration of data with existing EuPathDB data and improved integration and presentation of host–pathogen interactions. PMID:27903906
Radhouani, Hajer; Poeta, Patrícia; Pinto, Luís; Miranda, Júlio; Coelho, Céline; Carvalho, Carlos; Rodrigues, Jorge; López, María; Torres, Carmen; Vitorino, Rui; Domingues, Pedro; Igrejas, Gilberto
2010-09-21
Enterococci have emerged as the third most common cause of nosocomial infections, requiring bactericidal antimicrobial therapy. Although vancomycin resistance is a major problem in clinics and has emerged to an important extent in farm animals, few studies have examined it in wild animals. To determine the prevalence of vanA-containing Enterococcus strains among faecal samples of seagulls (Larus cachinnans) in the Berlengas Natural Reserve of Portugal, we developed a proteomic approach integrated with genomic data. The purpose was to detect the maximum number of proteins that vary in different enterococcal species which are thought to be connected in some, as yet unknown, way to antibiotic resistance. From the 57 seagull samples, 54 faecal samples showed the presence of Enterococcus isolates (94.7%). For the enterococci, E. faecium was the most prevalent species in seagulls (50%), followed by E. faecalis and E. durans (10.4%), and E. hirae (6.3%). VanA-containing enterococcal strains were detected in 10.5% of the 57 seagull faecal samples studied. Four of the vanA-containing enterococci were identified as E. faecium and two as E. durans. The tet(M) gene was found in all five tetracycline-resistant vanA strains. The erm(B) gene was demonstrated in all six erythromycin-resistant vanA strains. The hyl virulence gene was detected in all four vanA-containing E. faecium isolates in this study, and two of them harboured the purK1 allele. In addition, these strains also showed ampicillin and ciprofloxacin resistance. The whole-cell proteomic profile of vanA-containing Enterococcus strains was applied to evaluate the discriminatory power of this technique for their identification. The major differences among species-specific profiles were found in the positions corresponding to 97-45 kDa.
Sixty individual protein spots for each vanA isolate were identified as suitable for peptide mass fingerprinting by mass spectrometry (MALDI-TOF MS) and for identification through bioinformatic database queries. The proteins were classified in different groups according to their biological function: protein biosynthesis, ATP synthesis, glycolysis, conjugation and antibiotic resistance. Taking into account the origin of these strains and their relation to infectious processes in humans and animals, it is important to explore the proteome of new strains which might serve as protein biomarkers for biological activity. The comprehensive description of proteins isolated from vancomycin-resistant Enterococcus faecium and E. durans may provide new targets for development of antimicrobial agents. This knowledge may help to identify new biomarkers of antibiotic resistance and virulence factors.
2010-01-01
Background Enterococci have emerged as the third most common cause of nosocomial infections, requiring bactericidal antimicrobial therapy. Although vancomycin resistance is a major problem in clinics and has emerged to an important extent in farm animals, few studies have examined it in wild animals. To determine the prevalence of vanA-containing Enterococcus strains among faecal samples of seagulls (Larus cachinnans) in the Berlengas Natural Reserve of Portugal, we developed a proteomic approach integrated with genomic data. The purpose was to detect the maximum number of proteins that vary in different enterococcal species which are thought to be connected in some, as yet unknown, way to antibiotic resistance. Results From the 57 seagull samples, 54 faecal samples showed the presence of Enterococcus isolates (94.7%). For the enterococci, E. faecium was the most prevalent species in seagulls (50%), followed by E. faecalis and E. durans (10.4%), and E. hirae (6.3%). VanA-containing enterococcal strains were detected in 10.5% of the 57 seagull faecal samples studied. Four of the vanA-containing enterococci were identified as E. faecium and two as E. durans. The tet(M) gene was found in all five tetracycline-resistant vanA strains. The erm(B) gene was demonstrated in all six erythromycin-resistant vanA strains. The hyl virulence gene was detected in all four vanA-containing E. faecium isolates in this study, and two of them harboured the purK1 allele. In addition, these strains also showed ampicillin and ciprofloxacin resistance. The whole-cell proteomic profile of vanA-containing Enterococcus strains was applied to evaluate the discriminatory power of this technique for their identification. The major differences among species-specific profiles were found in the positions corresponding to 97-45 kDa.
Sixty individual protein spots for each vanA isolate were identified as suitable for peptide mass fingerprinting by mass spectrometry (MALDI-TOF MS) and for identification through bioinformatic database queries. The proteins were classified in different groups according to their biological function: protein biosynthesis, ATP synthesis, glycolysis, conjugation and antibiotic resistance. Taking into account the origin of these strains and their relation to infectious processes in humans and animals, it is important to explore the proteome of new strains which might serve as protein biomarkers for biological activity. Conclusions The comprehensive description of proteins isolated from vancomycin-resistant Enterococcus faecium and E. durans may provide new targets for development of antimicrobial agents. This knowledge may help to identify new biomarkers of antibiotic resistance and virulence factors. PMID:20858227
Proteogenomics Dashboard for the Human Proteome Project.
Tabas-Madrid, Daniel; Alves-Cruzeiro, Joao; Segura, Victor; Guruceaga, Elizabeth; Vialas, Vital; Prieto, Gorka; García, Carlos; Corrales, Fernando J; Albar, Juan Pablo; Pascual-Montano, Alberto
2015-09-04
dasHPPboard is a novel proteomics-based dashboard that collects and reports the experiments produced by the Spanish Human Proteome Project consortium (SpHPP) and aims to help the HPP map the entire human proteome. We have followed the strategy of analogous genomics projects like the Encyclopedia of DNA Elements (ENCODE), which provides a vast amount of data on human cell line experiments. The dashboard includes results of shotgun and selected reaction monitoring proteomics experiments, post-translational modification information, as well as proteogenomics studies. We have also processed the transcriptomics data from the ENCODE and Human Body Map (HBM) projects for the identification of specific gene expression patterns in different cell lines and tissues, with special interest in genes having little proteomic evidence available (missing proteins). Peptide databases have been built using single nucleotide variants and novel junctions derived from RNA-Seq data that can be used in search engines for sample-specific protein identifications on the same cell lines or tissues. The dasHPPboard has been designed as a tool that can be used to share and visualize a combination of proteomic and transcriptomic data, providing at the same time easy access to resources for proteogenomics analyses. The dasHPPboard can be freely accessed at: http://sphppdashboard.cnb.csic.es.
Genome-Wide Identification of Molecular Mimicry Candidates in Parasites
Ludin, Philipp; Nilsson, Daniel; Mäser, Pascal
2011-01-01
Among the many strategies employed by parasites for immune evasion and host manipulation, one of the most fascinating is molecular mimicry. With genome sequences available for host and parasite, mimicry of linear amino acid epitopes can be investigated by comparative genomics. Here we developed an in silico pipeline for genome-wide identification of molecular mimicry candidate proteins or epitopes. The predicted proteome of a given parasite was broken down into overlapping fragments, each of which was screened for close hits in the human proteome. Control searches were carried out against unrelated, free-living eukaryotes to eliminate the generally conserved proteins, and with randomized versions of the parasite proteins to get an estimate of statistical significance. This simple but computation-intensive approach yielded interesting candidates from human-pathogenic parasites. From Plasmodium falciparum, it returned a 14 amino acid motif in several of the PfEMP1 variants identical to part of the heparin-binding domain in the immunosuppressive serum protein vitronectin. In Brugia malayi, fragments were detected that matched periphilin-1, a protein of cell-cell junctions involved in barrier formation. All the results are publicly available by means of mimicDB, a searchable online database for molecular mimicry candidates from pathogens. To our knowledge, this is the first genome-wide survey for molecular mimicry proteins in parasites. The strategy can be adapted to any pair of host and pathogen, once appropriate negative control organisms are chosen. MimicDB provides a host of new starting points to gain insights into the molecular nature of host-pathogen interactions. PMID:21408160
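The fragment-and-screen idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: exact substring matching stands in for the search for "close hits", the randomized-sequence significance control is omitted, the window length of 14 is only borrowed from the motif length mentioned in the abstract, and all function names and toy sequences are hypothetical.

```python
from typing import Dict, Iterator, List, Tuple


def overlapping_fragments(seq: str, length: int = 14, step: int = 1) -> Iterator[Tuple[int, str]]:
    """Yield (start, fragment) for fixed-length windows sliding along a sequence."""
    for start in range(0, len(seq) - length + 1, step):
        yield start, seq[start:start + length]


def mimicry_candidates(
    parasite: Dict[str, str],
    host: Dict[str, str],
    controls: Dict[str, str],
    length: int = 14,
) -> List[Tuple[str, int, str, List[str]]]:
    """Keep parasite fragments that occur in a host protein but in no control protein.

    Assumption: exact substring identity replaces the similarity search
    described in the abstract, purely to keep the sketch self-contained.
    """
    hits = []
    for name, seq in parasite.items():
        for start, frag in overlapping_fragments(seq, length):
            host_matches = [h for h, hseq in host.items() if frag in hseq]
            in_control = any(frag in cseq for cseq in controls.values())
            if host_matches and not in_control:
                hits.append((name, start, frag, host_matches))
    return hits


# Toy, made-up sequences (not real proteins).
parasite = {"parasite_1": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"}
host = {"host_1": "GGGAKQRQISFVKSHFSGGG"}
controls = {"free_living_1": "MSTNPKPQRKTKRNTNRRPQDVK"}

for hit in mimicry_candidates(parasite, host, controls):
    print(hit)
```

Filtering against the control proteomes is what separates a mimicry candidate from a fragment that is simply conserved across eukaryotes; a realistic version would replace the substring test with an alignment-based search and add the randomized-sequence control.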
LFQuant: a label-free fast quantitative analysis tool for high-resolution LC-MS/MS proteomics data.
Zhang, Wei; Zhang, Jiyang; Xu, Changming; Li, Ning; Liu, Hui; Ma, Jie; Zhu, Yunping; Xie, Hongwei
2012-12-01
Database searching based methods for label-free quantification aim to reconstruct the peptide extracted ion chromatogram based on the identification information, which can limit the search space and thus make the data processing much faster. The random effect of the MS/MS sampling can be remedied by cross-assignment among different runs. Here, we present a new label-free fast quantitative analysis tool, LFQuant, for high-resolution LC-MS/MS proteomics data based on database searching. It is designed to accept raw data in two common formats (mzXML and Thermo RAW), and database search results from mainstream tools (MASCOT, SEQUEST, and X!Tandem), as input data. LFQuant can handle large-scale label-free data with fractionation such as SDS-PAGE and 2D LC. It is easy to use and provides handy user interfaces for data loading, parameter setting, quantitative analysis, and quantitative data visualization. LFQuant was compared with two common quantification software packages, MaxQuant and IDEAL-Q, on the replication data set and the UPS1 standard data set. The results show that LFQuant performs better than them in terms of both precision and accuracy, and consumes significantly less processing time. LFQuant is freely available under the GNU General Public License v3.0 at http://sourceforge.net/projects/lfquant/. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
DBGC: A Database of Human Gastric Cancer
Wang, Chao; Zhang, Jun; Cai, Mingdeng; Zhu, Zhenggang; Gu, Wenjie; Yu, Yingyan; Zhang, Xiaoyan
2015-01-01
The Database of Human Gastric Cancer (DBGC) is a comprehensive database that integrates various human gastric cancer-related data resources. Human gastric cancer-related transcriptomics projects, proteomics projects, mutations, biomarkers and drug-sensitive genes from different sources were collected and unified in this database. Moreover, epidemiological statistics of gastric cancer patients in China and clinicopathological information annotated with gastric cancer cases were also integrated into the DBGC. We believe that this database will greatly facilitate research regarding human gastric cancer in many fields. DBGC is freely available at http://bminfor.tongji.edu.cn/dbgc/index.do PMID:26566288
Plants versus Fungi and Oomycetes: Pathogenesis, Defense and Counter-Defense in the Proteomics Era
El Hadrami, Abdelbasset; El-Bebany, Ahmed F.; Yao, Zhen; Adam, Lorne R.; El Hadrami, Ismailx; Daayf, Fouad
2012-01-01
Plant-fungi and plant-oomycete interactions have been studied at the proteomic level for many decades. However, it is only in the last few years, with the development of new approaches, combined with bioinformatics data mining tools, gel staining, and analytical instruments, such as 2D-PAGE/nanoflow-LC-MS/MS, that proteomic approaches thrived. They allow screening and analysis, at the sub-cellular level, of peptides and proteins resulting from plants, pathogens, and their interactions. They also highlight post-translational modifications to proteins, e.g., glycosylation, phosphorylation or cleavage. However, many challenges are encountered during in planta studies aimed at stressing details of host defenses and fungal and oomycete pathogenicity determinants during interactions. Dissecting the mechanisms of such host-pathogen systems, including pathogen counter-defenses, will ensure a step ahead towards understanding current outcomes of interactions from a co-evolutionary point of view, and eventually move a step forward in building more durable strategies for management of diseases caused by fungi and oomycetes. Unraveling intricacies of more complex proteomic interactions that involve additional microbes, i.e., PGPRs and symbiotic fungi, which strengthen plant defenses will generate valuable information on how pathosystems actually function in nature, and thereby provide clues to solving disease problems that engender major losses in crops every year. PMID:22837691
Plants versus fungi and oomycetes: pathogenesis, defense and counter-defense in the proteomics era.
El Hadrami, Abdelbasset; El-Bebany, Ahmed F; Yao, Zhen; Adam, Lorne R; El Hadrami, Ismailx; Daayf, Fouad
2012-01-01
Plant-fungi and plant-oomycete interactions have been studied at the proteomic level for many decades. However, it is only in the last few years, with the development of new approaches, combined with bioinformatics data mining tools, gel staining, and analytical instruments, such as 2D-PAGE/nanoflow-LC-MS/MS, that proteomic approaches thrived. They allow screening and analysis, at the sub-cellular level, of peptides and proteins resulting from plants, pathogens, and their interactions. They also highlight post-translational modifications to proteins, e.g., glycosylation, phosphorylation or cleavage. However, many challenges are encountered during in planta studies aimed at stressing details of host defenses and fungal and oomycete pathogenicity determinants during interactions. Dissecting the mechanisms of such host-pathogen systems, including pathogen counter-defenses, will ensure a step ahead towards understanding current outcomes of interactions from a co-evolutionary point of view, and eventually move a step forward in building more durable strategies for management of diseases caused by fungi and oomycetes. Unraveling intricacies of more complex proteomic interactions that involve additional microbes, i.e., PGPRs and symbiotic fungi, which strengthen plant defenses will generate valuable information on how pathosystems actually function in nature, and thereby provide clues to solving disease problems that engender major losses in crops every year.
The role of proteomics in studies of protein moonlighting.
Beynon, Robert J; Hammond, Dean; Harman, Victoria; Woolerton, Yvonne
2014-12-01
The increasing acceptance that proteins may exert multiple functions in the cell brings with it new analytical challenges that will have an impact on the field of proteomics. Many proteomics workflows begin by destroying information about the interactions between different proteins, and the reduction of a complex protein mixture to constituent peptides also scrambles information about the combinatorial potential of post-translational modifications. To bring the focus of proteomics on to the domain of protein moonlighting will require novel analytical and quantitative approaches.
Yu, Yanbao; Leng, Taohua; Yun, Dong; Liu, Na; Yao, Jun; Dai, Ying; Yang, Pengyuan; Chen, Xian
2013-01-01
Emerging evidence indicates that blood platelets function in multiple biological processes including immune response, bone metastasis and liver regeneration, in addition to their known roles in hemostasis and thrombosis. Global elucidation of the platelet proteome will provide the molecular basis of these platelet functions. Here, we set up a high-throughput platform for maximum exploration of the rat/human platelet proteome using integrated proteomics technologies, and then applied it to identify the largest possible number of proteins expressed in both rat and human platelets. After stringent statistical filtration, a total of 837 unique proteins, each matched with at least two unique peptides, were identified, making this the first comprehensive protein database so far for rat platelets. Meanwhile, quantitative analyses of thrombin-stimulated platelets offered insights into the biological functions of platelet proteins and thereby confirmed our global profiling data. A comparative proteomic analysis between rat and human platelets was also conducted, which revealed not only a significant similarity but also a cross-species evolutionary link: the orthologous proteins represent a ‘core proteome’, and the ‘evolutionary proteome’ is actually a relatively static proteome. PMID:20443191
Predicting PDZ domain mediated protein interactions from structure
2013-01-01
Background PDZ domains are structural protein domains that recognize simple linear amino acid motifs, often at protein C-termini, and mediate protein-protein interactions (PPIs) in important biological processes, such as ion channel regulation, cell polarity and neural development. PDZ domain-peptide interaction predictors have been developed based on domain and peptide sequence information. Since domain structure is known to influence binding specificity, we hypothesized that structural information could be used to predict new interactions compared to sequence-based predictors. Results We developed a novel computational predictor of PDZ domain and C-terminal peptide interactions using a support vector machine trained with PDZ domain structure and peptide sequence information. Performance was estimated using extensive cross validation testing. We used the structure-based predictor to scan the human proteome for ligands of 218 PDZ domains and show that the predictions correspond to known PDZ domain-peptide interactions and PPIs in curated databases. The structure-based predictor is complementary to the sequence-based predictor, finding unique known and novel PPIs, and is less dependent on training–testing domain sequence similarity. We used a functional enrichment analysis of our hits to create a predicted map of PDZ domain biology. This map highlights PDZ domain involvement in diverse biological processes, some only found by the structure-based predictor. Based on this analysis, we predict novel PDZ domain involvement in xenobiotic metabolism and suggest new interactions for other processes including wound healing and Wnt signalling. Conclusions We built a structure-based predictor of PDZ domain-peptide interactions, which can be used to scan C-terminal proteomes for PDZ interactions. 
We also show that the structure-based predictor finds many known PDZ mediated PPIs in human that were not found by our previous sequence-based predictor and is less dependent on training–testing domain sequence similarity. Using both predictors, we defined a functional map of human PDZ domain biology and predict novel PDZ domain function. Users may access our structure-based and previous sequence-based predictors at http://webservice.baderlab.org/domains/POW. PMID:23336252
van Herwijnen, Martijn J C; Zonneveld, Marijke I; Goerdayal, Soenita; Nolte-'t Hoen, Esther N M; Garssen, Johan; Stahl, Bernd; Maarten Altelaar, A F; Redegeld, Frank A; Wauben, Marca H M
2016-11-01
Breast milk contains several macromolecular components with distinctive functions, whereby milk fat globules and casein micelles mainly provide nutrition to the newborn, and whey contains molecules that can stimulate the newborn's developing immune system and gastrointestinal tract. Although extracellular vesicles (EV) have been identified in breast milk, their physiological function and composition have not been addressed in detail. EV are submicron-sized vehicles released by cells for intercellular communication via selectively incorporated lipids, nucleic acids, and proteins. Because of the difficulty in separating EV from other milk components, an in-depth analysis of the proteome of human milk-derived EV is lacking. In this study, an extensive LC-MS/MS proteomic analysis was performed of EV that had been purified from breast milk of seven individual donors using a recently established, optimized density-gradient-based EV isolation protocol. A total of 1963 proteins were identified in milk-derived EV, including EV-associated proteins like CD9, Annexin A5, and Flotillin-1, with a remarkable overlap between the different donors. Interestingly, 198 of the identified proteins are not present in the human EV database Vesiclepedia, indicating that milk-derived EV harbor proteins not yet identified in EV of different origin. Similarly, the proteome of milk-derived EV was compared with that of other milk components. For this, data from 38 published milk proteomic studies were combined in order to construct the total milk proteome, which consists of 2698 unique proteins. Remarkably, 633 proteins identified in milk-derived EV have not yet been identified in human milk to date. Interestingly, these novel proteins include proteins involved in regulation of cell growth and controlling inflammatory signaling pathways, suggesting that milk-derived EV could support the newborn's developing gastrointestinal tract and immune system.
Overall, this study provides an expansion of the whole milk proteome and illustrates that milk-derived EV are macromolecular components with a unique functional proteome. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
The CENP-T/-W complex is a binding partner of the histone chaperone FACT
Prendergast, Lisa; Müller, Sebastian; Liu, Yiwei; Huang, Hongda; Dingli, Florent; Loew, Damarys; Vassias, Isabelle; Patel, Dinshaw J.; Sullivan, Kevin F.; Almouzni, Geneviève
2016-01-01
The CENP-T/-W histone fold complex, as an integral part of the inner kinetochore, is essential for building a proper kinetochore at the centromere in order to direct chromosome segregation during mitosis. Notably, CENP-T/-W is not inherited at centromeres, and new deposition is absolutely required at each cell cycle for kinetochore function. However, the mechanisms underlying this new deposition of CENP-T/-W at centromeres are unclear. Here, we found that CENP-T deposition at centromeres is uncoupled from DNA synthesis. We identified Spt16 and SSRP1, subunits of the H2A–H2B histone chaperone facilitates chromatin transcription (FACT), as CENP-W binding partners through a proteomic screen. We found that the C-terminal region of Spt16 binds specifically to the histone fold region of CENP-T/-W. Furthermore, depletion of Spt16 impairs CENP-T and CENP-W deposition at endogenous centromeres, and site-directed targeting of Spt16 alone is sufficient to ensure local de novo CENP-T accumulation. We propose a model in which the FACT chaperone stabilizes the soluble CENP-T/-W complex in the cell and promotes dynamics of exchange, enabling CENP-T/-W deposition at centromeres. PMID:27284163
The CENP-T/-W complex is a binding partner of the histone chaperone FACT.
Prendergast, Lisa; Müller, Sebastian; Liu, Yiwei; Huang, Hongda; Dingli, Florent; Loew, Damarys; Vassias, Isabelle; Patel, Dinshaw J; Sullivan, Kevin F; Almouzni, Geneviève
2016-06-01
The CENP-T/-W histone fold complex, as an integral part of the inner kinetochore, is essential for building a proper kinetochore at the centromere in order to direct chromosome segregation during mitosis. Notably, CENP-T/-W is not inherited at centromeres, and new deposition is absolutely required at each cell cycle for kinetochore function. However, the mechanisms underlying this new deposition of CENP-T/-W at centromeres are unclear. Here, we found that CENP-T deposition at centromeres is uncoupled from DNA synthesis. We identified Spt16 and SSRP1, subunits of the H2A-H2B histone chaperone facilitates chromatin transcription (FACT), as CENP-W binding partners through a proteomic screen. We found that the C-terminal region of Spt16 binds specifically to the histone fold region of CENP-T/-W. Furthermore, depletion of Spt16 impairs CENP-T and CENP-W deposition at endogenous centromeres, and site-directed targeting of Spt16 alone is sufficient to ensure local de novo CENP-T accumulation. We propose a model in which the FACT chaperone stabilizes the soluble CENP-T/-W complex in the cell and promotes dynamics of exchange, enabling CENP-T/-W deposition at centromeres. © 2016 Prendergast et al.; Published by Cold Spring Harbor Laboratory Press.
Maver, Ales; Medica, Igor; Peterlin, Borut
2009-12-01
The search for gene candidates in multifactorial diseases such as sarcoidosis can be based on the integration of linkage association data, gene expression data, and protein profile data from genomic, transcriptomic and proteomic studies, respectively. In this study we performed a literature-based search for studies reporting such data, followed by integration of the collected information. Different databases were examined: Medline, HuGE Navigator, ArrayExpress and Gene Expression Omnibus (GEO). Candidate genes were defined as genes which were reported in at least 2 different types of omics studies. Genes previously investigated in sarcoidosis were excluded from further analyses. We identified 177 genes associated with sarcoidosis as potential new candidate genes. Subsequently, 9 gene candidates identified to overlap in 2 different types of studies (genomic, transcriptomic and/or proteomic) were consistently reported in at least 3 studies: SERPINB1, FABP4, S100A8, HBEGF, IL7R, LRIG1, PTPN23, DPM2 and NUP214. These genes are involved in regulation of immune response, cellular proliferation, apoptosis, inhibition of protease activity, and lipid metabolism. Exact biological functions of HBEGF, LRIG1, PTPN23, DPM2 and NUP214 remain to be completely elucidated. We propose 9 candidate genes: SERPINB1, FABP4, S100A8, HBEGF, IL7R, LRIG1, PTPN23, DPM2 and NUP214, as genes with high potential for association with sarcoidosis.
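The selection rule described above (reported in at least 2 different types of omics studies, optionally in at least 3 studies, with previously investigated genes excluded) can be sketched as a simple filter. This is an illustrative reading of the abstract, not the authors' procedure; the record layout, the function name, and the placeholder gene and study identifiers are all hypothetical.

```python
from collections import defaultdict
from typing import Iterable, List, Set, Tuple

# One report = (gene, omics_type, study_id); this layout is an assumption.
Report = Tuple[str, str, str]


def candidate_genes(
    reports: Iterable[Report],
    min_types: int = 2,
    min_studies: int = 1,
    known: Set[str] = frozenset(),
) -> List[str]:
    """Return genes seen in >= min_types omics types and >= min_studies studies,
    skipping genes listed in `known` (already investigated)."""
    types_seen = defaultdict(set)
    studies_seen = defaultdict(set)
    for gene, omics_type, study_id in reports:
        if gene in known:
            continue
        types_seen[gene].add(omics_type)
        studies_seen[gene].add(study_id)
    return sorted(
        g for g in types_seen
        if len(types_seen[g]) >= min_types and len(studies_seen[g]) >= min_studies
    )


# Placeholder records; the gene names and study IDs are invented for illustration.
reports = [
    ("GENE_A", "genomic", "s1"),
    ("GENE_A", "transcriptomic", "s2"),
    ("GENE_A", "proteomic", "s3"),
    ("GENE_B", "transcriptomic", "s2"),
    ("GENE_B", "transcriptomic", "s4"),
    ("GENE_C", "genomic", "s1"),
    ("GENE_C", "proteomic", "s3"),
    ("GENE_D", "genomic", "s5"),
    ("GENE_D", "proteomic", "s6"),
]
known = {"GENE_D"}

print(candidate_genes(reports, known=known))                 # two omics types
print(candidate_genes(reports, min_studies=3, known=known))  # plus three studies
```

Tracking omics types and study identifiers separately keeps the two thresholds independent, so a gene reported many times within a single omics type does not qualify.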
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Boo-Ja; Kwon, Sun Jae; Kim, Sung-Kyu
Two-dimensional gel electrophoresis (2-DE) was applied for the screening of Tobacco mosaic virus (TMV)-induced hot pepper (Capsicum annuum cv. Bugang) nuclear proteins. From differentially expressed protein spots, we acquired the matched peptide mass fingerprint (PMF) data, analyzed by MALDI-TOF MS, from the non-redundant hot pepper EST protein FASTA database using the VEMS 2.0 software. Among six identified nuclear proteins, the hot pepper 26S proteasome subunit RPN7 (CaRPN7) was subjected to further study. The level of CaRPN7 mRNA was specifically increased during incompatible TMV-P0 interaction, but not during compatible TMV-P1.2 interaction. When CaRPN7::GFP fusion protein was targeted in onion cells, the nuclei had been broken into pieces. In the hot pepper leaves, cell death was exacerbated and genomic DNA laddering was induced by Agrobacterium-mediated transient overexpression of CaRPN7. Thus, this report suggests that the TMV-induced CaRPN7 may be involved in programmed cell death (PCD) in the hot pepper plant.
Giguère, Sophie S. B.; Guise, Amanda J.; Jean Beltran, Pierre M.; Joshi, Preeti M.; Greco, Todd M.; Quach, Olivia L.; Kong, Jeffery; Cristea, Ileana M.
2016-01-01
Deleted in breast cancer 1 (DBC1) has emerged as an important regulator of multiple cellular processes, ranging from gene expression to cell cycle progression. DBC1 has been linked to tumorigenesis both as an inhibitor of histone deacetylases, HDAC3 and sirtuin 1, and as a transcriptional cofactor for nuclear hormone receptors. However, despite mounting interest in DBC1, relatively little is known about the range of its interacting partners and the scope of its functions. Here, we carried out a functional proteomics-based investigation of DBC1 interactions in two relevant cell types, T cells and kidney cells. Microscopy, molecular biology, biochemistry, and mass spectrometry studies allowed us to assess DBC1 mRNA and protein levels, localization, phosphorylation status, and protein interaction networks. The comparison of DBC1 interactions in these cell types revealed conserved regulatory roles for DBC1 in gene expression, chromatin organization and modification, and cell cycle progression. Interestingly, we observe previously unrecognized DBC1 interactions with proteins encoded by cancer-associated genes. Among these interactions are five components of the SWI/SNF complex, the most frequently mutated chromatin remodeling complex in human cancers. Additionally, we identified a DBC1 interaction with TBL1XR1, a component of the NCoR complex, which we validated by reciprocal isolation. Strikingly, we discovered that DBC1 associates with proteins that regulate the circadian cycle, including DDX5, DHX9, and SFPQ. We validated this interaction by colocalization and reciprocal isolation. Functional assessment of this association demonstrated that DBC1 protein levels are important for regulating CLOCK and BMAL1 protein oscillations in synchronized T cells. Our results suggest that DBC1 is integral to the maintenance of the circadian molecular clock. 
Furthermore, the identified interactions provide a valuable resource for the exploration of pathways involved in DBC1-associated tumorigenesis. PMID:26657080
Liu, Suli; Im, Hogune; Bairoch, Amos; Cristofanilli, Massimo; Chen, Rui; Deutsch, Eric W; Dalton, Stephen; Fenyo, David; Fanayan, Susan; Gates, Chris; Gaudet, Pascale; Hincapie, Marina; Hanash, Samir; Kim, Hoguen; Jeong, Seul-Ki; Lundberg, Emma; Mias, George; Menon, Rajasree; Mu, Zhaomei; Nice, Edouard; Paik, Young-Ki; Uhlen, Mathias; Wells, Lance; Wu, Shiaw-Lin; Yan, Fangfei; Zhang, Fan; Zhang, Yue; Snyder, Michael; Omenn, Gilbert S; Beavis, Ronald C; Hancock, William S
2013-01-04
We report progress assembling the parts list for chromosome 17 and illustrate the various processes that we have developed to integrate available data from diverse genomic and proteomic knowledge bases. As primary resources, we have used GPMDB, neXtProt, PeptideAtlas, Human Protein Atlas (HPA), and GeneCards. All sites share the common resource of Ensembl for the genome modeling information. We have defined the chromosome 17 parts list with the following information: 1169 protein-coding genes, the numbers of proteins confidently identified by various experimental approaches as documented in GPMDB, neXtProt, PeptideAtlas, and HPA, examples of typical data sets obtained by RNASeq and proteomic studies of epithelial derived tumor cell lines (disease proteome) and a normal proteome (peripheral mononuclear cells), reported evidence of post-translational modifications, and examples of alternative splice variants (ASVs). We have constructed a list of the 59 "missing" proteins as well as 201 proteins that have inconclusive mass spectrometric (MS) identifications. In this report we have defined a process to establish a baseline for the incorporation of new evidence on protein identification and characterization as well as related information from transcriptome analyses. This initial list of "missing" proteins will guide the selection of appropriate samples for discovery studies as well as antibody reagents. We have also illustrated the significant diversity of protein variants (including post-translational modifications, PTMs) using regions on chromosome 17 that contain important oncogenes. We emphasize the need for mandated deposition of proteomics data in public databases, the further development of improved PTM, ASV, and single nucleotide variant (SNV) databases, and the construction of Web sites that can integrate and regularly update such information. In addition, we describe the distribution of both clustered and scattered sets of protein families on the chromosome. Since chromosome 17 is rich in cancer-associated genes, we have focused on the clustering of cancer-associated genes in such genomic regions and have used the ERBB2 amplicon as an example of the value of a proteogenomic approach in which one integrates transcriptomic with proteomic information and captures evidence of coexpression through coordinated regulation.
Gugiu, Gabriel B
2017-01-01
Lipidomics refers to the large-scale study of lipids in biological systems (Wenk, Nat Rev Drug Discov 4(7):594-610, 2005; Rolim et al., Gene 554(2):131-139, 2015). From a mass spectrometric point of view, lipidomics means the targeted or untargeted mass spectrometric analysis of lipids using either liquid chromatography (LC) (Castro-Perez et al., J Proteome Res 9(5):2377-2389, 2010) or shotgun (Han and Gross, Mass Spectrom Rev 24(3):367-412, 2005) approaches coupled with tandem mass spectrometry. This chapter describes the former methodology, which is rapidly becoming the preferred method for lipid identification owing to similarities with established omics workflows, such as proteomics (Washburn et al., Nat Biotechnol 19(3):242-247, 2001) or genomics (Yadav, J Biomol Tech: JBT 18(5):277, 2007). The workflow described consists of lipid extraction using a modified Bligh and Dyer method (Bligh and Dyer, Can J Biochem Physiol 37(8):911-917, 1959), ultra high pressure liquid chromatography fractionation of lipid samples on a reverse phase C18 column, followed by tandem mass spectrometric analysis and in silico database search for lipid identification based on MSMS spectrum matching (Kind et al., Nat Methods 10(8):755-758, 2013; Yamada et al., J Chromatogr A 1292:211-218, 2013; Taguchi and Ishikawa, J Chromatogr A 1217(25):4229-4239, 2010; Peake et al., Thermoscientifices 1-3, 2015) and accurate mass of parent ion (Sud et al., Nucleic Acids Res 35(database issue):D527-D532, 2007; Wishart et al., Nucleic Acids Res 35(database):D521-D526, 2007).
Fahrmann, Johannes F.; Grapov, Dmitry; Wanichthanarak, Kwanjeera; DeFelice, Brian C.; Salemi, Michelle R.; Rom, William N.; Gandara, David R.; Phinney, Brett S.; Fiehn, Oliver; Pass, Harvey
2017-01-01
Lung cancer is the leading cause of cancer mortality in the United States with non-small cell lung cancer adenocarcinoma being the most common histological type. Early perturbations in cellular metabolism are a hallmark of cancer, but the extent of these changes in early stage lung adenocarcinoma remains largely unknown. In the current study, an integrated metabolomics and proteomics approach was utilized to characterize the biochemical and molecular alterations between malignant and matched control tissue from 27 subjects diagnosed with early stage lung adenocarcinoma. Differential analysis identified 71 metabolites and 1102 proteins that delineated tumor from control tissue. Integrated results indicated four major metabolic changes in early stage adenocarcinoma: (1) increased glycosylation and glutaminolysis; (2) elevated Nrf2 activation; (3) increase in nicotinic and nicotinamide salvaging pathways; and (4) elevated polyamine biosynthesis linked to differential regulation of the s-adenosylmethionine/nicotinamide methyl-donor pathway. Genomic data from publicly available databases were included to strengthen proteomic findings. Our findings provide insight into the biochemical and molecular biological reprogramming that may accompany early stage lung tumorigenesis and highlight potential therapeutic targets. PMID:28049629
Kim, Sang Hoon; Pajarillo, Edward Alain B; Balolong, Marilen P; Lee, Ji Yoon; Kang, Dae-Kyung
2016-06-28
In this study, the global proteome of the IPEC-J2 cell line was evaluated using ultra-high performance liquid chromatography coupled to a quadrupole Q Exactive™ Orbitrap mass spectrometer. Proteins were isolated from highly confluent IPEC-J2 cells in biological replicates and analyzed by label-free mass spectrometry prior to matching against a porcine genomic dataset. The results identified 1,517 proteins, accounting for 7.35% of all genes in the porcine genome. The highly abundant proteins detected, such as actin, annexin A2, and AHNAK nucleoprotein, are involved in structural integrity, signaling mechanisms, and cellular homeostasis. The high abundance of heat shock proteins indicated their significance in cellular defenses, barrier function, and gut homeostasis. Pathway analysis and annotation using the Kyoto Encyclopedia of Genes and Genomes database resulted in a putative protein network map of the regulation of immunological responses and structural integrity in the cell line. The comprehensive proteome analysis of IPEC-J2 cells provides fundamental insights into overall protein expression and pathway dynamics that might be useful in cell adhesion studies and immunological applications.
Hao, J H; Dong, C J; Zhang, Z G; Wang, X L; Shang, Q M
2012-05-01
To investigate the response of cucumber seedlings to exogenous salicylic acid (SA) and gain a better understanding of SA action mechanism, we generated a proteomic profile of cucumber (Cucumis sativus L.) cotyledons treated with exogenous SA. Analysis of 1500 protein spots from each gel revealed 63 differentially expressed proteins, 59 of which were identified successfully. Of the identified proteins, 97% matched cucumber proteins using a whole cucumber protein database based on the newly completed genome established by our laboratory. The identified proteins were involved in various cellular responses and metabolic processes, including antioxidative reactions, cell defense, photosynthesis, carbohydrate metabolism, respiration and energy homeostasis, protein folding and biosynthesis. The two largest functional categories included proteins involved in antioxidative reactions (23.7%) and photosynthesis (18.6%). Furthermore, the SA-responsive protein interaction network revealed 13 key proteins, suggesting that the expression changes of these proteins could be critical for SA-induced resistance. An analysis of these changes suggested that SA-induced resistance and seedling growth might be regulated in part through pathways involving antioxidative reactions and photosynthesis. © 2012 Elsevier Ireland Ltd. All rights reserved.
Fusarium graminearum and Its Interactions with Cereal Heads: Studies in the Proteomics Era
Yang, Fen; Jacobsen, Susanne; Jørgensen, Hans J. L.; Collinge, David B.; Svensson, Birte; Finnie, Christine
2013-01-01
The ascomycete fungal pathogen Fusarium graminearum (teleomorph stage: Gibberella zeae) is the causal agent of Fusarium head blight in wheat and barley. This disease leads to significant losses of crop yield, and especially quality through the contamination by diverse fungal mycotoxins, which constitute a significant threat to the health of humans and animals. In recent years, high-throughput proteomics, aiming at identifying a broad spectrum of proteins with a potential role in the pathogenicity and host resistance, has become a very useful tool in plant-fungus interaction research. In this review, we describe the progress in proteomics applications toward a better understanding of F. graminearum pathogenesis, virulence, and host defense mechanisms. The contribution of proteomics to the development of crop protection strategies against this pathogen is also discussed briefly. PMID:23450732
A peptide resource for the analysis of Staphylococcus aureus in host pathogen interaction studies
Depke, Maren; Michalik, Stephan; Rabe, Alexander; Surmann, Kristin; Brinkmann, Lars; Jehmlich, Nico; Bernhardt, Jörg; Hecker, Michael; Wollscheid, Bernd; Sun, Zhi; Moritz, Robert L.; Völker, Uwe; Schmidt, Frank
2016-01-01
Staphylococcus aureus is an opportunistic human pathogen, which can cause life-threatening disease. Proteome analyses of the bacterium can provide new insights into its pathophysiology and important facets of metabolic adaptation and, thus, aid the recognition of targets for intervention. However, the value of such proteome studies increases with their comprehensiveness. We present an MS–driven, proteome-wide characterization of the strain S. aureus HG001. Combining 144 high precision proteomic data sets, we identified 19 109 peptides from 2088 distinct S. aureus HG001 proteins, which account for 72% of the predicted ORFs. Peptides were further characterized concerning pI, GRAVY, and detectability scores in order to understand the low peptide coverage of 8.7% (19 109 out of 220 245 theoretical peptides). The high quality peptide-centric spectra have been organized into a comprehensive peptide fragmentation library (SpectraST) and used for identification of S. aureus-typic peptides in highly complex host–pathogen interaction experiments, which significantly improved the number of identified S. aureus proteins compared to a MASCOT search. This effort now allows the elucidation of crucial pathophysiological questions in S. aureus-specific host–pathogen interaction studies through comprehensive proteome analysis. The S. aureus-specific spectra resource developed here also represents an important spectral repository for SRM or for data-independent acquisition MS approaches. All MS data have been deposited in the ProteomeXchange with identifier PXD000702 (http://proteomecentral.proteomexchange.org/dataset/PXD000702). PMID:26224020
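The coverage figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is only illustrative: the helper name `percent` is hypothetical, and the ORF total is back-calculated from the stated 72% rather than reported in the abstract.

```python
def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole

# Figures taken from the abstract.
identified_peptides = 19_109
theoretical_peptides = 220_245
identified_proteins = 2088

coverage = percent(identified_peptides, theoretical_peptides)
print(f"peptide coverage: {coverage:.1f}%")

# Assumption: back-calculating the ORF count from "72% of the predicted ORFs".
# This is an estimate for illustration, not a number given in the abstract.
implied_orfs = identified_proteins / 0.72
print(f"implied predicted ORFs: about {round(implied_orfs)}")
```

The peptide-level percentage agrees with the 8.7% stated in the abstract after rounding to one decimal place.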
A peptide resource for the analysis of Staphylococcus aureus in host-pathogen interaction studies.
Depke, Maren; Michalik, Stephan; Rabe, Alexander; Surmann, Kristin; Brinkmann, Lars; Jehmlich, Nico; Bernhardt, Jörg; Hecker, Michael; Wollscheid, Bernd; Sun, Zhi; Moritz, Robert L; Völker, Uwe; Schmidt, Frank
2015-11-01
Staphylococcus aureus is an opportunistic human pathogen, which can cause life-threatening disease. Proteome analyses of the bacterium can provide new insights into its pathophysiology and important facets of metabolic adaptation and, thus, aid the recognition of targets for intervention. However, the value of such proteome studies increases with their comprehensiveness. We present an MS-driven, proteome-wide characterization of the strain S. aureus HG001. Combining 144 high precision proteomic data sets, we identified 19 109 peptides from 2088 distinct S. aureus HG001 proteins, which account for 72% of the predicted ORFs. Peptides were further characterized concerning pI, GRAVY, and detectability scores in order to understand the low peptide coverage of 8.7% (19 109 out of 220 245 theoretical peptides). The high quality peptide-centric spectra have been organized into a comprehensive peptide fragmentation library (SpectraST) and used for identification of S. aureus-typic peptides in highly complex host-pathogen interaction experiments, which significantly improved the number of identified S. aureus proteins compared to a MASCOT search. This effort now allows the elucidation of crucial pathophysiological questions in S. aureus-specific host-pathogen interaction studies through comprehensive proteome analysis. The S. aureus-specific spectra resource developed here also represents an important spectral repository for SRM or for data-independent acquisition MS approaches. All MS data have been deposited in the ProteomeXchange with identifier PXD000702 (http://proteomecentral.proteomexchange.org/dataset/PXD000702). © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Tuncbag, Nurcan; Gursoy, Attila; Nussinov, Ruth; Keskin, Ozlem
2011-08-11
Prediction of protein-protein interactions at the structural level on the proteome scale is important because it allows prediction of protein function, helps drug discovery and takes steps toward genome-wide structural systems biology. We provide a protocol (termed PRISM, protein interactions by structural matching) for large-scale prediction of protein-protein interactions and assembly of protein complex structures. The method consists of two components: rigid-body structural comparisons of target proteins to known template protein-protein interfaces and flexible refinement using a docking energy function. The PRISM rationale follows our observation that globally different protein structures can interact via similar architectural motifs. PRISM predicts binding residues by using structural similarity and evolutionary conservation of putative binding residue 'hot spots'. Ultimately, PRISM could help to construct cellular pathways and functional, proteome-scale annotation. PRISM is implemented in Python and runs in a UNIX environment. The program accepts Protein Data Bank-formatted protein structures and is available at http://prism.ccbb.ku.edu.tr/prism_protocol/.
Proteomic analysis of PSD-93 knockout mice following the induction of ischemic cerebral injury.
Rong, Rong; Yang, Hui; Rong, Liangqun; Wei, Xiue; Li, Qingjie; Liu, Xiaomei; Gao, Hong; Xu, Yun; Zhang, Qingxiu
2016-03-01
Postsynaptic density protein-93 (PSD-93) is enriched in the postsynaptic density and is involved in N-methyl-d-aspartate receptor (NMDAR) triggered neurotoxicity through PSD-93/NMDAR/nNOS signaling pathway. In the present study, we found that PSD-93 deficiency reduced infarcted volume and neurological deficits induced by transient middle cerebral artery occlusion (tMCAO) in the mice. To identify novel targets of PSD-93 related neurotoxicity, we applied isobaric tags for relative and absolute quantitative (iTRAQ) labeling and combined this labeling with on-line two-dimensional LC/MS/MS technology to elucidate the changes in protein expression in PSD-93 knockout mice following tMCAO. The proteomic data set consisted of 1892 proteins. Compared to control group, differences in expression levels in ischemic group >1.5-fold and <0.66-fold were considered as differential expression. A total of 104 unique proteins with differential abundance levels were identified, among which 17 proteins were selected for further validation. Gene ontology analysis using UniProt database revealed that these differentially expressed proteins are involved in diverse function such as synaptic transmission, neuronal neurotransmitter and ion transport, modification of organelle membrane components. Moreover, network analysis revealed that the interacting proteins were involved in the transport of synaptic vesicles, the integrity of synaptic membranes and the activation of the ionotropic glutamate receptors NMDAR1 and NMDAR2B. Finally, RT-PCR and Western blot analysis showed that SynGAP, syntaxin-1A, protein kinase C β, and voltage-dependent L-type calcium channels were inhibited by ischemia-reperfusion. Identification of these proteins provides valuable clues to elucidate the mechanisms underlying the actions of PSD-93 in ischemia-reperfusion induced neurotoxicity. Copyright © 2015 Elsevier Inc. All rights reserved.
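The fold-change rule stated in this abstract can be sketched as a small classifier. This is a minimal illustration, not the authors' pipeline: the function name, the protein labels and the ratios are hypothetical, and only the >1.5-fold and <0.66-fold cutoffs come from the abstract.

```python
def classify_fold_change(ratio, up=1.5, down=0.66):
    """Label an ischemic/control abundance ratio using the abstract's cutoffs."""
    if ratio > up:
        return "up"
    if ratio < down:
        return "down"
    return "unchanged"

# Hypothetical iTRAQ ratios (ischemic / control); placeholder names, not study data.
ratios = {"protein_A": 1.82, "protein_B": 0.51, "protein_C": 1.10}

calls = {name: classify_fold_change(r) for name, r in ratios.items()}
print(calls)
```

Because the cutoffs are strict inequalities, a ratio of exactly 1.5 or 0.66 is treated as unchanged in this sketch.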
Metz, Thomas O.; Qian, Wei-Jun; Jacobs, Jon M.; Gritsenko, Marina A.; Moore, Ronald J.; Polpitiya, Ashoka D.; Monroe, Matthew E.; Camp, David G.; Mueller, Patricia W.; Smith, Richard D.
2009-01-01
Novel biomarkers of type 1 diabetes must be identified and validated in initial, exploratory studies before they can be assessed in proficiency evaluations. Currently, untargeted “-omics” approaches are under-utilized in profiling studies of clinical samples. This report describes the evaluation of capillary liquid chromatography (LC) coupled with mass spectrometry (MS) in a pilot proteomic analysis of human plasma and serum from a subset of control and type 1 diabetic individuals enrolled in the Diabetes Autoantibody Standardization Program with the goal of identifying candidate biomarkers of type 1 diabetes. Initial high-resolution capillary LC-MS/MS experiments were performed to augment an existing plasma peptide database, while subsequent LC-FTICR studies identified quantitative differences in the abundance of plasma proteins. Analysis of LC-FTICR proteomic data identified five candidate protein biomarkers of type 1 diabetes. Alpha-2-glycoprotein 1 (zinc), corticosteroid-binding globulin, and lumican were 2-fold up-regulated in type 1 diabetic samples relative to control samples, whereas clusterin and serotransferrin were 2-fold up-regulated in control samples relative to type 1 diabetic samples. Observed perturbations in the levels of all five proteins are consistent with the metabolic aberrations found in type 1 diabetes. While the discovery of these candidate protein biomarkers of type 1 diabetes is encouraging, follow up studies are required for validation in a larger population of individuals and for determination of laboratory-defined sensitivity and specificity values using blinded samples. PMID:18092746
Metz, Thomas O; Qian, Wei-Jun; Jacobs, Jon M; Gritsenko, Marina A; Moore, Ronald J; Polpitiya, Ashoka D; Monroe, Matthew E; Camp, David G; Mueller, Patricia W; Smith, Richard D
2008-02-01
Novel biomarkers of type 1 diabetes must be identified and validated in initial, exploratory studies before they can be assessed in proficiency evaluations. Currently, untargeted "-omics" approaches are underutilized in profiling studies of clinical samples. This report describes the evaluation of capillary liquid chromatography (LC) coupled with mass spectrometry (MS) in a pilot proteomic analysis of human plasma and serum from a subset of control and type 1 diabetic individuals enrolled in the Diabetes Autoantibody Standardization Program, with the goal of identifying candidate biomarkers of type 1 diabetes. Initial high-resolution capillary LC-MS/MS experiments were performed to augment an existing plasma peptide database, while subsequent LC-FTICR studies identified quantitative differences in the abundance of plasma proteins. Analysis of LC-FTICR proteomic data identified five candidate protein biomarkers of type 1 diabetes. alpha-2-Glycoprotein 1 (zinc), corticosteroid-binding globulin, and lumican were 2-fold up-regulated in type 1 diabetic samples relative to control samples, whereas clusterin and serotransferrin were 2-fold up-regulated in control samples relative to type 1 diabetic samples. Observed perturbations in the levels of all five proteins are consistent with the metabolic aberrations found in type 1 diabetes. While the discovery of these candidate protein biomarkers of type 1 diabetes is encouraging, follow up studies are required for validation in a larger population of individuals and for determination of laboratory-defined sensitivity and specificity values using blinded samples.
Steingruber, Mirjam; Kraut, Alexandra; Socher, Eileen; Sticht, Heinrich; Reichel, Anna; Stamminger, Thomas; Amin, Bushra; Couté, Yohann; Hutterer, Corina; Marschall, Manfred
2016-01-01
The human cytomegalovirus (HCMV)-encoded cyclin-dependent kinase (CDK) ortholog pUL97 associates with human cyclin B1 and other types of cyclins. Here, the question was addressed whether cyclin interaction of pUL97 and additional viral proteins is detectable by mass spectrometry-based approaches. Proteomic data were validated by coimmunoprecipitation (CoIP), Western blot, in vitro kinase and bioinformatic analyses. Our findings suggest that: (i) pUL97 shows differential affinities to human cyclins; (ii) pUL97 inhibitor maribavir (MBV) disrupts the interaction with cyclin B1, but not with other cyclin types; (iii) cyclin H is identified as a new high-affinity interactor of pUL97 in HCMV-infected cells; (iv) even more viral phosphoproteins, including all known substrates of pUL97, are detectable in the cyclin-associated complexes; and (v) a first functional validation of pUL97-cyclin B1 interaction, analyzed by in vitro kinase assay, points to a cyclin-mediated modulation of pUL97 substrate preference. In addition, our bioinformatic analyses suggest individual, cyclin-specific binding interfaces for pUL97-cyclin interaction, which could explain the different strengths of interactions and the selective inhibitory effect of MBV on pUL97-cyclin B1 interaction. Combined, the detection of cyclin-associated proteins in HCMV-infected cells suggests a complex pattern of substrate phosphorylation and a role of cyclins in the fine-modulation of pUL97 activities. PMID:27548200
Xie, Zhihui; Li, Jing; Baker, Jonathan; Eagleson, Kathie L.; Coba, Marcelo P.; Levitt, Pat
2016-01-01
Background: Atypical synapse development and plasticity are implicated in many neurodevelopmental disorders (NDDs). NDD-associated, high confidence risk genes have been identified, yet little is known about functional relationships at the level of protein-protein interactions, which are the dominant molecular bases responsible for mediating circuit development. Methods: Proteomics in three independent developing neocortical synaptosomal preparations identified putative interacting proteins of the ligand-activated MET receptor tyrosine kinase, an autism risk gene that mediates synapse development. The candidates were translated into interactome networks and analyzed bioinformatically. Additionally, three independent quantitative proximity ligation assays (PLA) in cultured neurons and four independent immunoprecipitation analyses of synaptosomes validated protein interactions. Results: Approximately 11% (8/72) of MET-interacting proteins, including SHANK3, SYNGAP1 and GRIN2B, are associated with NDDs. Proteins in the MET interactome were translated into a novel MET interactome network based on human protein-protein interaction databases. High confidence genes from different NDD datasets that encode synaptosomal proteins were analyzed for enrichment in MET interactome proteins. Enrichment was found for autism, but not for schizophrenia, bipolar disorder, major depressive disorder or attention deficit hyperactivity disorder. There is correlated gene expression between MET and its interactive partners in developing human temporal and visual neocortices, but not with highly expressed genes that are not in the interactome. PLA and biochemical analyses demonstrate that MET-protein partner interactions are dynamically regulated by receptor activation. Conclusions: The results provide a novel molecular framework for deciphering the functional relations of key regulators of synaptogenesis that contribute to both typical cortical development and to NDDs. PMID:27086544
Pandya, Nikhil J; Klaassen, Remco V; van der Schors, Roel C; Slotman, Johan A; Houtsmuller, Adriaan; Smit, August B; Li, Ka Wan
2016-10-01
The group 1 metabotropic glutamate receptors 1 and 5 (mGluR1/5) have been implicated in mechanisms of synaptic plasticity and may serve as potential therapeutic targets in autism spectrum disorders. The interactome of group 1 mGluRs has remained largely unresolved. Using a knockout-controlled interaction proteomics strategy we examined the mGluR5 protein complex in two brain regions, hippocampus and cortex, and identified mGluR1 as its major interactor in addition to the well described Homer proteins. We confirmed the presence of mGluR1/5 complex by (i) reverse immunoprecipitation using an mGluR1 antibody to pulldown mGluR5 from hippocampal tissue, (ii) coexpression in HEK293 cells followed by coimmunoprecipitation to reveal the direct interaction of mGluR1 and 5, and (iii) superresolution microscopy imaging of hippocampal primary neurons to show colocalization of the mGluR1/5 in the synapse. © 2016 The Authors. Proteomics Published by Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
The developmental proteome of Drosophila melanogaster
Casas-Vila, Nuria; Bluhm, Alina; Sayols, Sergi; Dinges, Nadja; Dejung, Mario; Altenhein, Tina; Kappei, Dennis; Altenhein, Benjamin; Roignant, Jean-Yves; Butter, Falk
2017-01-01
Drosophila melanogaster is a widely used genetic model organism in developmental biology. While this model organism has been intensively studied at the RNA level, a comprehensive proteomic study covering the complete life cycle is still missing. Here, we apply label-free quantitative proteomics to explore proteome remodeling across Drosophila’s life cycle, resulting in 7952 proteins, and provide a high temporal-resolved embryogenesis proteome of 5458 proteins. Our proteome data enabled us to monitor isoform-specific expression of 34 genes during development, to identify the pseudogene Cyp9f3Ψ as a protein-coding gene, and to obtain evidence of 268 small proteins. Moreover, the comparison with available transcriptomic data uncovered examples of poor correlation between mRNA and protein, underscoring the importance of proteomics to study developmental progression. Data integration of our embryogenesis proteome with tissue-specific data revealed spatial and temporal information for further functional studies of yet uncharacterized proteins. Overall, our high resolution proteomes provide a powerful resource and can be explored in detail in our interactive web interface. PMID:28381612
Mining biological databases for candidate disease genes
NASA Astrophysics Data System (ADS)
Braun, Terry A.; Scheetz, Todd; Webster, Gregg L.; Casavant, Thomas L.
2001-07-01
The publicly funded effort to determine the complete nucleotide sequence of the human genome, the Human Genome Project (HGP), has so far produced more than 93% of the 3 billion nucleotides of the human genome in a preliminary "draft" format. In addition, several valuable sources of information have been developed as direct and indirect results of the HGP. These include the sequencing of model organisms (rat, mouse, fly, and others), gene discovery projects (ESTs and full-length), and new technologies such as expression analysis and resources (micro-arrays or gene chips). These resources are invaluable for the researchers identifying the functional genes of the genome that transcribe and translate into the transcriptome and proteome, both of which potentially contain orders of magnitude more complexity than the genome itself. Preliminary analyses of these data identified approximately 30,000-40,000 human "genes." However, the bulk of the effort still remains: to identify the functional and structural elements contained within the transcriptome and proteome, and to associate function in the transcriptome and proteome to genes. A fortuitous consequence of the HGP is the existence of hundreds of databases containing biological information that may contain relevant data pertaining to the identification of disease-causing genes. The task of mining these databases for information on candidate genes is a commercial application of enormous potential. We are developing a system to acquire and mine data from specific databases to aid our efforts to identify disease genes. A high-speed cluster of Linux workstations is used to analyze sequences and perform distributed sequence alignments as part of our data mining and processing. This system has been used to mine GeneMap99 sequences within specific genomic intervals to identify potential candidate disease genes associated with Bardet-Biedl Syndrome (BBS).
Human liver proteome project: plan, progress, and perspectives.
He, Fuchu
2005-12-01
The Human Liver Proteome Project is the first initiative of the human proteome project for human organs/tissues and aims at writing a modern Prometheus myth. Its global scientific objectives are to reveal the "solar system" of the human liver proteome, expression profiles, modification profiles, a protein linkage (protein-protein interaction) map, and a proteome localization map, and to define an ORFeome, physiome, and pathome. Since it was first proposed in April 2002, the Human Liver Proteome Project has attracted more than 100 laboratories from all over the world. In the ensuing 3 years, we set up a management infrastructure, identified reference laboratories, confirmed standard operating procedures, initiated international research collaborations, and finally achieved the first set of expression profile data.
Elucidating the fungal stress response by proteomics.
Kroll, Kristin; Pähtz, Vera; Kniemeyer, Olaf
2014-01-31
Fungal species need to cope with stress, both in the natural environment and during interaction of human- or plant-pathogenic fungi with their host. Many regulatory circuits governing the fungal stress response have already been discovered. However, there are still large gaps in the knowledge concerning the changes of the proteome during adaptation to environmental stress conditions. With the application of proteomic methods, particularly 2D-gel and gel-free, LC/MS-based methods, first insights into the composition and dynamic changes of the fungal stress proteome could be obtained. Here, we review the recent proteome data generated for filamentous fungi and yeasts. This article is part of a Special Issue entitled: Trends in Microbial Proteomics. Copyright © 2013 Elsevier B.V. All rights reserved.
Rutkowska-Wlodarczyk, Izabela; Aller, M Isabel; Valbuena, Sergio; Bologna, Jean-Charles; Prézeau, Laurent; Lerma, Juan
2015-04-01
Kainate receptors (KARs) are found ubiquitously in the CNS and are present presynaptically and postsynaptically regulating synaptic transmission and excitability. Functional studies have proven that KARs act as ion channels as well as potentially activating G-proteins, thus indicating the existence of a dual signaling system for KARs. Nevertheless, it is not clear how these ion channels activate G-proteins and which of the KAR subunits is involved. Here we performed a proteomic analysis to define proteins that interact with the C-terminal domain of GluK1 and we identified a variety of proteins with many different functions, including a Go α subunit. These interactions were verified through distinct in vitro and in vivo assays, and the activation of the Go protein by GluK1 was validated in bioluminescence resonance energy transfer experiments, while the specificity of this association was confirmed in GluK1-deficient mice. These data reveal components of the KAR interactome, and they show that GluK1 and Go proteins are natural partners, accounting for the metabotropic effects of KARs. Copyright © 2015 the authors.
Sardiu, Mihaela E; Gilmore, Joshua M; Carrozza, Michael J; Li, Bing; Workman, Jerry L; Florens, Laurence; Washburn, Michael P
2009-10-06
Protein complexes are key molecular machines executing a variety of essential cellular processes. Despite the availability of genome-wide protein-protein interaction studies, determining the connectivity between proteins within a complex remains a major challenge. Here we demonstrate a method that is able to predict the relationship of proteins within a stable protein complex. We employed a combination of computational approaches and a systematic collection of quantitative proteomics data from wild-type and deletion strain purifications to build a quantitative deletion-interaction network map and subsequently convert the resulting data into an interdependency-interaction model of a complex. We applied this approach to a data set generated from components of the Saccharomyces cerevisiae Rpd3 histone deacetylase complexes, which consists of two distinct small and large complexes that are held together by a module consisting of Rpd3, Sin3 and Ume1. The resulting representation reveals new protein-protein interactions and new submodule relationships, providing novel information for mapping the functional organization of a complex.
Using FlyBase, a Database of Drosophila Genes & Genomes
Marygold, Steven J.; Crosby, Madeline A.; Goodman, Joshua L.
2016-01-01
SUMMARY For nearly 25 years, FlyBase (flybase.org) has provided a freely available online database of biological information about Drosophila species, focusing on the model organism D. melanogaster. The need for a centralized, integrated view of Drosophila research has never been greater as advances in genomic, proteomic and high-throughput technologies add to the quantity and diversity of available data and resources. FlyBase has taken several approaches to respond to these changes in the research landscape. Novel report pages have been generated for new reagent types and physical interaction data; Drosophila models of human disease are now represented and showcased in dedicated Human Disease Model Reports; other integrated reports have been established that bring together related genes, datasets or reagents; Gene Reports have been revised to improve access to new data types and to highlight functional data; links to external sites have been organized and expanded; and new tools have been developed to display and interrogate all these data, including improved batch processing and bulk file availability. In addition, several new community initiatives have served to enhance interactions between researchers and FlyBase, resulting in direct user contributions and improved feedback. This chapter provides an overview of the data content, organization and available tools within FlyBase, focusing on recent improvements. We hope it serves as a guide for our diverse user base, enabling efficient and effective exploration of the database and thereby accelerating research discoveries. PMID:27730573
Mosier, Annika C; Justice, Nicholas B; Bowen, Benjamin P; Baran, Richard; Thomas, Brian C; Northen, Trent R; Banfield, Jillian F
2013-03-12
Microorganisms grow under a remarkable range of extreme conditions. Environmental transcriptomic and proteomic studies have highlighted metabolic pathways active in extremophilic communities. However, metabolites directly linked to their physiology are less well defined because metabolomics methods lag behind other omics technologies due to a wide range of experimental complexities often associated with the environmental matrix. We identified key metabolites associated with acidophilic and metal-tolerant microorganisms using stable isotope labeling coupled with untargeted, high-resolution mass spectrometry. We observed >3,500 metabolic features in biofilms growing in pH ~0.9 acid mine drainage solutions containing millimolar concentrations of iron, sulfate, zinc, copper, and arsenic. Stable isotope labeling improved chemical formula prediction by >50% for larger metabolites (>250 atomic mass units), many of which were unrepresented in metabolic databases and may represent novel compounds. Taurine and hydroxyectoine were identified and likely provide protection from osmotic stress in the biofilms. Community genomic, transcriptomic, and proteomic data implicate fungi in taurine metabolism. Leptospirillum group II bacteria decrease production of ectoine and hydroxyectoine as biofilms mature, suggesting that biofilm structure provides some resistance to high metal and proton concentrations. The combination of taurine, ectoine, and hydroxyectoine may also constitute a sulfur, nitrogen, and carbon currency in the communities. Microbial communities are central to many critical global processes and yet remain enigmatic largely due to their complex and distributed metabolic interactions. Metabolomics has the possibility of providing mechanistic insights into the function and ecology of microbial communities. 
However, our limited knowledge of microbial metabolites, the difficulty of identifying metabolites from complex samples, and the inability to link metabolites directly to community members have proven to be major limitations in developing advances in systems interactions. Here, we show that combining stable-isotope-enabled metabolomics with genomics, transcriptomics, and proteomics can illuminate the ecology of microorganisms at the community scale.
CEBS object model for systems biology data, SysBio-OM.
Xirasagar, Sandhya; Gustafson, Scott; Merrick, B Alex; Tomer, Kenneth B; Stasiewicz, Stanley; Chan, Denny D; Yost, Kenneth J; Yates, John R; Sumner, Susan; Xiao, Nianqing; Waters, Michael D
2004-09-01
To promote a systems biology approach to understanding the biological effects of environmental stressors, the Chemical Effects in Biological Systems (CEBS) knowledge base is being developed to house data from multiple complex data streams in a systems-friendly manner that will accommodate extensive querying from users. Unified data representation via a single object model will greatly aid in integrating data storage and management, and facilitate reuse of software to analyze and display data resulting from diverse differential expression or differential profile technologies. Data streams include, but are not limited to, gene expression analysis (transcriptomics), protein expression and protein-protein interaction analysis (proteomics) and changes in low molecular weight metabolite levels (metabolomics). To enable the integration of microarray gene expression, proteomics and metabolomics data in the CEBS system, we designed an object model, Systems Biology Object Model (SysBio-OM). The model is comprehensive and leverages other open source efforts, namely the MicroArray Gene Expression Object Model (MAGE-OM) and the Proteomics Experiment Data Repository (PEDRo) object model. SysBio-OM is designed by extending MAGE-OM to represent protein expression data elements (including those from PEDRo), protein-protein interaction and metabolomics data. SysBio-OM promotes the standardization of data representation and data quality by facilitating the capture of the minimum annotation required for an experiment. Such standardization refines the accuracy of data mining and interpretation. The open source SysBio-OM model, which can be implemented on varied computing platforms, is presented here. A Unified Modeling Language (UML) depiction of the entire SysBio-OM is available at http://cebs.niehs.nih.gov/SysBioOM/. 
The Rational Rose object model package is distributed under an open source license that permits unrestricted academic and commercial use and is available at http://cebs.niehs.nih.gov/cebsdownloads. The database and interface are being built to implement the model and will be available for public use at http://cebs.niehs.nih.gov.
2013-01-01
Background Subunit vaccines based on recombinant proteins have been effective in preventing infectious diseases and are expected to meet the demands of future vaccine development. Computational approaches, especially the reverse vaccinology (RV) method, have enormous potential for identification of protein vaccine candidates (PVCs) from a proteome. The existing protective antigen prediction software and web servers have low prediction accuracy, leading to limited applications for vaccine development. Besides machine learning techniques, those software and web servers have considered only a protein’s adhesin-likeliness as the criterion for identification of PVCs. Several non-adhesin functional classes of proteins involved in host-pathogen interactions and pathogenesis are known to provide protection against bacterial infections. Therefore, knowledge of bacterial pathogenesis has the potential to identify PVCs. Results A web server, Jenner-Predict, has been developed for prediction of PVCs from proteomes of bacterial pathogens. The web server targets host-pathogen interactions and pathogenesis by considering known functional domains from protein classes such as adhesin, virulence, invasin, porin, flagellin, colonization, toxin, choline-binding, penicillin-binding, transferrin-binding, fibronectin-binding and solute-binding. It predicts non-cytosolic proteins containing the above domains as PVCs. It also provides the vaccine potential of PVCs in terms of their possible immunogenicity by comparing with experimentally known IEDB epitopes, absence of autoimmunity and conservation in different strains. Predicted PVCs are prioritized so that only a few prospective PVCs need to be validated experimentally. The performance of the web server was evaluated against known protective antigens from diverse classes of bacteria reported in the Protegen database and datasets used for VaxiJen server development. 
The web server efficiently predicted known vaccine candidates reported from Streptococcus pneumoniae and Escherichia coli proteomes. The Jenner-Predict server outperformed NERVE, Vaxign and VaxiJen methods. It has sensitivity of 0.774 and 0.711 for Protegen and VaxiJen dataset, respectively while specificity of 0.940 has been obtained for the latter dataset. Conclusions Better prediction accuracy of Jenner-Predict web server signifies that domains involved in host-pathogen interactions and pathogenesis are better criteria for prediction of PVCs. The web server has successfully predicted maximum known PVCs belonging to different functional classes. Jenner-Predict server is freely accessible at http://117.211.115.67/vaccine/home.html PMID:23815072
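The sensitivity and specificity figures quoted above follow the usual confusion-matrix definitions. The sketch below restates those definitions in Python; the counts in the usage note are purely illustrative and are not taken from the Jenner-Predict evaluation.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of known protective antigens that are predicted as PVCs."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-antigens that are correctly left unpredicted."""
    return true_neg / (true_neg + false_pos)


# Illustrative counts only (hypothetical, not the published data).
example_sensitivity = sensitivity(true_pos=8, false_neg=2)
example_specificity = specificity(true_neg=9, false_pos=1)
```

With these made-up counts, 8 of 10 positives are recovered and 9 of 10 negatives are rejected.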
Genomic atlas of the human plasma proteome.
Sun, Benjamin B; Maranville, Joseph C; Peters, James E; Stacey, David; Staley, James R; Blackshaw, James; Burgess, Stephen; Jiang, Tao; Paige, Ellie; Surendran, Praveen; Oliver-Williams, Clare; Kamat, Mihir A; Prins, Bram P; Wilcox, Sheri K; Zimmerman, Erik S; Chi, An; Bansal, Narinder; Spain, Sarah L; Wood, Angela M; Morrell, Nicholas W; Bradley, John R; Janjic, Nebojsa; Roberts, David J; Ouwehand, Willem H; Todd, John A; Soranzo, Nicole; Suhre, Karsten; Paul, Dirk S; Fox, Caroline S; Plenge, Robert M; Danesh, John; Runz, Heiko; Butterworth, Adam S
2018-06-01
Although plasma proteins have important roles in biological processes and are the direct targets of many drugs, the genetic factors that control inter-individual variation in plasma protein levels are not well understood. Here we characterize the genetic architecture of the human plasma proteome in healthy blood donors from the INTERVAL study. We identify 1,927 genetic associations with 1,478 proteins, a fourfold increase on existing knowledge, including trans associations for 1,104 proteins. To understand the consequences of perturbations in plasma protein levels, we apply an integrated approach that links genetic variation with biological pathway, disease, and drug databases. We show that protein quantitative trait loci overlap with gene expression quantitative trait loci, as well as with disease-associated loci, and find evidence that protein biomarkers have causal roles in disease using Mendelian randomization analysis. By linking genetic factors to diseases via specific proteins, our analyses highlight potential therapeutic targets, opportunities for matching existing drugs with new disease indications, and potential safety concerns for drugs under development.
Chen, Yao-Yi; Dasari, Surendra; Ma, Ze-Qiang; Vega-Montoto, Lorenzo J.; Li, Ming
2013-01-01
Spectral counting has become a widely used approach for measuring and comparing protein abundance in label-free shotgun proteomics. However, when analyzing complex samples, the ambiguity of matching between peptides and proteins greatly affects the assessment of peptide and protein inventories, differentiation, and quantification. Meanwhile, the configuration of database searching algorithms that assign peptides to MS/MS spectra may produce different results in comparative proteomic analysis. Here, we present three strategies to improve comparative proteomics through spectral counting. We show that comparing spectral counts for peptide groups rather than for protein groups forestalls problems introduced by shared peptides. We demonstrate the advantage and flexibility of this new method in two datasets. We present four models to combine four popular search engines that lead to significant gains in spectral counting differentiation. Among these models, we demonstrate a powerful vote counting model that scales well for multiple search engines. We also show that semi-tryptic searching outperforms tryptic searching for comparative proteomics. Overall, these techniques considerably improve protein differentiation on the basis of spectral count tables. PMID:22552787
Chen, Yao-Yi; Dasari, Surendra; Ma, Ze-Qiang; Vega-Montoto, Lorenzo J; Li, Ming; Tabb, David L
2012-09-01
Spectral counting has become a widely used approach for measuring and comparing protein abundance in label-free shotgun proteomics. However, when analyzing complex samples, the ambiguity of matching between peptides and proteins greatly affects the assessment of peptide and protein inventories, differentiation, and quantification. Meanwhile, the configuration of database searching algorithms that assign peptides to MS/MS spectra may produce different results in comparative proteomic analysis. Here, we present three strategies to improve comparative proteomics through spectral counting. We show that comparing spectral counts for peptide groups rather than for protein groups forestalls problems introduced by shared peptides. We demonstrate the advantage and flexibility of this new method in two datasets. We present four models to combine four popular search engines that lead to significant gains in spectral counting differentiation. Among these models, we demonstrate a powerful vote counting model that scales well for multiple search engines. We also show that semi-tryptic searching outperforms tryptic searching for comparative proteomics. Overall, these techniques considerably improve protein differentiation on the basis of spectral count tables.
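The abstract does not spell out the vote counting model, so the following is only a generic majority-vote sketch of the idea: each search engine contributes one vote for every peptide group it flags as differential, and a group is kept when enough engines agree. The engine names, group labels and the `min_votes` threshold are hypothetical.

```python
from collections import Counter


def vote_count(calls_by_engine, min_votes):
    """Return the peptide groups flagged by at least `min_votes` engines.

    calls_by_engine maps an engine name to the collection of peptide groups
    that engine's spectral counts flag as differential.
    """
    votes = Counter()
    for calls in calls_by_engine.values():
        votes.update(set(calls))  # one vote per engine per group
    return {group for group, n in votes.items() if n >= min_votes}


# Hypothetical calls from four engines.
calls = {
    "engine_a": {"g1", "g2"},
    "engine_b": {"g1"},
    "engine_c": {"g1", "g3"},
    "engine_d": {"g2", "g3"},
}
consensus = vote_count(calls, min_votes=3)
```

Because each engine only adds votes to a shared tally, the same function works unchanged for any number of engines.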
Al Kindi, Mahmood A; Colella, Alex D; Chataway, Tim K; Jackson, Michael W; Wang, Jing J; Gordon, Tom P
2016-04-01
The structures of epitopes bound by autoantibodies against RNA-protein complexes have been well-defined over several decades, but little is known of the clonality, immunoglobulin (Ig) variable (V) gene usage and mutational status of the autoantibodies themselves at the level of the secreted (serum) proteome. A novel proteomic workflow is presented based on affinity purification of specific Igs from serum, high-resolution two-dimensional gel electrophoresis, and de novo and database-driven sequencing of V-region proteins by mass spectrometry. Analysis of anti-Ro52/Ro60/La proteomes in primary Sjögren's syndrome (SS) and anti-Sm and anti-ribosomal P proteomes in systemic lupus erythematosus (SLE) has revealed that these antibody responses are dominated by restricted sets of public (shared) clonotypes, consistent with common pathways of production across unrelated individuals. The discovery of shared sets of specific V-region peptides can be exploited for diagnostic biomarkers in targeted mass spectrometry platforms and for tracking and removal of pathogenic clones. Copyright © 2016 Elsevier B.V. All rights reserved.
Comparative Bacterial Proteomics: Analysis of the Core Genome Concept
Callister, Stephen J.; McCue, Lee Ann; Turse, Joshua E.; Monroe, Matthew E.; Auberry, Kenneth J.; Smith, Richard D.; Adkins, Joshua N.; Lipton, Mary S.
2008-01-01
While comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry, experimental validation of the existence of this core genome requires extensive measurement and is typically not undertaken. Enabled by an extensive proteome database developed over six years, we have experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. Although genomic studies can establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits. PMID:18253490
Moretti, Marino; Grunau, Alexander; Minerdi, Daniela; Gehrig, Peter; Roschitzki, Bernd; Eberl, Leo; Garibaldi, Angelo; Gullino, Maria Lodovica; Riedel, Kathrin
2010-09-01
Fusarium oxysporum is an important plant pathogen that causes severe damage to many economically important crop species. Various microorganisms have been shown to inhibit this soil-borne plant pathogen, including non-pathogenic F. oxysporum strains. In this study, F. oxysporum wild-type (WT) MSA 35, a biocontrol multispecies consortium that consists of a fungus and numerous rhizobacteria mainly belonging to gamma-proteobacteria, was analyzed by two complementary metaproteomic approaches (2-DE combined with MALDI-Tof/Tof MS and 1-D PAGE combined with LC-ESI-MS/MS) to identify fungal or bacterial factors potentially involved in antagonistic or synergistic interactions between the consortium members. Moreover, the proteome profiles of F. oxysporum WT MSA 35 and its cured counterpart CU MSA 35 (WT treated with antibiotics) were compared to unravel the bacterial impact on consortium functioning. Our study presents the first proteome mapping of an antagonistic F. oxysporum strain and proposes candidate proteins that might play an important role in the biocontrol activity and the close interrelationship between the fungus and its bacterial partners.
Proteomic analysis of pollination-induced corolla senescence in petunia.
Bai, Shuangyi; Willard, Belinda; Chapin, Laura J; Kinter, Michael T; Francis, David M; Stead, Anthony D; Jones, Michelle L
2010-02-01
Senescence represents the last phase of petal development during which macromolecules and organelles are degraded and nutrients are recycled to developing tissues. To understand better the post-transcriptional changes regulating petal senescence, a proteomic approach was used to profile protein changes during the senescence of Petunia × hybrida 'Mitchell Diploid' corollas. Total soluble proteins were extracted from unpollinated petunia corollas at 0, 24, 48, and 72 h after flower opening and at 24, 48, and 72 h after pollination. Two-dimensional gel electrophoresis (2-DE) was used to identify proteins that were differentially expressed in non-senescing (unpollinated) and senescing (pollinated) corollas, and image analysis was used to determine which proteins were up- or down-regulated by the experimentally determined cut-off of 2.1-fold for P < 0.05. One hundred and thirty-three differentially expressed protein spots were selected for sequencing. Liquid chromatography-tandem mass spectrometry (LC-MS/MS) was used to determine the identity of these proteins. Searching translated EST databases and the NCBI non-redundant protein database, it was possible to assign a putative identification to greater than 90% of these proteins. Many of the senescence up-regulated proteins were putatively involved in defence and stress responses or macromolecule catabolism. Some proteins, not previously characterized during flower senescence, were identified, including an orthologue of the tomato abscisic acid stress ripening protein 4 (ASR4). Gene expression patterns did not always correlate with protein expression, confirming that both proteomic and genomic approaches will be required to obtain a detailed understanding of the regulation of petal senescence.
A mass spectrometry proteomics data management platform.
Sharma, Vagisha; Eng, Jimmy K; Maccoss, Michael J; Riffle, Michael
2012-09-01
Mass spectrometry-based proteomics is increasingly being used in biomedical research. These experiments typically generate a large volume of highly complex data, and the volume and complexity are only increasing with time. There exist many software pipelines for analyzing these data (each typically with its own file formats), and as technology improves, these file formats change and new formats are developed. Files produced from these myriad software programs may accumulate on hard disks or tape drives over time, with older files being rendered progressively more obsolete and unusable with each successive technical advancement and data format change. Although initiatives exist to standardize the file formats used in proteomics, they do not address the core failings of a file-based data management system: (1) files are typically poorly annotated experimentally, (2) files are "organically" distributed across laboratory file systems in an ad hoc manner, (3) file formats become obsolete, and (4) searching the data and comparing and contrasting results across separate experiments is very inefficient (if possible at all). Here we present a relational database architecture and accompanying web application dubbed Mass Spectrometry Data Platform that is designed to address the failings of the file-based mass spectrometry data management approach. The database is designed such that the output of disparate software pipelines may be imported into a core set of unified tables, with these core tables being extended to support data generated by specific pipelines. Because the data are unified, they may be queried, viewed, and compared across multiple experiments using a common web interface. Mass Spectrometry Data Platform is open source and freely available at http://code.google.com/p/msdapl/.
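The "core tables plus pipeline-specific extensions" design can be illustrated with a small in-memory SQLite sketch. The table and column names below are invented for illustration and are not the actual Mass Spectrometry Data Platform schema; the point is only that results imported from different pipelines land in one shared table and can therefore be compared across experiments with a single query.

```python
import sqlite3

# Hypothetical schema: two core tables shared by every pipeline, plus one
# extension table holding a score that only "pipeline_a" produces.
SCHEMA = """
CREATE TABLE experiment (
    id INTEGER PRIMARY KEY,
    annotation TEXT NOT NULL
);
CREATE TABLE peptide_match (
    id INTEGER PRIMARY KEY,
    experiment_id INTEGER NOT NULL REFERENCES experiment(id),
    pipeline TEXT NOT NULL,
    peptide TEXT NOT NULL
);
CREATE TABLE pipeline_a_score (
    match_id INTEGER PRIMARY KEY REFERENCES peptide_match(id),
    score REAL NOT NULL
);
"""


def build_demo():
    """Create the demo database and load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO experiment VALUES (?, ?)",
        [(1, "control"), (2, "treated")],
    )
    conn.executemany(
        "INSERT INTO peptide_match VALUES (?, ?, ?, ?)",
        [
            (1, 1, "pipeline_a", "PEPTIDEK"),
            (2, 2, "pipeline_b", "PEPTIDEK"),
            (3, 2, "pipeline_b", "SAMPLER"),
        ],
    )
    conn.execute("INSERT INTO pipeline_a_score VALUES (1, 42.0)")
    return conn


def shared_peptides(conn):
    """Peptides observed in more than one experiment, whatever the pipeline."""
    rows = conn.execute(
        "SELECT peptide FROM peptide_match GROUP BY peptide "
        "HAVING COUNT(DISTINCT experiment_id) > 1 ORDER BY peptide"
    )
    return [row[0] for row in rows]
```

Calling `shared_peptides(build_demo())` compares two experiments that were processed by different pipelines without touching either pipeline's native files.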
Towards an understanding of wheat chloroplasts: a methodical investigation of thylakoid proteome.
Kamal, Abu Hena Mostafa; Cho, Kun; Komatsu, Setsuko; Uozumi, Nobuyuki; Choi, Jong-Soon; Woo, Sun Hee
2012-05-01
We utilized Percoll density gradient centrifugation to isolate and fractionate chloroplasts of Korean winter wheat cultivar cv. Kumgang (Triticum aestivum L.). The resulting protein fractions were separated by one dimensional polyacrylamide gel electrophoresis (1D-PAGE) coupled with LTQ-FTICR mass spectrometry. This enabled us to detect and identify 767 unique proteins. Our findings represent the most comprehensive exploration of a proteome to date. Based on annotation information from the UniProtKB/Swiss-Prot database and our analyses via WoLF PSORT and PSORT, these proteins are localized in the chloroplast (607 proteins), chloroplast stroma (145), thylakoid membrane (342), lumens (163), and integral membranes (166). In all, 67% were confirmed as chloroplast thylakoid proteins. Although nearly complete protein coverage (89% proteins) has been accomplished for the key chloroplast pathways in wheat, such as for photosynthesis, many other proteins are involved in regulating carbon metabolism. The identified proteins were assigned to 103 functional categories according to a classification system developed by the iProClass database and provided through Protein Information Resources. Those functions include electron transport, energy, cellular organization and biogenesis, transport, stress responses, and other metabolic processes. Whereas most of these proteins are associated with known complexes and metabolic pathways, about 13% of the proteins have unknown functions. The chloroplast proteome contains many proteins that are localized to the thylakoids but as yet have no known function. We propose that some of these familiar proteins participate in the photosynthetic pathway. Thus, our new and comprehensive protein profile may provide clues for better understanding that photosynthetic process in wheat.
Li, Nan; Stein, Richard S L; He, Wei; Komives, Elizabeth; Wang, Wei
2013-10-01
Methylation is one of the important post-translational modifications that play critical roles in regulating protein functions. Proteomic identification of this post-translational modification and understanding how it affects protein activity remain great challenges. We tackled this problem from the aspect of methylation mediating protein-protein interaction. Using the chromodomain of human chromobox protein homolog 6 as a model system, we developed a systematic approach that integrates structure modeling, bioinformatics analysis, and peptide microarray experiments to identify lysine residues that are methylated and recognized by the chromodomain in the human proteome. Given the important role of chromobox protein homolog 6 as a reader of histone modifications, it was interesting to find that the majority of its interacting partners identified via this approach function in chromatin remodeling and transcriptional regulation. Our study not only illustrates a novel angle for identifying methyllysines on a proteome-wide scale and elucidating their potential roles in regulating protein function, but also suggests possible strategies for engineering the chromodomain-peptide interface to enhance the recognition of and manipulate the signal transduction mediated by such interactions.
Cloud parallel processing of tandem mass spectrometry based proteomics data.
Mohammed, Yassene; Mostovenko, Ekaterina; Henneman, Alex A; Marissen, Rob J; Deelder, André M; Palmblad, Magnus
2012-10-05
Data analysis in mass spectrometry based proteomics struggles to keep pace with the advances in instrumentation and the increasing rate of data acquisition. Analyzing this data involves multiple steps requiring diverse software, using different algorithms and data formats. Speed and performance of the mass spectral search engines are continuously improving, although not necessarily as needed to face the challenges of acquired big data. Improving and parallelizing the search algorithms is one possibility; data decomposition presents another, simpler strategy for introducing parallelism. We describe a general method for parallelizing identification of tandem mass spectra using data decomposition that keeps the search engine intact and wraps the parallelization around it. We introduce two algorithms for decomposing mzXML files and recomposing resulting pepXML files. This makes the approach applicable to different search engines, including those relying on sequence databases and those searching spectral libraries. We use cloud computing to deliver the computational power and scientific workflow engines to interface and automate the different processing steps. We show how to leverage these technologies to achieve faster data analysis in proteomics and present three scientific workflows for parallel database as well as spectral library search using our data decomposition programs, X!Tandem and SpectraST.
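The data decomposition strategy can be sketched without any real file parsing: split the spectra into chunks, hand each chunk to an unmodified search routine, and concatenate the per-chunk results in the original order. In the sketch below, `search_chunk` is a hypothetical stand-in for an external search engine, spectra are represented by scan identifiers only, and a thread pool stands in for the cloud workers; none of this is the authors' mzXML or pepXML code.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(spectra, n_chunks):
    """Split spectra into at most n_chunks contiguous chunks, keeping order."""
    size = max(1, -(-len(spectra) // n_chunks))  # ceiling division
    return [spectra[i:i + size] for i in range(0, len(spectra), size)]


def search_chunk(chunk):
    """Hypothetical placeholder for running an unmodified search engine."""
    return [(scan_id, f"hit-for-{scan_id}") for scan_id in chunk]


def parallel_search(spectra, n_chunks):
    """Decompose, search every chunk concurrently, then recompose in order."""
    chunks = decompose(spectra, n_chunks)
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        parts = list(pool.map(search_chunk, chunks))  # map preserves order
    return [hit for part in parts for hit in part]
```

Since the search routine is treated as a black box, swapping in a different engine only means replacing `search_chunk`.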
Turetschek, Reinhard; Lyon, David; Desalegn, Getinet; Kaul, Hans-Peter; Wienkoop, Stefanie
2016-01-01
The proteomic study of non-model organisms, such as many crop plants, is challenging due to the lack of comprehensive genome information. Changing environmental conditions require the study and selection of adapted cultivars. Mutations, inherent to cultivars, hamper protein identification and thus considerably complicate the qualitative and quantitative comparison in large-scale systems biology approaches. With this workflow, cultivar-specific mutations are detected from high-throughput comparative MS analyses, by extracting sequence polymorphisms with de novo sequencing. Stringent criteria are suggested to filter for confident mutations. Subsequently, these polymorphisms complement the initially used database, which is ready to use with any preferred database search algorithm. In our example, we thereby identified 26 specific mutations in two cultivars of Pisum sativum and achieved an increased number (17%) of peptide spectrum matches.
BIG: a large-scale data integration tool for renal physiology.
Zhao, Yue; Yang, Chin-Rang; Raghuram, Viswanathan; Parulekar, Jaya; Knepper, Mark A
2016-10-01
Due to recent advances in high-throughput techniques, we and others have generated multiple proteomic and transcriptomic databases to describe and quantify gene expression, protein abundance, or cellular signaling on the scale of the whole genome/proteome in kidney cells. The existence of so much data from diverse sources raises the following question: "How can researchers find information efficiently for a given gene product over all of these data sets without searching each data set individually?" This is the type of problem that has motivated the "Big-Data" revolution in Data Science, which has driven progress in fields such as marketing. Here we present an online Big-Data tool called BIG (Biological Information Gatherer) that allows users to submit a single online query to obtain all relevant information from all indexed databases. BIG is accessible at http://big.nhlbi.nih.gov/.
Vester, Diana; Rapp, Erdmann; Gade, Dörte; Genzel, Yvonne; Reichl, Udo
2009-06-01
Over the last years virus-host cell interactions were investigated in numerous studies. Viral strategies for evasion of innate immune response, inhibition of cellular protein synthesis and permission of viral RNA and protein production were disclosed. With quantitative proteome technology, comprehensive studies concerning the impact of viruses on the cellular machinery of their host cells at protein level are possible. Therefore, 2-D DIGE and nanoHPLC-nanoESI-MS/MS analysis were used to qualitatively and quantitatively determine the dynamic cellular proteome responses of two mammalian cell lines to human influenza A virus infection. A cell line used for vaccine production (MDCK) was compared with a human lung carcinoma cell line (A549) as a reference model. Analyzing 2-D gels of the proteomes of uninfected and influenza-infected host cells, 16 quantitatively altered protein spots (at least +/-1.7-fold change in relative abundance, p<0.001) were identified for both cell lines. Most significant changes were found for keratins, major components of the cytoskeleton system, and for Mx proteins, interferon-induced key components of the host cell defense. Time series analysis of infection processes allowed the identification of further proteins that are described to be involved in protein synthesis, signal transduction and apoptosis events. Most likely, these proteins are required for supporting functions during influenza viral life cycle or host cell stress response. Quantitative proteome-wide profiling of virus infection can provide insights into complexity and dynamics of virus-host cell interactions and may accelerate antiviral research and support optimization of vaccine manufacturing processes.
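The spot-selection criterion quoted above (at least +/-1.7-fold change, p<0.001) can be written as a small filter. The sketch assumes the common convention that a decrease is reported as a negative fold change (a ratio of 0.5 becomes -2.0); the abstract does not state which convention was used, and the function names are hypothetical.

```python
def signed_fold_change(treated, control):
    """Ratio for increases, negative inverse ratio for decreases."""
    ratio = treated / control
    return ratio if ratio >= 1 else -1 / ratio


def is_altered(treated, control, p_value, min_fold=1.7, alpha=0.001):
    """Apply the fold-change and significance thresholds together."""
    fold = signed_fold_change(treated, control)
    return abs(fold) >= min_fold and p_value < alpha
```

Under this convention a spot whose abundance halves is treated the same as one whose abundance doubles.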
Vijay, Sonam
2014-01-01
Salivary gland proteins of Anopheles mosquitoes offer attractive targets to understand interactions with sporozoites, blood feeding behavior, homeostasis, and immunological evaluation of malaria vectors and parasite interactions. To date limited studies have been carried out to elucidate salivary proteins of An. stephensi salivary glands. The aim of the present study was to provide detailed analytical attributives of functional salivary gland proteins of urban malaria vector An. stephensi. A proteomic approach combining one-dimensional electrophoresis (1DE), ion trap liquid chromatography mass spectrometry (LC/MS/MS), and computational bioinformatic analysis was adopted to provide the first direct insight into identification and functional characterization of known salivary proteins and novel salivary proteins of An. stephensi. Computational studies by online servers, namely, MASCOT and OMSSA algorithms, identified a total of 36 known salivary proteins and 123 novel proteins analysed by LC/MS/MS. This first report describes a baseline proteomic catalogue of 159 salivary proteins belonging to various categories of signal transduction, regulation of blood coagulation cascade, and various immune and energy pathways of An. stephensi sialotranscriptome by mass spectrometry. Our results may serve as basis to provide a putative functional role of proteins in concept of blood feeding, biting behavior, and other aspects of vector-parasite host interactions for parasite development in anopheline mosquitoes. PMID:25126571
Vijay, Sonam; Rawat, Manmeet; Sharma, Arun
2014-01-01
Salivary gland proteins of Anopheles mosquitoes offer attractive targets to understand interactions with sporozoites, blood feeding behavior, homeostasis, and immunological evaluation of malaria vectors and parasite interactions. To date, limited studies have been carried out to elucidate the salivary gland proteins of An. stephensi. The aim of the present study was to provide detailed analytical attributes of functional salivary gland proteins of the urban malaria vector An. stephensi. A proteomic approach combining one-dimensional electrophoresis (1DE), ion trap liquid chromatography mass spectrometry (LC/MS/MS), and computational bioinformatic analysis was adopted to provide the first direct insight into identification and functional characterization of known and novel salivary proteins of An. stephensi. Computational studies by online servers, namely, the MASCOT and OMSSA algorithms, identified a total of 36 known salivary proteins and 123 novel proteins analysed by LC/MS/MS. This first report describes a baseline proteomic catalogue of 159 salivary proteins belonging to various categories of signal transduction, regulation of blood coagulation cascade, and various immune and energy pathways of An. stephensi sialotranscriptome by mass spectrometry. Our results may serve as a basis to provide a putative functional role of proteins in the context of blood feeding, biting behavior, and other aspects of vector-parasite host interactions for parasite development in anopheline mosquitoes.
Definitive screening design enables optimization of LC-ESI-MS/MS parameters in proteomics.
Aburaya, Shunsuke; Aoki, Wataru; Minakuchi, Hiroyoshi; Ueda, Mitsuyoshi
2017-12-01
In proteomics, more than 100,000 peptides are generated from the digestion of human cell lysates. Proteome samples have a broad dynamic range in protein abundance; therefore, it is critical to optimize various parameters of LC-ESI-MS/MS to comprehensively identify these peptides. However, there are many parameters for LC-ESI-MS/MS analysis. In this study, we applied definitive screening design to simultaneously optimize 14 parameters in the operation of monolithic capillary LC-ESI-MS/MS to increase the number of identified proteins and/or the average peak area of MS1. The simultaneous optimization enabled the determination of two-factor interactions between LC and MS. Finally, we found two parameter sets of monolithic capillary LC-ESI-MS/MS that increased the number of identified proteins by 8.1% or the average peak area of MS1 by 67%. The definitive screening design would be highly useful for high-throughput analysis of the best parameter set in LC-ESI-MS/MS systems.
Méplan, Catherine; Johnson, Ian T; Polley, Abigael C J; Cockell, Simon; Bradburn, David M; Commane, Daniel M; Arasaradnam, Ramesh P; Mulholland, Francis; Zupanic, Anze; Mathers, John C; Hesketh, John
2016-08-01
Epidemiologic studies highlight the potential role of dietary selenium (Se) in colorectal cancer prevention. Our goal was to elucidate whether expression of factors crucial for colorectal homoeostasis is affected by physiologic differences in Se status. Using transcriptomics and proteomics followed by pathway analysis, we identified pathways affected by Se status in rectal biopsies from 22 healthy adults, including 11 controls with optimal status (mean plasma Se = 1.43 μM) and 11 subjects with suboptimal status (mean plasma Se = 0.86 μM). We observed that 254 genes and 26 proteins implicated in cancer (80%), immune function and inflammatory response (40%), cell growth and proliferation (70%), cellular movement, and cell death (50%) were differentially expressed between the 2 groups. Expression of 69 genes, including selenoproteins W1 and K, which are genes involved in cytoskeleton remodelling and transcription factor NFκB signaling, correlated significantly with Se status. Integrating proteomics and transcriptomics datasets revealed reduced inflammatory and immune responses and cytoskeleton remodelling in the suboptimal Se status group. This is the first study combining omics technologies to describe the impact of differences in Se status on colorectal expression patterns, revealing that suboptimal Se status could alter inflammatory signaling and cytoskeleton in human rectal mucosa and so influence cancer risk. Méplan, C., Johnson, I. T., Polley, A. C. J., Cockell, S., Bradburn, D. M., Commane, D. M., Arasaradnam, R. P., Mulholland, F., Zupanic, A., Mathers, J. C., Hesketh, J. Transcriptomics and proteomics show that selenium affects inflammation, cytoskeleton, and cancer pathways in human rectal biopsies.
Small cationic antimicrobial peptides delocalize peripheral membrane proteins
Wenzel, Michaela; Chiriac, Alina Iulia; Otto, Andreas; Zweytick, Dagmar; May, Caroline; Schumacher, Catherine; Gust, Ronald; Albada, H. Bauke; Penkova, Maya; Krämer, Ute; Erdmann, Ralf; Metzler-Nolte, Nils; Straus, Suzana K.; Bremer, Erhard; Becher, Dörte; Brötz-Oesterhelt, Heike; Sahl, Hans-Georg; Bandow, Julia Elisabeth
2014-01-01
Short antimicrobial peptides rich in arginine (R) and tryptophan (W) interact with membranes. To learn how this interaction leads to bacterial death, we characterized the effects of the minimal pharmacophore RWRWRW-NH2. A ruthenium-substituted derivative of this peptide localized to the membrane in vivo, and the peptide also integrated readily into mixed phospholipid bilayers that resemble Gram-positive membranes. Proteome and Western blot analyses showed that integration of the peptide caused delocalization of peripheral membrane proteins essential for respiration and cell-wall biosynthesis, limiting cellular energy and undermining cell-wall integrity. This delocalization phenomenon also was observed with the cyclic peptide gramicidin S, indicating the generality of the mechanism. Exogenous glutamate increases tolerance to the peptide, indicating that osmotic destabilization also contributes to antibacterial efficacy. Bacillus subtilis responds to peptide stress by releasing osmoprotective amino acids, in part via mechanosensitive channels. This response is triggered by membrane-targeting bacteriolytic peptides of different structural classes as well as by hypoosmotic conditions. PMID:24706874
Identifying the missing proteins in human proteome by biological language model.
Dong, Qiwen; Wang, Kai; Liu, Xuan
2016-12-23
With the rapid development of high-throughput sequencing technology, proteomics has become a prominent field in the post-genomic era. It is necessary to identify all the natively encoded protein sequences for further function and pathway analysis. Toward that end, the Human Proteome Organization launched the Human Proteome Project in 2011. However, many proteins are hard to detect by experimental methods, which has become one of the bottlenecks in the Human Proteome Project. Given the difficulty of detecting these missing proteins with wet-lab approaches, here we use a bioinformatics method to pre-filter the missing proteins. Since there is an analogy between biological sequences and natural language, n-gram models from the field of Natural Language Processing have been used to filter the missing proteins. The dataset used in this study contains 616 missing proteins from the "uncertain" category of the neXtProt database. The n-gram model deduced 102 proteins that have a high probability of being native human proteins. We perform a detailed analysis of the predicted structure and function of these missing proteins and also compare the high-probability proteins with other mass spectrometry datasets. The evaluation shows that the results reported here are in good agreement with those obtained from other well-established databases. The analysis shows that these 102 proteins may be native gene-coded proteins and that some of the missing proteins are membrane or natively disordered proteins, which are hard to detect by experimental methods.
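The idea of scoring protein sequences with an n-gram language model can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the order of the model, the smoothing scheme, or the decision threshold, so the bigram default, the add-one smoothing, and the function names `train_ngram` and `mean_log_prob` are hypothetical choices.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def train_ngram(sequences, n=2):
    """Count n-grams and their (n-1)-residue contexts in reference sequences."""
    ngram_counts = Counter()
    context_counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            gram = seq[i:i + n]
            ngram_counts[gram] += 1
            context_counts[gram[:-1]] += 1
    return ngram_counts, context_counts


def mean_log_prob(seq, model, n=2, alphabet_size=len(AMINO_ACIDS)):
    """Average log-probability per n-gram, with add-one smoothing (assumed)."""
    ngram_counts, context_counts = model
    steps = len(seq) - n + 1
    if steps <= 0:
        raise ValueError("sequence is shorter than the n-gram order")
    total = 0.0
    for i in range(steps):
        gram = seq[i:i + n]
        prob = (ngram_counts[gram] + 1) / (context_counts[gram[:-1]] + alphabet_size)
        total += math.log(prob)
    return total / steps


if __name__ == "__main__":
    # Toy reference set, not real proteome data.
    model = train_ngram(["ACACACAC"])
    print(mean_log_prob("ACAC", model), mean_log_prob("WWWW", model))
```

A candidate sequence whose average log-probability is close to that of confirmed proteins would be ranked as more likely to be a native protein; where to place the cut-off is a separate calibration step that this sketch does not attempt.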
Islam, Mohammad Tawhidul; Mohamedali, Abidali; Ahn, Seong Beom; Nawar, Ishmam; Baker, Mark S; Ranganathan, Shoba
2017-01-01
In the past decade, proteomics and mass spectrometry have taken tremendous strides forward, particularly in the life sciences, spurred on by rapid advances in technology resulting in generation and conglomeration of vast amounts of data. Though this has led to tremendous advancements in biology, the interpretation of the data poses serious challenges for many practitioners due to the immense size and complexity of the data. Furthermore, the lack of annotation means that a potential gold mine of relevant biological information may be hiding within this data. We present here a simple and intuitive workflow for the research community to investigate and mine this data, not only to extract relevant data but also to segregate usable, quality data to develop hypotheses for investigation and validation. We apply an MS evidence workflow for verifying peptides of proteins from one's own data as well as publicly available databases. We then integrate a suite of freely available bioinformatics analysis and annotation software tools to identify homologues and map putative functional signatures, gene ontology and biochemical pathways. We also provide an example of the functional annotation of missing proteins in human chromosome 7 data from the NeXtProt database, where no evidence is available at the proteomic, antibody, or structural levels. We give examples of protocols, tools and detailed flowcharts that can be extended or tailored to interpret and annotate the proteome of any novel organism.
iTRAQ Quantitative Proteomic Comparison of Metastatic and Non-Metastatic Uveal Melanoma Tumors
Crabb, John W.; Hu, Bo; Crabb, John S.; Triozzi, Pierre; Saunthararajah, Yogen; Singh, Arun D.
2015-01-01
Background Uveal melanoma is the most common malignancy of the adult eye. The overall mortality rate is high because this aggressive cancer often metastasizes before ophthalmic diagnosis. Quantitative proteomic analysis of primary metastasizing and non-metastasizing tumors was pursued for insights into mechanisms and biomarkers of uveal melanoma metastasis. Methods Eight metastatic and 7 non-metastatic human primary uveal melanoma tumors were analyzed by LC-MS/MS iTRAQ technology with Bruch’s membrane/choroid complex from normal postmortem eyes as control tissue. Tryptic peptides from tumor and control proteins were labeled with iTRAQ tags, fractionated by cation exchange chromatography, and analyzed by LC-MS/MS. Protein identification utilized the Mascot search engine and the human UniProt/Swiss-Prot database with a false discovery rate ≤ 1%; protein quantitation utilized the Mascot weighted average method. Proteins designated differentially expressed exhibited quantitative differences (p ≤ 0.05, t-test) in a training set of five metastatic and five non-metastatic tumors. Logistic regression models developed from the training set were used to classify the metastatic status of five independent tumors. Results Of 1644 proteins identified and quantified in 5 metastatic and 5 non-metastatic tumors, 12 proteins were found uniquely in ≥ 3 metastatic tumors, 28 were found significantly elevated and 30 significantly decreased only in metastatic tumors, and 31 were designated differentially expressed between metastatic and non-metastatic tumors. Logistic regression modeling of differentially expressed collagen alpha-3(VI) and heat shock protein beta-1 allowed correct prediction of metastasis status for each of five independent tumor specimens. Conclusions The present data provide new clues to molecular differences in metastatic and non-metastatic uveal melanoma tumors.
While sample size is limited and validation required, the results support collagen alpha-3(VI) and heat shock protein beta-1 as candidate biomarkers of uveal melanoma metastasis and establish a quantitative proteomic database for uveal melanoma primary tumors. PMID:26305875
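The classification step described in this abstract, a logistic regression model built from two protein measurements on a small training set and then applied to independent samples, can be sketched in plain Python. Everything specific here is an assumption: the feature values are synthetic standardized abundances for two unnamed proteins, the batch gradient-descent fit and its hyperparameters are illustrative, and the function names are hypothetical. It does not reproduce the study's model or data.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(features, labels, learning_rate=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(features[0])
    n_samples = len(features)
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = sigmoid(score) - y
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        weights = [w - learning_rate * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def predict(model, x):
    """Return 1 (metastatic) if the predicted probability is at least 0.5."""
    weights, bias = model
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0


# Synthetic training set: five samples per class, two standardized features.
train_x = [
    (1.0, -1.0), (1.5, -0.5), (0.8, -1.2), (1.2, -0.7), (0.9, -1.4),  # label 1
    (-1.0, 1.0), (-0.7, 1.3), (-1.4, 0.6), (-1.1, 0.9), (-0.6, 1.1),  # label 0
]
train_y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

if __name__ == "__main__":
    model = fit_logistic(train_x, train_y)
    print(predict(model, (1.1, -0.8)), predict(model, (-0.9, 1.2)))
```

With so few training samples, a model of this kind is easy to overfit, which is consistent with the abstract's own caution that validation on larger cohorts is required.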
Xu, Yu; Wang, Hong; Nussinov, Ruth; Ma, Buyong
2013-01-01
We constructed and simulated a ‘minimal proteome’ model using Langevin dynamics. It contains 206 essential protein types which were compiled from the literature. For comparison, we generated six proteomes with randomized concentrations. We found that the net charges and molecular weights of the proteins in the minimal genome are not random. The net charge of a protein decreases linearly with molecular weight, with small proteins being mostly positively charged and large proteins negatively charged. The protein copy numbers in the minimal genome tend to maximize the number of protein-protein interactions in the network. Negatively charged proteins, which tend to be larger, provide a large collision cross-section that allows them to interact with other proteins; the smaller, positively charged proteins, on the other hand, could diffuse faster and are therefore more likely to collide with other proteins. Proteomes with random charge/mass populations form less stable clusters than those with experimental protein copy numbers. Our study suggests that ‘proper’ populations of negatively and positively charged proteins are important for maintaining a protein-protein interaction network in a proteome. It is interesting to note that the minimal genome model based on the charge and mass of E. coli may have a larger protein-protein interaction network than that based on the lower organism M. pneumoniae. PMID:23420643
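For readers unfamiliar with Langevin dynamics, one update step can be sketched as below. The abstract does not state which integrator, units, or force field were used, so this sketch assumes the overdamped limit (Brownian dynamics) in one dimension; the function names and parameter values are hypothetical.

```python
import math
import random


def brownian_step(position, force, diffusion, kT, dt, rng):
    """Advance one coordinate by one overdamped Langevin step.

    x(t + dt) = x(t) + (D / kT) * F * dt + sqrt(2 * D * dt) * xi,  xi ~ N(0, 1)
    """
    drift = (diffusion / kT) * force * dt
    noise = math.sqrt(2.0 * diffusion * dt) * rng.gauss(0.0, 1.0)
    return position + drift + noise


def mean_squared_displacement(n_particles, n_steps, diffusion, dt, seed=0):
    """Simulate free 1-D diffusion (zero force) and return the final MSD."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_particles):
        x = 0.0
        for _ in range(n_steps):
            x = brownian_step(x, 0.0, diffusion, 1.0, dt, rng)
        total += x * x
    return total / n_particles


if __name__ == "__main__":
    # For free diffusion the MSD should be close to 2 * D * t per dimension.
    print(mean_squared_displacement(2000, 100, diffusion=1.0, dt=0.01))
```

In this scheme a larger diffusion coefficient produces both faster drift under a given force and larger random displacements, which is the sense in which smaller proteins are described above as more likely to collide with others.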
Medina-Aunon, J. Alberto; Martínez-Bartolomé, Salvador; López-García, Miguel A.; Salazar, Emilio; Navajas, Rosana; Jones, Andrew R.; Paradela, Alberto; Albar, Juan P.
2011-01-01
The development of the HUPO-PSI's (Proteomics Standards Initiative) standard data formats and MIAPE (Minimum Information About a Proteomics Experiment) guidelines should improve proteomics data sharing within the scientific community. Proteomics journals have encouraged the use of these standards and guidelines to improve the quality of experimental reporting and ease the evaluation and publication of manuscripts. However, there is an evident lack of bioinformatics tools specifically designed to create and edit standard file formats and reports, or embed them within proteomics workflows. In this article, we describe a new web-based software suite (The ProteoRed MIAPE web toolkit) that performs several complementary roles related to proteomic data standards. First, it can verify that the reports fulfill the minimum information requirements of the corresponding MIAPE modules, highlighting inconsistencies or missing information. Second, the toolkit can convert several XML-based data standards directly into human readable MIAPE reports stored within the ProteoRed MIAPE repository. Finally, it can also perform the reverse operation, allowing users to export from MIAPE reports into XML files for computational processing, data sharing, or public database submission. The toolkit is thus the first application capable of automatically linking the PSI's MIAPE modules with the corresponding XML data exchange standards, enabling bidirectional conversions. This toolkit is freely available at http://www.proteored.org/MIAPE/. PMID:21983993
Thiele, Herbert; Glandorf, Jörg; Hufnagel, Peter
2010-05-27
With the large variety of proteomics workflows, as well as the large variety of instruments and data-analysis software available, researchers today face major challenges validating and comparing their proteomics data. Here we present a new generation of the ProteinScape bioinformatics platform, now enabling researchers to manage proteomics data from generation and data warehousing to a central data repository, with a strong focus on the improved accuracy, reproducibility and comparability demanded by many researchers in the field. It addresses scientists' current needs in proteomics identification, quantification and validation. But producing large protein lists is not the end point in proteomics, where one ultimately aims to answer specific questions about the biological condition or disease model of the analyzed sample. In this context, a new tool termed PIKE (Protein information and Knowledge Extractor) has been developed at the Spanish Centro Nacional de Biotecnologia Proteomics Facility; it allows researchers to control, filter and access specific information from genomics and proteomics databases in order to understand the role and relationships of the proteins identified in the experiments. Additionally, an EU-funded project, ProDac, has coordinated systematic data collection in public standards-compliant repositories like PRIDE. This will cover all aspects from generating MS data in the laboratory and assembling the whole annotation information to storing it together with identifications in a standardised format.
Huang, Jingwei; Liu, Tingqi; Li, Ke; Song, Xiaokai; Yan, Ruofeng; Xu, Lixin; Li, Xiangrui
2018-04-04
Eimeria maxima initiates infection by invading the jejunal epithelial cells of chickens. However, the proteins involved in invasion remain unknown. Research on the molecules that participate in the interactions between E. maxima sporozoites and host target cells will fill a gap in our understanding of the invasion system of this parasitic pathogen. In the present study, chicken jejunal epithelial cells were isolated and cultured in vitro. Western blotting was employed to analyze the soluble proteins of E. maxima sporozoites that bound to chicken jejunal epithelial cells. A co-immunoprecipitation (co-IP) assay was used to separate the E. maxima proteins that bound to chicken jejunal epithelial cells. A shotgun LC-MS/MS technique was used for proteomic identification and Gene Ontology was employed for the bioinformatics analysis. The results of Western blot analysis showed that four protein bands from jejunal epithelial cells co-cultured with soluble proteins of E. maxima sporozoites were recognized by the positive sera, with molecular weights of 70, 90, 95 and 130 kDa. The co-IP dilutions were analyzed by shotgun LC-MS/MS. A total of 204 proteins were identified in the E. maxima protein database using the MASCOT search engine. Thirty-five proteins, including microneme proteins 3 and 7, had more than two unique peptide counts and were annotated using Gene Ontology for molecular function, biological process and cellular localization. The results revealed that of the 35 annotated proteins, 22 (62.86%) were associated with binding activity and 15 (42.86%) were involved in catalytic activity. Our findings provide insight into the interaction between E. maxima and the corresponding host cells and are important for understanding the molecular mechanisms underlying E. maxima invasion.