Sample records for large-scale proteomics analysis

  1. Large-Scale and Deep Quantitative Proteome Profiling Using Isobaric Labeling Coupled with Two-Dimensional LC-MS/MS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gritsenko, Marina A.; Xu, Zhe; Liu, Tao

    Comprehensive, quantitative information on abundances of proteins and their post-translational modifications (PTMs) can potentially provide novel biological insights into diseases pathogenesis and therapeutic intervention. Herein, we introduce a quantitative strategy utilizing isobaric stable isotope-labelling techniques combined with two-dimensional liquid chromatography-tandem mass spectrometry (2D-LC-MS/MS) for large-scale, deep quantitative proteome profiling of biological samples or clinical specimens such as tumor tissues. The workflow includes isobaric labeling of tryptic peptides for multiplexed and accurate quantitative analysis, basic reversed-phase LC fractionation and concatenation for reduced sample complexity, and nano-LC coupled to high resolution and high mass accuracy MS analysis for high confidence identification and quantification of proteins. This proteomic analysis strategy has been successfully applied for in-depth quantitative proteomic analysis of tumor samples, and can also be used for integrated proteome and PTM characterization, as well as comprehensive quantitative proteomic analysis across samples from large clinical cohorts.

  2. Large-Scale and Deep Quantitative Proteome Profiling Using Isobaric Labeling Coupled with Two-Dimensional LC-MS/MS.

    PubMed

    Gritsenko, Marina A; Xu, Zhe; Liu, Tao; Smith, Richard D

    2016-01-01

    Comprehensive, quantitative information on abundances of proteins and their posttranslational modifications (PTMs) can potentially provide novel biological insights into diseases pathogenesis and therapeutic intervention. Herein, we introduce a quantitative strategy utilizing isobaric stable isotope-labeling techniques combined with two-dimensional liquid chromatography-tandem mass spectrometry (2D-LC-MS/MS) for large-scale, deep quantitative proteome profiling of biological samples or clinical specimens such as tumor tissues. The workflow includes isobaric labeling of tryptic peptides for multiplexed and accurate quantitative analysis, basic reversed-phase LC fractionation and concatenation for reduced sample complexity, and nano-LC coupled to high resolution and high mass accuracy MS analysis for high confidence identification and quantification of proteins. This proteomic analysis strategy has been successfully applied for in-depth quantitative proteomic analysis of tumor samples and can also be used for integrated proteome and PTM characterization, as well as comprehensive quantitative proteomic analysis across samples from large clinical cohorts.
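
    The "fractionation and concatenation" step in the abstract above can be illustrated with a small sketch. The abstract gives no fraction counts, so the 96 fractions and 24 pools below are assumptions chosen only for illustration, and `concatenate_fractions` is a hypothetical helper, not part of the published workflow. The idea shown is the common pooling pattern in which fractions spaced evenly across the gradient are combined, so each pool mixes early-, middle-, and late-eluting peptides.

```python
def concatenate_fractions(n_fractions, n_pools):
    """Assign early, middle, and late fractions to the same pool.

    Fraction i (0-based) goes to pool i % n_pools, so each pool combines
    fractions spaced n_pools apart across the first-dimension gradient.
    """
    if n_pools <= 0 or n_fractions < n_pools:
        raise ValueError("need 0 < n_pools <= n_fractions")
    pools = [[] for _ in range(n_pools)]
    for i in range(n_fractions):
        pools[i % n_pools].append(i + 1)  # report 1-based fraction numbers
    return pools


# Assumed example sizes: 96 collected fractions pooled into 24 samples.
pools = concatenate_fractions(96, 24)
print(pools[0])  # [1, 25, 49, 73]
```

    Pooling this way reduces the number of second-dimension LC-MS/MS runs while keeping each pooled sample spread across the whole separation range.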

  3. CPTAC | Office of Cancer Clinical Proteomics Research

    Cancer.gov

    The National Cancer Institute’s Clinical Proteomic Tumor Analysis Consortium (CPTAC) is a national effort to accelerate the understanding of the molecular basis of cancer through the application of large-scale proteome and genome analysis, or proteogenomics.

  4. HiQuant: Rapid Postquantification Analysis of Large-Scale MS-Generated Proteomics Data.

    PubMed

    Bryan, Kenneth; Jarboui, Mohamed-Ali; Raso, Cinzia; Bernal-Llinares, Manuel; McCann, Brendan; Rauch, Jens; Boldt, Karsten; Lynn, David J

    2016-06-03

    Recent advances in mass-spectrometry-based proteomics are now facilitating ambitious large-scale investigations of the spatial and temporal dynamics of the proteome; however, the increasing size and complexity of these data sets is overwhelming current downstream computational methods, specifically those that support the postquantification analysis pipeline. Here we present HiQuant, a novel application that enables the design and execution of a postquantification workflow, including common data-processing steps, such as assay normalization and grouping, and experimental replicate quality control and statistical analysis. HiQuant also enables the interpretation of results generated from large-scale data sets by supporting interactive heatmap analysis and also the direct export to Cytoscape and Gephi, two leading network analysis platforms. HiQuant may be run via a user-friendly graphical interface and also supports complete one-touch automation via a command-line mode. We evaluate HiQuant's performance by analyzing a large-scale, complex interactome mapping data set and demonstrate a 200-fold improvement in the execution time over current methods. We also demonstrate HiQuant's general utility by analyzing proteome-wide quantification data generated from both a large-scale public tyrosine kinase siRNA knock-down study and an in-house investigation into the temporal dynamics of the KSR1 and KSR2 interactomes. Download HiQuant, sample data sets, and supporting documentation at http://hiquant.primesdb.eu.

  5. Proteomics wants cRacker: automated standardized data analysis of LC-MS derived proteomic data.

    PubMed

    Zauber, Henrik; Schulze, Waltraud X

    2012-11-02

    The large-scale analysis of thousands of proteins under various experimental conditions or in mutant lines has gained more and more importance in hypothesis-driven scientific research and systems biology in the past years. Quantitative analysis by large scale proteomics using modern mass spectrometry usually results in long lists of peptide ion intensities. The main interest for most researchers, however, is to draw conclusions on the protein level. Postprocessing and combining peptide intensities of a proteomic data set requires expert knowledge, and the often repetitive and standardized manual calculations can be time-consuming. The analysis of complex samples can result in very large data sets (lists with several 1000s to 100,000 entries of different peptides) that cannot easily be analyzed using standard spreadsheet programs. To improve speed and consistency of the data analysis of LC-MS derived proteomic data, we developed cRacker. cRacker is an R-based program for automated downstream proteomic data analysis including data normalization strategies for metabolic labeling and label free quantitation. In addition, cRacker includes basic statistical analysis, such as clustering of data, or ANOVA and t tests for comparison between treatments. Results are presented in editable graphic formats and in list files.
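
    The step from "long lists of peptide ion intensities" to protein-level values can be sketched as follows. This is a minimal illustration, not cRacker's algorithm: the abstract does not specify its normalization, so the sketch assumes a simple scheme (divide each intensity by its sample's total, then average the normalized peptides of each protein). The function name and the input layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def protein_abundance(peptide_rows):
    """Roll peptide ion intensities up to protein level.

    peptide_rows: list of (protein_id, peptide, {sample: intensity}).
    Each intensity is first divided by its sample's total intensity
    (assumed normalization), then the normalized peptide values are
    averaged per protein and sample.
    """
    # Total intensity per sample, used as the normalization factor.
    totals = defaultdict(float)
    for _, _, intensities in peptide_rows:
        for sample, value in intensities.items():
            totals[sample] += value

    # Collect normalized peptide values per protein and sample.
    per_protein = defaultdict(lambda: defaultdict(list))
    for protein, _, intensities in peptide_rows:
        for sample, value in intensities.items():
            per_protein[protein][sample].append(value / totals[sample])

    return {
        protein: {s: mean(vals) for s, vals in samples.items()}
        for protein, samples in per_protein.items()
    }


# Made-up toy data: two peptides for P1, one for P2, two samples.
rows = [
    ("P1", "AAK", {"ctrl": 100.0, "treat": 300.0}),
    ("P1", "CCR", {"ctrl": 300.0, "treat": 300.0}),
    ("P2", "DDK", {"ctrl": 600.0, "treat": 400.0}),
]
print(protein_abundance(rows))
```

    A peptide missing from a sample is simply left out of that sample's average here; real tools usually make that choice configurable.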

  6. Analyzing large-scale proteomics projects with latent semantic indexing.

    PubMed

    Klie, Sebastian; Martens, Lennart; Vizcaíno, Juan Antonio; Côté, Richard; Jones, Phil; Apweiler, Rolf; Hinneburg, Alexander; Hermjakob, Henning

    2008-01-01

    Since the advent of public data repositories for proteomics data, readily accessible results from high-throughput experiments have been accumulating steadily. Several large-scale projects in particular have contributed substantially to the amount of identifications available to the community. Despite the considerable body of information amassed, very few successful analyses have been performed and published on this data, leveling off the ultimate value of these projects far below their potential. A prominent reason published proteomics data is seldom reanalyzed lies in the heterogeneous nature of the original sample collection and the subsequent data recording and processing. To illustrate that at least part of this heterogeneity can be compensated for, we here apply a latent semantic analysis to the data contributed by the Human Proteome Organization's Plasma Proteome Project (HUPO PPP). Interestingly, despite the broad spectrum of instruments and methodologies applied in the HUPO PPP, our analysis reveals several obvious patterns that can be used to formulate concrete recommendations for optimizing proteomics project planning as well as the choice of technologies used in future experiments. It is clear from these results that the analysis of large bodies of publicly available proteomics data by noise-tolerant algorithms such as the latent semantic analysis holds great promise and is currently underexploited.

  7. Processing Shotgun Proteomics Data on the Amazon Cloud with the Trans-Proteomic Pipeline

    PubMed Central

    Slagel, Joseph; Mendoza, Luis; Shteynberg, David; Deutsch, Eric W.; Moritz, Robert L.

    2015-01-01

    Cloud computing, where scalable, on-demand compute cycles and storage are available as a service, has the potential to accelerate mass spectrometry-based proteomics research by providing simple, expandable, and affordable large-scale computing to all laboratories regardless of location or information technology expertise. We present new cloud computing functionality for the Trans-Proteomic Pipeline, a free and open-source suite of tools for the processing and analysis of tandem mass spectrometry datasets. Enabled with Amazon Web Services cloud computing, the Trans-Proteomic Pipeline now accesses large scale computing resources, limited only by the available Amazon Web Services infrastructure, for all users. The Trans-Proteomic Pipeline runs in an environment fully hosted on Amazon Web Services, where all software and data reside on cloud resources to tackle large search studies. In addition, it can also be run on a local computer with computationally intensive tasks launched onto the Amazon Elastic Compute Cloud service to greatly decrease analysis times. We describe the new Trans-Proteomic Pipeline cloud service components, compare the relative performance and costs of various Elastic Compute Cloud service instance types, and present on-line tutorials that enable users to learn how to deploy cloud computing technology rapidly with the Trans-Proteomic Pipeline. We provide tools for estimating the necessary computing resources and costs given the scale of a job and demonstrate the use of cloud enabled Trans-Proteomic Pipeline by performing over 1100 tandem mass spectrometry files through four proteomic search engines in 9 h and at a very low cost. PMID:25418363

  8. Processing shotgun proteomics data on the Amazon cloud with the trans-proteomic pipeline.

    PubMed

    Slagel, Joseph; Mendoza, Luis; Shteynberg, David; Deutsch, Eric W; Moritz, Robert L

    2015-02-01

    Cloud computing, where scalable, on-demand compute cycles and storage are available as a service, has the potential to accelerate mass spectrometry-based proteomics research by providing simple, expandable, and affordable large-scale computing to all laboratories regardless of location or information technology expertise. We present new cloud computing functionality for the Trans-Proteomic Pipeline, a free and open-source suite of tools for the processing and analysis of tandem mass spectrometry datasets. Enabled with Amazon Web Services cloud computing, the Trans-Proteomic Pipeline now accesses large scale computing resources, limited only by the available Amazon Web Services infrastructure, for all users. The Trans-Proteomic Pipeline runs in an environment fully hosted on Amazon Web Services, where all software and data reside on cloud resources to tackle large search studies. In addition, it can also be run on a local computer with computationally intensive tasks launched onto the Amazon Elastic Compute Cloud service to greatly decrease analysis times. We describe the new Trans-Proteomic Pipeline cloud service components, compare the relative performance and costs of various Elastic Compute Cloud service instance types, and present on-line tutorials that enable users to learn how to deploy cloud computing technology rapidly with the Trans-Proteomic Pipeline. We provide tools for estimating the necessary computing resources and costs given the scale of a job and demonstrate the use of cloud enabled Trans-Proteomic Pipeline by performing over 1100 tandem mass spectrometry files through four proteomic search engines in 9 h and at a very low cost. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.

  9. Low Cost, Scalable Proteomics Data Analysis Using Amazon's Cloud Computing Services and Open Source Search Algorithms

    PubMed Central

    Halligan, Brian D.; Geiger, Joey F.; Vallejos, Andrew K.; Greene, Andrew S.; Twigger, Simon N.

    2009-01-01

    One of the major difficulties for many laboratories setting up proteomics programs has been obtaining and maintaining the computational infrastructure required for the analysis of the large flow of proteomics data. We describe a system that combines distributed cloud computing and open source software to allow laboratories to set up scalable virtual proteomics analysis clusters without the investment in computational hardware or software licensing fees. Additionally, the pricing structure of distributed computing providers, such as Amazon Web Services, allows laboratories or even individuals to have large-scale computational resources at their disposal at a very low cost per run. We provide detailed step by step instructions on how to implement the virtual proteomics analysis clusters as well as a list of current available preconfigured Amazon machine images containing the OMSSA and X!Tandem search algorithms and sequence databases on the Medical College of Wisconsin Proteomics Center website (http://proteomics.mcw.edu/vipdac). PMID:19358578

  10. Low cost, scalable proteomics data analysis using Amazon's cloud computing services and open source search algorithms.

    PubMed

    Halligan, Brian D; Geiger, Joey F; Vallejos, Andrew K; Greene, Andrew S; Twigger, Simon N

    2009-06-01

    One of the major difficulties for many laboratories setting up proteomics programs has been obtaining and maintaining the computational infrastructure required for the analysis of the large flow of proteomics data. We describe a system that combines distributed cloud computing and open source software to allow laboratories to set up scalable virtual proteomics analysis clusters without the investment in computational hardware or software licensing fees. Additionally, the pricing structure of distributed computing providers, such as Amazon Web Services, allows laboratories or even individuals to have large-scale computational resources at their disposal at a very low cost per run. We provide detailed step-by-step instructions on how to implement the virtual proteomics analysis clusters as well as a list of current available preconfigured Amazon machine images containing the OMSSA and X!Tandem search algorithms and sequence databases on the Medical College of Wisconsin Proteomics Center Web site (http://proteomics.mcw.edu/vipdac).

  11. CPTAC researchers report first large-scale integrated proteomic and genomic analysis of a human cancer | Office of Cancer Clinical Proteomics Research

    Cancer.gov

    Investigators from the National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium (CPTAC) who comprehensively analyzed 95 human colorectal tumor samples, have determined how gene alterations identified in previous analyses of the same samples are expressed at the protein level. The integration of proteomic and genomic data, or proteogenomics, provides a more comprehensive view of the biological features that drive cancer than genomic analysis alone and may help identify the most important targets for cancer detection and intervention.

  12. Enrichment and separation techniques for large-scale proteomics analysis of the protein post-translational modifications.

    PubMed

    Huang, Junfeng; Wang, Fangjun; Ye, Mingliang; Zou, Hanfa

    2014-11-06

    Comprehensive analysis of the post-translational modifications (PTMs) on proteins at proteome level is crucial to elucidate the regulatory mechanisms of various biological processes. In the past decades, thanks to the development of specific PTM enrichment techniques and efficient multidimensional liquid chromatography (LC) separation strategy, the identification of protein PTMs have made tremendous progress. A huge number of modification sites for some major protein PTMs have been identified by proteomics analysis. In this review, we first introduced the recent progresses of PTM enrichment methods for the analysis of several major PTMs including phosphorylation, glycosylation, ubiquitination, acetylation, methylation, and oxidation/reduction status. We then briefly summarized the challenges for PTM enrichment. Finally, we introduced the fractionation and separation techniques for efficient separation of PTM peptides in large-scale PTM analysis. Copyright © 2014 Elsevier B.V. All rights reserved.

  13. ApoptoProteomics, an integrated database for analysis of proteomics data obtained from apoptotic cells.

    PubMed

    Arntzen, Magnus Ø; Thiede, Bernd

    2012-02-01

    Apoptosis is the most commonly described form of programmed cell death, and dysfunction is implicated in a large number of human diseases. Many quantitative proteome analyses of apoptosis have been performed to gain insight in proteins involved in the process. This resulted in large and complex data sets that are difficult to evaluate. Therefore, we developed the ApoptoProteomics database for storage, browsing, and analysis of the outcome of large scale proteome analyses of apoptosis derived from human, mouse, and rat. The proteomics data of 52 publications were integrated and unified with protein annotations from UniProt-KB, the caspase substrate database homepage (CASBAH), and gene ontology. Currently, more than 2300 records of more than 1500 unique proteins were included, covering a large proportion of the core signaling pathways of apoptosis. Analysis of the data set revealed a high level of agreement between the reported changes in directionality reported in proteomics studies and expected apoptosis-related function and may disclose proteins without a current recognized involvement in apoptosis based on gene ontology. Comparison between induction of apoptosis by the intrinsic and the extrinsic apoptotic signaling pathway revealed slight differences. Furthermore, proteomics has significantly contributed to the field of apoptosis in identifying hundreds of caspase substrates. The database is available at http://apoptoproteomics.uio.no.

  14. ApoptoProteomics, an Integrated Database for Analysis of Proteomics Data Obtained from Apoptotic Cells

    PubMed Central

    Arntzen, Magnus Ø.; Thiede, Bernd

    2012-01-01

    Apoptosis is the most commonly described form of programmed cell death, and dysfunction is implicated in a large number of human diseases. Many quantitative proteome analyses of apoptosis have been performed to gain insight in proteins involved in the process. This resulted in large and complex data sets that are difficult to evaluate. Therefore, we developed the ApoptoProteomics database for storage, browsing, and analysis of the outcome of large scale proteome analyses of apoptosis derived from human, mouse, and rat. The proteomics data of 52 publications were integrated and unified with protein annotations from UniProt-KB, the caspase substrate database homepage (CASBAH), and gene ontology. Currently, more than 2300 records of more than 1500 unique proteins were included, covering a large proportion of the core signaling pathways of apoptosis. Analysis of the data set revealed a high level of agreement between the reported changes in directionality reported in proteomics studies and expected apoptosis-related function and may disclose proteins without a current recognized involvement in apoptosis based on gene ontology. Comparison between induction of apoptosis by the intrinsic and the extrinsic apoptotic signaling pathway revealed slight differences. Furthermore, proteomics has significantly contributed to the field of apoptosis in identifying hundreds of caspase substrates. The database is available at http://apoptoproteomics.uio.no. PMID:22067098

  15. Systems Proteomics for Translational Network Medicine

    PubMed Central

    Arrell, D. Kent; Terzic, Andre

    2012-01-01

    Universal principles underlying network science, and their ever-increasing applications in biomedicine, underscore the unprecedented capacity of systems biology based strategies to synthesize and resolve massive high throughput generated datasets. Enabling previously unattainable comprehension of biological complexity, systems approaches have accelerated progress in elucidating disease prediction, progression, and outcome. Applied to the spectrum of states spanning health and disease, network proteomics establishes a collation, integration, and prioritization algorithm to guide mapping and decoding of proteome landscapes from large-scale raw data. Providing unparalleled deconvolution of protein lists into global interactomes, integrative systems proteomics enables objective, multi-modal interpretation at molecular, pathway, and network scales, merging individual molecular components, their plurality of interactions, and functional contributions for systems comprehension. As such, network systems approaches are increasingly exploited for objective interpretation of cardiovascular proteomics studies. Here, we highlight network systems proteomic analysis pipelines for integration and biological interpretation through protein cartography, ontological categorization, pathway and functional enrichment and complex network analysis. PMID:22896016

  16. Trans-Proteomic Pipeline, a standardized data processing pipeline for large-scale reproducible proteomics informatics

    PubMed Central

    Deutsch, Eric W.; Mendoza, Luis; Shteynberg, David; Slagel, Joseph; Sun, Zhi; Moritz, Robert L.

    2015-01-01

    Democratization of genomics technologies has enabled the rapid determination of genotypes. More recently the democratization of comprehensive proteomics technologies is enabling the determination of the cellular phenotype and the molecular events that define its dynamic state. Core proteomic technologies include mass spectrometry to define protein sequence, protein:protein interactions, and protein post-translational modifications. Key enabling technologies for proteomics are bioinformatic pipelines to identify, quantitate, and summarize these events. The Trans-Proteomics Pipeline (TPP) is a robust open-source standardized data processing pipeline for large-scale reproducible quantitative mass spectrometry proteomics. It supports all major operating systems and instrument vendors via open data formats. Here we provide a review of the overall proteomics workflow supported by the TPP, its major tools, and how it can be used in its various modes from desktop to cloud computing. We describe new features for the TPP, including data visualization functionality. We conclude by describing some common perils that affect the analysis of tandem mass spectrometry datasets, as well as some major upcoming features. PMID:25631240

  17. Trans-Proteomic Pipeline, a standardized data processing pipeline for large-scale reproducible proteomics informatics.

    PubMed

    Deutsch, Eric W; Mendoza, Luis; Shteynberg, David; Slagel, Joseph; Sun, Zhi; Moritz, Robert L

    2015-08-01

    Democratization of genomics technologies has enabled the rapid determination of genotypes. More recently the democratization of comprehensive proteomics technologies is enabling the determination of the cellular phenotype and the molecular events that define its dynamic state. Core proteomic technologies include MS to define protein sequence, protein:protein interactions, and protein PTMs. Key enabling technologies for proteomics are bioinformatic pipelines to identify, quantitate, and summarize these events. The Trans-Proteomics Pipeline (TPP) is a robust open-source standardized data processing pipeline for large-scale reproducible quantitative MS proteomics. It supports all major operating systems and instrument vendors via open data formats. Here, we provide a review of the overall proteomics workflow supported by the TPP, its major tools, and how it can be used in its various modes from desktop to cloud computing. We describe new features for the TPP, including data visualization functionality. We conclude by describing some common perils that affect the analysis of MS/MS datasets, as well as some major upcoming features. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  18. FunRich proteomics software analysis, let the fun begin!

    PubMed

    Benito-Martin, Alberto; Peinado, Héctor

    2015-08-01

    Protein MS analysis is the preferred method for unbiased protein identification. It is normally applied to a large number of both small-scale and high-throughput studies. However, user-friendly computational tools for protein analysis are still needed. In this issue, Mathivanan and colleagues (Proteomics 2015, 15, 2597-2601) report the development of FunRich software, an open-access software that facilitates the analysis of proteomics data, providing tools for functional enrichment and interaction network analysis of genes and proteins. FunRich is a reinterpretation of proteomic software, a standalone tool combining ease of use with customizable databases, free access, and graphical representations. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
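
    Functional enrichment, mentioned in the abstract above, is commonly scored with a one-sided hypergeometric test. The sketch below shows that general technique only; the abstract does not describe FunRich's statistics, so nothing here should be read as its implementation. The function name and the example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(hits, list_size, category_size, background_size):
    """One-sided hypergeometric p-value for over-representation.

    Returns P(X >= hits), where X is the number of category members found
    when list_size proteins are drawn without replacement from a background
    of background_size proteins containing category_size category members.
    """
    upper = min(list_size, category_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(category_size, k)
        * comb(background_size - category_size, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Made-up counts: 2 of 4 identified proteins fall in a category that
# holds 5 of the 20 background proteins.
p = enrichment_p_value(hits=2, list_size=4, category_size=5, background_size=20)
print(round(p, 4))  # 0.2487
```

    When many categories are tested, the resulting p-values would normally also be corrected for multiple testing, which this sketch leaves out.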

  19. Mechanism of Arachidonic Acid Accumulation during Aging in Mortierella alpina: A Large-Scale Label-Free Comparative Proteomics Study.

    PubMed

    Yu, Yadong; Li, Tao; Wu, Na; Ren, Lujing; Jiang, Ling; Ji, Xiaojun; Huang, He

    2016-11-30

    Arachidonic acid (ARA) is an important polyunsaturated fatty acid having various beneficial physiological effects on the human body. The aging of Mortierella alpina has long been known to significantly improve ARA yield, but the exact mechanism is still elusive. Herein, multiple approaches including large-scale label-free comparative proteomics were employed to systematically investigate the mechanism mentioned above. Upon ultrastructural observation, abnormal mitochondria were found to aggregate around shrunken lipid droplets. Proteomics analysis revealed a total of 171 proteins with significant alterations of expression during aging. Pathway analysis suggested that reactive oxygen species (ROS) were accumulated and stimulated the activation of the malate/pyruvate cycle and isocitrate dehydrogenase, which might provide additional NADPH for ARA synthesis. EC 4.2.1.17-hydratase might be a key player in ARA accumulation during aging. These findings provide a valuable resource for efforts to further improve the ARA content in the oil produced by aging M. alpina.

  20. Large-Scale Interaction Profiling of Protein Domains Through Proteomic Peptide-Phage Display Using Custom Peptidomes.

    PubMed

    Seo, Moon-Hyeong; Nim, Satra; Jeon, Jouhyun; Kim, Philip M

    2017-01-01

    Protein-protein interactions are essential to cellular functions and signaling pathways. We recently combined bioinformatics and custom oligonucleotide arrays to construct custom-made peptide-phage libraries for screening peptide-protein interactions, an approach we call proteomic peptide-phage display (ProP-PD). In this chapter, we describe protocols for phage display for the identification of natural peptide binders for a given protein. We finally describe deep sequencing for the analysis of the proteomic peptide-phage display.

  1. Assessment and improvement of statistical tools for comparative proteomics analysis of sparse data sets with few experimental replicates.

    PubMed

    Schwämmle, Veit; León, Ileana Rodríguez; Jensen, Ole Nørregaard

    2013-09-06

    Large-scale quantitative analyses of biological systems are often performed with few replicate experiments, leading to multiple nonidentical data sets due to missing values. For example, mass spectrometry driven proteomics experiments are frequently performed with few biological or technical replicates due to sample-scarcity or due to duty-cycle or sensitivity constraints, or limited capacity of the available instrumentation, leading to incomplete results where detection of significant feature changes becomes a challenge. This problem is further exacerbated for the detection of significant changes on the peptide level, for example, in phospho-proteomics experiments. In order to assess the extent of this problem and the implications for large-scale proteome analysis, we investigated and optimized the performance of three statistical approaches by using simulated and experimental data sets with varying numbers of missing values. We applied three tools, including standard t test, moderated t test, also known as limma, and rank products for the detection of significantly changing features in simulated and experimental proteomics data sets with missing values. The rank product method was improved to work with data sets containing missing values. Extensive analysis of simulated and experimental data sets revealed that the performance of the statistical analysis tools depended on simple properties of the data sets. High-confidence results were obtained by using the limma and rank products methods for analyses of triplicate data sets that exhibited more than 1000 features and more than 50% missing values. The maximum number of differentially represented features was identified by using limma and rank products methods in a complementary manner. We therefore recommend combined usage of these methods as a novel and optimal way to detect significantly changing features in these data sets. This approach is suitable for large quantitative data sets from stable isotope labeling and mass spectrometry experiments and should be applicable to large data sets of any type. An R script that implements the improved rank products algorithm and the combined analysis is available.
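
    To make the idea of rank products on incomplete data concrete, here is a minimal sketch. It is not the authors' improved algorithm (their R script is the reference for that) and it computes only the score, not a significance estimate. The handling of missing values is an assumption made for illustration: a feature is ranked only in replicates where it was measured, ranks are divided by the number of features observed in that replicate, and the geometric mean is taken over the replicates that contributed.

```python
import math


def rank_products(log_ratios):
    """Geometric mean of per-replicate relative ranks, skipping missing values.

    log_ratios: {feature: [value or None per replicate]}.
    Within each replicate, observed features are ranked from largest to
    smallest log ratio (rank 1 = most up-regulated) and the rank is divided
    by the number of observed features, so replicates with different amounts
    of missing data stay comparable. Ties keep their input order.
    Features that were never observed are omitted from the result.
    """
    n_reps = len(next(iter(log_ratios.values())))
    rel_ranks = {f: [] for f in log_ratios}
    for r in range(n_reps):
        observed = [(v[r], f) for f, v in log_ratios.items() if v[r] is not None]
        observed.sort(key=lambda pair: pair[0], reverse=True)
        for rank, (_, f) in enumerate(observed, start=1):
            rel_ranks[f].append(rank / len(observed))
    return {
        f: math.exp(sum(math.log(x) for x in ranks) / len(ranks))
        for f, ranks in rel_ranks.items()
        if ranks
    }


# Made-up triplicate data with missing values (None).
data = {
    "A": [2.0, 1.5, None],
    "B": [0.1, None, 0.2],
    "C": [-1.0, -0.5, -0.8],
}
scores = rank_products(data)
print(sorted(scores, key=scores.get))  # ['A', 'B', 'C']
```

    Smaller scores indicate features that rank consistently near the top; in practice significance is usually estimated by permutation, which this sketch omits.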

  2. Comparative bioinformatics analyses and profiling of lysosome-related organelle proteomes

    NASA Astrophysics Data System (ADS)

    Hu, Zhang-Zhi; Valencia, Julio C.; Huang, Hongzhan; Chi, An; Shabanowitz, Jeffrey; Hearing, Vincent J.; Appella, Ettore; Wu, Cathy

    2007-01-01

    Complete and accurate profiling of cellular organelle proteomes, while challenging, is important for the understanding of detailed cellular processes at the organelle level. Mass spectrometry technologies coupled with bioinformatics analysis provide an effective approach for protein identification and functional interpretation of organelle proteomes. In this study, we have compiled human organelle reference datasets from large-scale proteomic studies and protein databases for seven lysosome-related organelles (LROs), as well as the endoplasmic reticulum and mitochondria, for comparative organelle proteome analysis. Heterogeneous sources of human organelle proteins and rodent homologs are mapped to human UniProtKB protein entries based on ID and/or peptide mappings, followed by functional annotation and categorization using the iProXpress proteomic expression analysis system. Cataloging organelle proteomes allows close examination of both shared and unique proteins among various LROs and reveals their functional relevance. The proteomic comparisons show that LROs are a closely related family of organelles. The shared proteins indicate the dynamic and hybrid nature of LROs, while the unique transmembrane proteins may represent additional candidate marker proteins for LROs. This comparative analysis, therefore, provides a basis for hypothesis formulation and experimental validation of organelle proteins and their functional roles.

  3. The Escherichia coli Proteome: Past, Present, and Future Prospects

    PubMed Central

    Han, Mee-Jung; Lee, Sang Yup

    2006-01-01

    Proteomics has emerged as an indispensable methodology for large-scale protein analysis in functional genomics. The Escherichia coli proteome has been extensively studied and is well defined in terms of biochemical, biological, and biotechnological data. Even before the entire E. coli proteome was fully elucidated, the largest available data set had been integrated to decipher regulatory circuits and metabolic pathways, providing valuable insights into global cellular physiology and the development of metabolic and cellular engineering strategies. With the recent advent of advanced proteomic technologies, the E. coli proteome has been used for the validation of new technologies and methodologies such as sample prefractionation, protein enrichment, two-dimensional gel electrophoresis, protein detection, mass spectrometry (MS), combinatorial assays with n-dimensional chromatographies and MS, and image analysis software. These important technologies will not only provide a great amount of additional information on the E. coli proteome but also synergistically contribute to other proteomic studies. Here, we review the past development and current status of E. coli proteome research in terms of its biological, biotechnological, and methodological significance and suggest future prospects. PMID:16760308

  4. Assembling proteomics data as a prerequisite for the analysis of large scale experiments

    PubMed Central

    Schmidt, Frank; Schmid, Monika; Thiede, Bernd; Pleißner, Klaus-Peter; Böhme, Martina; Jungblut, Peter R

    2009-01-01

    Background Despite the complete determination of the genome sequence of a huge number of bacteria, their proteomes remain relatively poorly defined. Besides new methods to increase the number of identified proteins, new database applications are necessary to store and present results of large-scale proteomics experiments. Results In the present study, a database concept has been developed to address these issues and to offer complete information via a web interface. In our concept, the Oracle based data repository system SQL-LIMS plays the central role in the proteomics workflow and was applied to the proteomes of Mycobacterium tuberculosis, Helicobacter pylori, Salmonella typhimurium and protein complexes such as 20S proteasome. Technical operations of our proteomics labs were used as the standard for SQL-LIMS template creation. By means of a Java based data parser, post-processed data of different approaches, such as LC/ESI-MS, MALDI-MS and 2-D gel electrophoresis (2-DE), were stored in SQL-LIMS. A minimum set of the proteomics data was transferred to our public 2D-PAGE database using a Java based interface (Data Transfer Tool) in accordance with the requirements of the PEDRo standardization. Furthermore, the stored proteomics data were extractable out of SQL-LIMS via XML. Conclusion The Oracle based data repository system SQL-LIMS played the central role in the proteomics workflow concept. Technical operations of our proteomics labs were used as standards for SQL-LIMS templates. Using a Java based parser, post-processed data of different approaches such as LC/ESI-MS, MALDI-MS and 1-DE and 2-DE were stored in SQL-LIMS. Thus, unique data formats of different instruments were unified and stored in SQL-LIMS tables. Moreover, a unique submission identifier allowed fast access to all experimental data. This was the main advantage compared to multi software solutions, especially if personnel fluctuations are high. Moreover, large-scale and high-throughput experiments must be managed in a comprehensive repository system such as SQL-LIMS, to query results in a systematic manner. On the other hand, these database systems are expensive and require at least one full time administrator and specialized lab manager. Moreover, the high technical dynamics in proteomics may cause problems in adapting to new data formats. To summarize, SQL-LIMS met the requirements of proteomics data handling especially in skilled processes such as gel-electrophoresis or mass spectrometry and fulfilled the PSI standardization criteria. The data transfer into a public domain via DTT facilitated validation of proteomics data. Additionally, evaluation of mass spectra by post-processing using MS-Screener improved the reliability of mass analysis and prevented storage of data junk. PMID:19166578

  5. Tools for phospho- and glycoproteomics of plasma membranes.

    PubMed

    Wiśniewski, Jacek R

    2011-07-01

    Analysis of plasma membrane proteins and their posttranslational modifications is considered important for identification of disease markers and targets for drug treatment. Due to their insolubility in water, studying plasma membrane proteins using mass spectrometry has been difficult for a long time. Recent technological developments in sample preparation together with important improvements in mass spectrometric analysis have facilitated analysis of these proteins and their posttranslational modifications. Now, large scale proteomic analyses allow identification of thousands of membrane proteins from minute amounts of sample. Optimized protocols for affinity enrichment of phosphorylated and glycosylated peptides have set new dimensions in the depth of characterization of these posttranslational modifications of plasma membrane proteins. Here, I summarize recent advances in proteomic technology for the characterization of the cell surface proteins and their modifications. The focus is on approaches allowing large scale mapping rather than analytical methods suitable for studying individual proteins or non-complex mixtures.

  6. CPTAC Evaluates Long-Term Reproducibility of Quantitative Proteomics Using Breast Cancer Xenografts | Office of Cancer Clinical Proteomics Research

    Cancer.gov

    Liquid chromatography tandem-mass spectrometry (LC-MS/MS)-based methods such as isobaric tags for relative and absolute quantification (iTRAQ) and tandem mass tags (TMT) have been shown to provide overall better quantification accuracy and reproducibility over other LC-MS/MS techniques. However, large scale projects like the Clinical Proteomic Tumor Analysis Consortium (CPTAC) require comparisons across many genomically characterized clinical specimens in a single study and often exceed the capability of traditional iTRAQ-based quantification.

  7. Determination of burn patient outcome by large-scale quantitative discovery proteomics

    PubMed Central

    Finnerty, Celeste C.; Jeschke, Marc G.; Qian, Wei-Jun; Kaushal, Amit; Xiao, Wenzhong; Liu, Tao; Gritsenko, Marina A.; Moore, Ronald J.; Camp, David G.; Moldawer, Lyle L.; Elson, Constance; Schoenfeld, David; Gamelli, Richard; Gibran, Nicole; Klein, Matthew; Arnoldo, Brett; Remick, Daniel; Smith, Richard D.; Davis, Ronald; Tompkins, Ronald G.; Herndon, David N.

    2013-01-01

    Objective Emerging proteomics techniques can be used to establish proteomic outcome signatures and to identify candidate biomarkers for survival following traumatic injury. We applied high-resolution liquid chromatography-mass spectrometry (LC-MS) and multiplex cytokine analysis to profile the plasma proteome of survivors and non-survivors of massive burn injury to determine the proteomic survival signature following a major burn injury. Design Proteomic discovery study. Setting Five burn hospitals across the U.S. Patients Thirty-two burn patients (16 non-survivors and 16 survivors), 19–89 years of age, were admitted within 96 h of injury to the participating hospitals with burns covering >20% of the total body surface area and required at least one surgical intervention. Interventions None. Measurements and Main Results We found differences in circulating levels of 43 proteins involved in the acute phase response, hepatic signaling, the complement cascade, inflammation, and insulin resistance. Thirty-two of the proteins identified were not previously known to play a role in the response to burn. IL-4, IL-8, GM-CSF, MCP-1, and β2-microglobulin correlated well with survival and may serve as clinical biomarkers. Conclusions These results demonstrate the utility of these techniques for establishing proteomic survival signatures and for use as a discovery tool to identify candidate biomarkers for survival. This is the first clinical application of a high-throughput, large-scale LC-MS-based quantitative plasma proteomic approach for biomarker discovery for the prediction of patient outcome following burn, trauma or critical illness. PMID:23507713

  8. PACOM: A Versatile Tool for Integrating, Filtering, Visualizing, and Comparing Multiple Large Mass Spectrometry Proteomics Data Sets.

    PubMed

    Martínez-Bartolomé, Salvador; Medina-Aunon, J Alberto; López-García, Miguel Ángel; González-Tejedo, Carmen; Prieto, Gorka; Navajas, Rosana; Salazar-Donate, Emilio; Fernández-Costa, Carolina; Yates, John R; Albar, Juan Pablo

    2018-04-06

    Mass-spectrometry-based proteomics has evolved into a high-throughput technology in which numerous large-scale data sets are generated from diverse analytical platforms. Furthermore, several scientific journals and funding agencies have emphasized the storage of proteomics data in public repositories to facilitate its evaluation, inspection, and reanalysis. (1) As a consequence, public proteomics data repositories are growing rapidly. However, tools are needed to integrate multiple proteomics data sets to compare different experimental features or to perform quality control analysis. Here, we present a new Java stand-alone tool, Proteomics Assay COMparator (PACOM), that is able to import, combine, and simultaneously compare numerous proteomics experiments to check the integrity of the proteomic data as well as verify data quality. With PACOM, the user can detect sources of errors that may have been introduced in any step of a proteomics workflow and that influence the final results. Data sets can be easily compared and integrated, and data quality and reproducibility can be visually assessed through a rich set of graphical representations of proteomics data features as well as a wide variety of data filters. Its flexibility and easy-to-use interface make PACOM a unique tool for daily use in a proteomics laboratory. PACOM is available at https://github.com/smdb21/pacom.

  9. Computer aided manual validation of mass spectrometry-based proteomic data.

    PubMed

    Curran, Timothy G; Bryson, Bryan D; Reigelhaupt, Michael; Johnson, Hannah; White, Forest M

    2013-06-15

    Advances in mass spectrometry-based proteomic technologies have increased the speed of analysis and the depth provided by a single analysis. Computational tools to evaluate the accuracy of peptide identifications from these high-throughput analyses have not kept pace with technological advances; currently the most common quality evaluation methods are based on statistical analysis of the likelihood of false positive identifications in large-scale data sets. While helpful, these calculations do not consider the accuracy of each identification, thus creating a precarious situation for biologists relying on the data to inform experimental design. Manual validation is the gold standard approach to confirm accuracy of database identifications, but is extremely time-intensive. To palliate the increasing time required to manually validate large proteomic datasets, we provide computer aided manual validation software (CAMV) to expedite the process. Relevant spectra are collected, catalogued, and pre-labeled, allowing users to efficiently judge the quality of each identification and summarize applicable quantitative information. CAMV significantly reduces the burden associated with manual validation and will hopefully encourage broader adoption of manual validation in mass spectrometry-based proteomics. Copyright © 2013 Elsevier Inc. All rights reserved.

  10. Computational Omics Pre-Awardees | Office of Cancer Clinical Proteomics Research

    Cancer.gov

    The National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium (CPTAC) is pleased to announce the pre-awardees of the Computational Omics solicitation. Working with NVIDIA Foundation's Compute the Cure initiative and Leidos Biomedical Research Inc., the NCI, through this solicitation, seeks to leverage computational efforts to provide tools for the mining and interpretation of large-scale publicly available ‘omics’ datasets.

  11. Comparative evaluation of saliva collection methods for proteome analysis.

    PubMed

    Golatowski, Claas; Salazar, Manuela Gesell; Dhople, Vishnu Mukund; Hammer, Elke; Kocher, Thomas; Jehmlich, Nico; Völker, Uwe

    2013-04-18

    Saliva collection devices are widely used for large-scale screening approaches. This study was designed to compare the suitability of three different whole-saliva collection approaches for subsequent proteome analyses. From 9 young healthy volunteers (4 women and 5 men) saliva samples were collected either unstimulated by passive drooling or stimulated using a paraffin gum or Salivette® (cotton swab). Saliva volume, protein concentration and salivary protein patterns were analyzed comparatively. Samples collected using paraffin gum showed the highest saliva volume (4.1±1.5 ml) followed by Salivette® collection (1.8±0.4 ml) and drooling (1.0±0.4 ml). Saliva protein concentrations (average 1145 μg/ml) showed no significant differences between the three sampling schemes. Each collection approach facilitated the identification of about 160 proteins (≥2 distinct peptides) per subject, but collection-method dependent variations in protein composition were observed. Passive drooling, paraffin gum and Salivette® each allow similar coverage of the whole saliva proteome, but the specific proteins observed depended on the collection approach. Thus, only one type of collection device should be used for quantitative proteome analysis in one experiment, especially when performing large-scale cross-sectional or multi-centric studies. Copyright © 2013 Elsevier B.V. All rights reserved.

  12. freeQuant: A Mass Spectrometry Label-Free Quantification Software Tool for Complex Proteome Analysis.

    PubMed

    Deng, Ning; Li, Zhenye; Pan, Chao; Duan, Huilong

    2015-01-01

    The study of complex proteomes places higher demands on mass spectrometry-based quantification methods. In this paper, we present a mass spectrometry label-free quantification tool for complex proteomes, called freeQuant, which integrates quantification with functional analysis effectively. freeQuant consists of two well-integrated modules: label-free quantification and functional analysis with biomedical knowledge. freeQuant supports label-free quantitative analysis which makes full use of tandem mass spectrometry (MS/MS) spectral count, protein sequence length, shared peptides, and ion intensity. It adopts spectral count for quantitative analysis and builds a new method for shared peptides to accurately evaluate abundance of isoforms. For proteins with low abundance, MS/MS total ion count coupled with spectral count is included to ensure accurate protein quantification. Furthermore, freeQuant supports large-scale functional annotation for complex proteomes. Mitochondrial proteomes from the mouse heart, the mouse liver, and the human heart were used to evaluate the usability and performance of freeQuant. The evaluation showed that the quantitative algorithms implemented in freeQuant can improve accuracy of quantification with better dynamic range.
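
    The abstract does not give freeQuant's formulas. As a rough illustration of how spectral count and protein sequence length can be combined, the sketch below computes the normalized spectral abundance factor (NSAF), a commonly used label-free measure. NSAF is swapped in here for illustration and is not claimed to be freeQuant's method; the function name and the protein labels are hypothetical.

```python
def nsaf(spectral_counts, lengths):
    """Normalized spectral abundance factor (NSAF) per protein.

    spectral_counts: dict mapping protein -> number of MS/MS spectra
    lengths: dict mapping protein -> sequence length in residues
    Illustrative only; not freeQuant's algorithm.
    """
    # Spectral abundance factor: spectral count divided by sequence length,
    # so that long proteins are not favoured merely for yielding more peptides.
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    if total == 0:
        raise ValueError("no spectra observed; NSAF is undefined")
    # Normalize so the values sum to 1 within the run.
    return {p: value / total for p, value in saf.items()}


# Hypothetical example: equal spectral counts, different lengths.
example = nsaf({"A": 10, "B": 10}, {"A": 100, "B": 300})
```

    With equal spectral counts, the shorter protein receives the larger share (0.75 against 0.25 in the example), which is the length correction that the abstract's mention of sequence length suggests.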

  13. From the genome sequence to the protein inventory of Bacillus subtilis.

    PubMed

    Becher, Dörte; Büttner, Knut; Moche, Martin; Hessling, Bernd; Hecker, Michael

    2011-08-01

    Owing to the low number of proteins necessary to render a bacterial cell viable, bacteria are extremely attractive model systems to understand how the genome sequence is translated into actual life processes. One of the most intensively investigated model organisms is Bacillus subtilis. It has attracted world-wide research interest, addressing cell differentiation and adaptation on a molecular scale as well as biotechnological production processes. Meanwhile, we are looking back on more than 25 years of B. subtilis proteomics. A wide range of methods have been developed during this period for the large-scale qualitative and quantitative proteome analysis. Currently, it is possible to identify and quantify more than 50% of the predicted proteome in different cellular subfractions. In this review, we summarize the development of B. subtilis proteomics during the past 25 years. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. Automated image alignment for 2D gel electrophoresis in a high-throughput proteomics pipeline.

    PubMed

    Dowsey, Andrew W; Dunn, Michael J; Yang, Guang-Zhong

    2008-04-01

    The quest for high-throughput proteomics has revealed a number of challenges in recent years. Whilst substantial improvements in automated protein separation with liquid chromatography and mass spectrometry (LC/MS), aka 'shotgun' proteomics, have been achieved, large-scale open initiatives such as the Human Proteome Organization (HUPO) Brain Proteome Project have shown that maximal proteome coverage is only possible when LC/MS is complemented by 2D gel electrophoresis (2-DE) studies. Moreover, both separation methods require automated alignment and differential analysis to relieve the bioinformatics bottleneck and so make high-throughput protein biomarker discovery a reality. The purpose of this article is to describe a fully automatic image alignment framework for the integration of 2-DE into a high-throughput differential expression proteomics pipeline. The proposed method is based on robust automated image normalization (RAIN) to circumvent the drawbacks of traditional approaches. These use symbolic representation at the very early stages of the analysis, which introduces persistent errors due to inaccuracies in modelling and alignment. In RAIN, a third-order volume-invariant B-spline model is incorporated into a multi-resolution schema to correct for geometric and expression inhomogeneity at multiple scales. The normalized images can then be compared directly in the image domain for quantitative differential analysis. Through evaluation against an existing state-of-the-art method on real and synthetically warped 2D gels, the proposed analysis framework demonstrates substantial improvements in matching accuracy and differential sensitivity. High-throughput analysis is established through an accelerated GPGPU (general purpose computation on graphics cards) implementation. Supplementary material, software and images used in the validation are available at http://www.proteomegrid.org/rain/.

  15. TUBEs-Mass Spectrometry for Identification and Analysis of the Ubiquitin-Proteome.

    PubMed

    Azkargorta, Mikel; Escobes, Iraide; Elortza, Felix; Matthiesen, Rune; Rodríguez, Manuel S

    2016-01-01

    Mass spectrometry (MS) has become the method of choice for the large-scale analysis of protein ubiquitylation. There exist a number of proposed methods for mapping ubiquitin sites, each with different pros and cons. We present here a protocol for the MS analysis of the ubiquitin-proteome captured by TUBEs and subsequent data analysis. Using dedicated software and algorithms, specific information on the presence of ubiquitylated peptides can be obtained from the MS search results. In addition, a quantitative and functional analysis of the ubiquitylated proteins and their interacting partners helps to unravel the biological and molecular processes they are involved in.

  16. Ubiquitinated Proteome: Ready for Global?*

    PubMed Central

    Shi, Yi; Xu, Ping; Qin, Jun

    2011-01-01

    Ubiquitin (Ub) is a small and highly conserved protein that can covalently modify protein substrates. Ubiquitination is one of the major post-translational modifications that regulate a broad spectrum of cellular functions. The advancement of mass spectrometers as well as the development of new affinity purification tools has greatly expedited proteome-wide analysis of several post-translational modifications (e.g. phosphorylation, glycosylation, and acetylation). In contrast, large-scale profiling of lysine ubiquitination remains a challenge. Most recently, new Ub affinity reagents such as Ub remnant antibody and tandem Ub binding domains have been developed, allowing for relatively large-scale detection of several hundreds of lysine ubiquitination events in human cells. Here we review different strategies for the identification of ubiquitination site and discuss several issues associated with data analysis. We suggest that careful interpretation and orthogonal confirmation of MS spectra is necessary to minimize false positive assignments by automatic searching algorithms. PMID:21339389

  17. Interlaboratory Study Characterizing a Yeast Performance Standard for Benchmarking LC-MS Platform Performance*

    PubMed Central

    Paulovich, Amanda G.; Billheimer, Dean; Ham, Amy-Joan L.; Vega-Montoto, Lorenzo; Rudnick, Paul A.; Tabb, David L.; Wang, Pei; Blackman, Ronald K.; Bunk, David M.; Cardasis, Helene L.; Clauser, Karl R.; Kinsinger, Christopher R.; Schilling, Birgit; Tegeler, Tony J.; Variyath, Asokan Mulayath; Wang, Mu; Whiteaker, Jeffrey R.; Zimmerman, Lisa J.; Fenyo, David; Carr, Steven A.; Fisher, Susan J.; Gibson, Bradford W.; Mesri, Mehdi; Neubert, Thomas A.; Regnier, Fred E.; Rodriguez, Henry; Spiegelman, Cliff; Stein, Stephen E.; Tempst, Paul; Liebler, Daniel C.

    2010-01-01

    Optimal performance of LC-MS/MS platforms is critical to generating high quality proteomics data. Although individual laboratories have developed quality control samples, there is no widely available performance standard of biological complexity (and associated reference data sets) for benchmarking of platform performance for analysis of complex biological proteomes across different laboratories in the community. Individual preparations of the yeast Saccharomyces cerevisiae proteome have been used extensively by laboratories in the proteomics community to characterize LC-MS platform performance. The yeast proteome is uniquely attractive as a performance standard because it is the most extensively characterized complex biological proteome and the only one associated with several large scale studies estimating the abundance of all detectable proteins. In this study, we describe a standard operating protocol for large scale production of the yeast performance standard and offer aliquots to the community through the National Institute of Standards and Technology where the yeast proteome is under development as a certified reference material to meet the long term needs of the community. Using a series of metrics that characterize LC-MS performance, we provide a reference data set demonstrating typical performance of commonly used ion trap instrument platforms in expert laboratories; the results provide a basis for laboratories to benchmark their own performance, to improve upon current methods, and to evaluate new technologies. Additionally, we demonstrate how the yeast reference, spiked with human proteins, can be used to benchmark the power of proteomics platforms for detection of differentially expressed proteins at different levels of concentration in a complex matrix, thereby providing a metric to evaluate and minimize preanalytical and analytical variation in comparative proteomics experiments. PMID:19858499

  18. ProteoSign: an end-user online differential proteomics statistical analysis platform.

    PubMed

    Efstathiou, Georgios; Antonakis, Andreas N; Pavlopoulos, Georgios A; Theodosiou, Theodosios; Divanach, Peter; Trudgian, David C; Thomas, Benjamin; Papanikolaou, Nikolas; Aivaliotis, Michalis; Acuto, Oreste; Iliopoulos, Ioannis

    2017-07-03

    Profiling of proteome dynamics is crucial for understanding cellular behavior in response to intrinsic and extrinsic stimuli and maintenance of homeostasis. Over the last 20 years, mass spectrometry (MS) has emerged as the most powerful tool for large-scale identification and characterization of proteins. Bottom-up proteomics, the most common MS-based proteomics approach, has always been challenging in terms of data management, processing, analysis and visualization, with modern instruments capable of producing several gigabytes of data out of a single experiment. Here, we present ProteoSign, a freely available web application, dedicated to allowing users to perform proteomics differential expression/abundance analysis in a user-friendly and self-explanatory way. Although several non-commercial standalone tools have been developed for post-quantification statistical analysis of proteomics data, most of them are not end-user appealing as they often require very stringent installation of programming environments, third-party software packages and sometimes further scripting or computer programming. To avoid this bottleneck, we have developed a user-friendly software platform accessible via a web interface in order to enable proteomics laboratories and core facilities to statistically analyse quantitative proteomics data sets in a resource-efficient manner. ProteoSign is available at http://bioinformatics.med.uoc.gr/ProteoSign and the source code at https://github.com/yorgodillo/ProteoSign. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  19. Statistical Model to Analyze Quantitative Proteomics Data Obtained by 18O/16O Labeling and Linear Ion Trap Mass Spectrometry

    PubMed Central

    Jorge, Inmaculada; Navarro, Pedro; Martínez-Acedo, Pablo; Núñez, Estefanía; Serrano, Horacio; Alfranca, Arántzazu; Redondo, Juan Miguel; Vázquez, Jesús

    2009-01-01

    Statistical models for the analysis of protein expression changes by stable isotope labeling are still poorly developed, particularly for data obtained by 16O/18O labeling. Besides, large-scale test experiments to validate the null hypothesis are lacking. Although the study of mechanisms underlying biological actions promoted by vascular endothelial growth factor (VEGF) on endothelial cells is of considerable interest, quantitative proteomics studies on this subject are scarce and have been performed after exposing cells to the factor for long periods of time. In this work we present the largest quantitative proteomics study to date on the short term effects of VEGF on human umbilical vein endothelial cells by 18O/16O labeling. Current statistical models based on normality and variance homogeneity were found unsuitable to describe the null hypothesis in a large-scale test experiment performed on these cells, producing false expression changes. A random effects model was developed including four different sources of variance at the spectrum-fitting, scan, peptide, and protein levels. With the new model the number of outliers at scan and peptide levels was negligible in three large-scale experiments, and only one false protein expression change was observed in the test experiment among more than 1000 proteins. The new model allowed the detection of significant protein expression changes upon VEGF stimulation for 4 and 8 h. The consistency of the changes observed at 4 h was confirmed by a replica at a smaller scale and further validated by Western blot analysis of some proteins. Most of the observed changes have not been described previously and are consistent with a pattern of protein expression that dynamically changes over time following the evolution of the angiogenic response. With this statistical model the 18O labeling approach emerges as a very promising and robust alternative to perform quantitative proteomics studies at a depth of several thousand proteins. PMID:19181660

  20. ProteinInferencer: Confident protein identification and multiple experiment comparison for large scale proteomics projects.

    PubMed

    Zhang, Yaoyang; Xu, Tao; Shan, Bing; Hart, Jonathan; Aslanian, Aaron; Han, Xuemei; Zong, Nobel; Li, Haomin; Choi, Howard; Wang, Dong; Acharya, Lipi; Du, Lisa; Vogt, Peter K; Ping, Peipei; Yates, John R

    2015-11-03

    Shotgun proteomics generates valuable information from large-scale and target protein characterizations, including protein expression, protein quantification, protein post-translational modifications (PTMs), protein localization, and protein-protein interactions. Typically, peptides derived from proteolytic digestion, rather than intact proteins, are analyzed by mass spectrometers because peptides are more readily separated, ionized and fragmented. The amino acid sequences of peptides can be interpreted by matching the observed tandem mass spectra to theoretical spectra derived from a protein sequence database. Identified peptides serve as surrogates for their proteins and are often used to establish what proteins were present in the original mixture and to quantify protein abundance. Two major issues exist for assigning peptides to their originating protein. The first issue is maintaining a desired false discovery rate (FDR) when comparing or combining multiple large datasets generated by shotgun analysis and the second issue is properly assigning peptides to proteins when homologous proteins are present in the database. Herein we demonstrate a new computational tool, ProteinInferencer, which can be used for protein inference with both small- and large-scale data sets to produce a well-controlled protein FDR. In addition, ProteinInferencer introduces confidence scoring for individual proteins, which makes protein identifications evaluable. This article is part of a Special Issue entitled: Computational Proteomics. Copyright © 2015. Published by Elsevier B.V.
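
    The abstract does not describe how ProteinInferencer controls the FDR. A common generic approach, assumed here purely for illustration, is the target-decoy estimate: hits are ranked by score and the FDR at a cutoff is approximated as the number of decoy hits divided by the number of target hits at or above that cutoff. The function name and the input layout below are hypothetical.

```python
def fdr_threshold(hits, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    hits: iterable of (score, is_decoy) pairs; higher score = better match.
    FDR is estimated as decoys / targets among hits at or above the cutoff.
    Generic target-decoy sketch; not ProteinInferencer's algorithm.
    Tied scores are not treated specially.
    """
    ranked = sorted(hits, key=lambda hit: hit[0], reverse=True)
    targets = 0
    decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Record the deepest point in the ranking that still meets the limit.
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff  # None if no cutoff satisfies the requested FDR


# Hypothetical scores: one decoy hit ranked third among five hits.
hits = [(10, False), (9, False), (8, True), (7, False), (6, False)]
loose = fdr_threshold(hits, max_fdr=0.25)
strict = fdr_threshold(hits, max_fdr=0.20)
```

    In the example, a 25% limit admits all five hits (one decoy per four targets), while a 20% limit stops above the decoy. When several datasets are combined, decoy hits accumulate, which is one reason the estimate generally has to be recomputed on the merged data rather than carried over from each run.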

  1. Applications of Proteomic Technologies to Toxicology

    EPA Science Inventory

    Proteomics is the large-scale study of gene expression at the protein level. This cutting edge technology has been extensively applied to toxicology research recently. The up-to-date development of proteomics has presented the toxicology community with an unprecedented opportunit...

  2. MAPU: Max-Planck Unified database of organellar, cellular, tissue and body fluid proteomes

    PubMed Central

    Zhang, Yanling; Zhang, Yong; Adachi, Jun; Olsen, Jesper V.; Shi, Rong; de Souza, Gustavo; Pasini, Erica; Foster, Leonard J.; Macek, Boris; Zougman, Alexandre; Kumar, Chanchal; Wiśniewski, Jacek R.; Jun, Wang; Mann, Matthias

    2007-01-01

    Mass spectrometry (MS)-based proteomics has become a powerful technology to map the protein composition of organelles, cell types and tissues. In our department, a large-scale effort to map these proteomes is complemented by the Max-Planck Unified (MAPU) proteome database. MAPU contains several body fluid proteomes, including plasma, urine, and cerebrospinal fluid. Cell lines have been mapped to a depth of several thousand proteins and the red blood cell proteome has also been analyzed in depth. The liver proteome is represented with 3200 proteins. By employing high resolution MS and stringent validation criteria, false positive identification rates in MAPU are lower than 1:1000. Thus MAPU datasets can serve as reference proteomes in biomarker discovery. MAPU contains the peptides identifying each protein, measured masses, scores and intensities and is freely available at using a clickable interface of cell or body parts. Proteome data can be queried across proteomes by protein name, accession number, sequence similarity, peptide sequence and annotation information. More than 4500 mouse and 2500 human proteins have already been identified in at least one proteome. Basic annotation information and links to other public databases are provided in MAPU and we plan to add further analysis tools. PMID:17090601

  3. Quality Assessments of Long-Term Quantitative Proteomic Analysis of Breast Cancer Xenograft Tissues

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhou, Jian-Ying; Chen, Lijun; Zhang, Bai

    The identification of protein biomarkers requires large-scale analysis of human specimens to achieve statistical significance. In this study, we evaluated the long-term reproducibility of an iTRAQ (isobaric tags for relative and absolute quantification) based quantitative proteomics strategy using one channel for universal normalization across all samples. A total of 307 liquid chromatography tandem mass spectrometric (LC-MS/MS) analyses were completed, generating 107 one-dimensional (1D) LC-MS/MS datasets and 8 offline two-dimensional (2D) LC-MS/MS datasets (25 fractions for each set) for human-in-mouse breast cancer xenograft tissues representative of basal and luminal subtypes. Such large-scale studies require the implementation of robust metrics to assess the contributions of technical and biological variability in the qualitative and quantitative data. Accordingly, we developed a quantification confidence score based on the quality of each peptide-spectrum match (PSM) to remove quantification outliers from each analysis. After combining confidence score filtering and statistical analysis, reproducible protein identification and quantitative results were achieved from LC-MS/MS datasets collected over a 16 month period.
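
    The idea of using one channel for universal normalization can be sketched as follows: within each analysis, every sample channel is expressed as a log2 ratio to the shared reference channel, so that values from different analyses sit on a common scale. This is a minimal illustration under stated assumptions, not the study's pipeline; the function name, the choice of "117" as the reference channel and the intensities are hypothetical.

```python
import math


def normalize_to_reference(intensities, reference="117"):
    """Convert reporter-ion intensities to log2 ratios against a reference.

    intensities: dict mapping channel label -> reporter-ion intensity
    reference: label of the channel carrying the common reference sample
               (assumed to be "117" here for illustration only)
    """
    ref = intensities[reference]
    if ref <= 0:
        raise ValueError("reference channel has no signal")
    # Channels without signal are skipped rather than given a ratio.
    return {
        channel: math.log2(value / ref)
        for channel, value in intensities.items()
        if channel != reference and value > 0
    }


# Hypothetical reporter-ion intensities for one peptide-spectrum match.
ratios = normalize_to_reference(
    {"114": 200.0, "115": 50.0, "116": 100.0, "117": 100.0}
)
```

    Here channel 114 comes out at +1 (twofold above the reference), 115 at -1 and 116 at 0. Because every analysis contains the same reference material, ratios of this kind remain comparable across datasets acquired months apart.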

  4. Brucella proteomes--a review.

    PubMed

    DelVecchio, Vito G; Wagner, Mary Ann; Eschenbrenner, Michel; Horn, Troy A; Kraycer, Jo Ann; Estock, Frank; Elzer, Phil; Mujer, Cesar V

    2002-12-20

    The proteomes of selected Brucella spp. have been extensively analyzed by utilizing current proteomic technology involving 2-DE and MALDI-MS. In Brucella melitensis, more than 500 proteins were identified. The rapid and large-scale identification of proteins in this organism was accomplished by using the annotated B. melitensis genome which is now available in the GenBank. Coupled with new and powerful tools for data analysis, differentially expressed proteins were identified and categorized into several classes. A global overview of protein expression patterns emerged, thereby facilitating the simultaneous analysis of different metabolic pathways in B. melitensis. Such a global characterization would not have been possible by using time consuming and traditional biochemical approaches. The era of post-genomic technology offers new and exciting opportunities to understand the complete biology of different Brucella species.

  5. Cell-free protein synthesis: applications in proteomics and biotechnology.

    PubMed

    He, Mingyue

    2008-01-01

    Protein production is one of the key steps in biotechnology and functional proteomics. Expression of proteins in heterologous hosts (such as in E. coli) is generally lengthy and costly. Cell-free protein synthesis is thus emerging as an attractive alternative. In addition to the simplicity and speed for protein production, cell-free expression allows generation of functional proteins that are difficult to produce by in vivo systems. Recent exploitation of cell-free systems enables novel development of technologies for rapid discovery of proteins with desirable properties from very large libraries. This article reviews the recent development in cell-free systems and their application in the large scale protein analysis.

  6. Automated selected reaction monitoring software for accurate label-free protein quantification.

    PubMed

    Teleman, Johan; Karlsson, Christofer; Waldemarson, Sofia; Hansson, Karin; James, Peter; Malmström, Johan; Levander, Fredrik

    2012-07-06

    Selected reaction monitoring (SRM) is a mass spectrometry method with documented ability to quantify proteins accurately and reproducibly using labeled reference peptides. However, the use of labeled reference peptides becomes impractical if large numbers of peptides are targeted and when high flexibility is desired when selecting peptides. We have developed a label-free quantitative SRM workflow that relies on a new automated algorithm, Anubis, for accurate peak detection. Anubis efficiently removes interfering signals from contaminating peptides to estimate the true signal of the targeted peptides. We evaluated the algorithm on a published multisite data set and achieved results in line with manual data analysis. In complex peptide mixtures from whole proteome digests of Streptococcus pyogenes we achieved a technical variability across the entire proteome abundance range of 6.5-19.2%, which was considerably below the total variation across biological samples. Our results show that the label-free SRM workflow with automated data analysis is feasible for large-scale biological studies, opening up new possibilities for quantitative proteomics and systems biology.
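
    Technical variability figures such as the 6.5-19.2% quoted above are commonly reported as a coefficient of variation (CV) across replicate injections. The sketch below assumes that definition; the abstract does not state the exact formula, and the peptide names and intensities are made up.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Percent CV: 100 * sample standard deviation / mean."""
    if len(values) < 2:
        raise ValueError("need at least two replicate measurements")
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; CV is undefined")
    return 100.0 * stdev(values) / m


# Hypothetical peak areas for two peptides measured in three technical replicates.
replicates = {
    "PEPTIDEA": [100.0, 110.0, 90.0],
    "PEPTIDEB": [50.0, 50.0, 50.0],
}
cvs = {peptide: coefficient_of_variation(areas) for peptide, areas in replicates.items()}
```

    Comparing such per-peptide technical CVs with the CV across biological samples is one way to check that measurement noise sits below the biological variation of interest.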

  7. iTRAQ-based quantitative proteomic analysis reveals proteomic changes in three fenoxaprop-P-ethyl-resistant Beckmannia syzigachne biotypes with differing ACCase mutations.

    PubMed

    Pan, Lang; Zhang, Jian; Wang, Junzhi; Yu, Qin; Bai, Lianyang; Dong, Liyao

    2017-05-08

    American sloughgrass (Beckmannia syzigachne Steud.) is a weed widely distributed in wheat fields of China. In recent years, the evolution of herbicide (fenoxaprop-P-ethyl)-resistant populations has decreased the susceptibility of B. syzigachne. This study compared 4 B. syzigachne populations (3 resistant and 1 susceptible) using iTRAQ to characterize fenoxaprop-P-ethyl resistance in B. syzigachne at the proteomic level. Through searching the UniProt database, 3104 protein species were identified from 13,335 unique peptides. Approximately 2834 protein species were assigned to 23 functional classifications provided by the COG database. Among these, 2299 protein species were assigned to 125 predicted pathways. The resistant biotype contained 8 protein species that changed in abundance relative to the susceptible biotype; they were involved in photosynthesis, oxidative phosphorylation, and fatty acid biosynthesis pathways. In contrast to previous studies comparing only 1 resistant and 1 susceptible population, our use of 3 fenoxaprop-resistant B. syzigachne populations with different genetic backgrounds minimized irrelevant differential expression and eliminated false positives. Therefore, we could more confidently link the differentially expressed proteins to herbicide resistance. Proteomic analysis demonstrated that fenoxaprop-P-ethyl resistance is associated with photosynthetic capacity, a connection that might be related to the target-site mutations in resistant B. syzigachne. This is the first large-scale proteomics study examining herbicide stress responses in different B. syzigachne biotypes. This study has biological relevance because it is the first to employ proteomic analysis for understanding the mechanisms underlying Beckmannia syzigachne herbicide resistance. The plant is a major weed in China and negatively affects crop yield, but has developed considerable resistance to the most common herbicide, fenoxaprop-P-ethyl. 
    Through comparisons of resistant and sensitive biotypes, our study identified multiple proteins (involved in photosynthesis, oxidative phosphorylation, and fatty acid biosynthesis) that are putatively linked to B. syzigachne herbicide response. This large-scale proteomics study, sorely lacking in weed science, contributes valuable data that can be applied to more fine-tuned analyses on the functions of specific proteins in herbicide resistance. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. Automation of nanoflow liquid chromatography-tandem mass spectrometry for proteome analysis by using a strong cation exchange trap column.

    PubMed

    Jiang, Xiaogang; Feng, Shun; Tian, Ruijun; Han, Guanghui; Jiang, Xinning; Ye, Mingliang; Zou, Hanfa

    2007-02-01

    An approach was developed to automate sample introduction for nanoflow LC-MS/MS (µLC-MS/MS) analysis using a strong cation exchange (SCX) trap column. The system consisted of a 100 µm i.d. × 2 cm SCX trap column and a 75 µm i.d. × 12 cm C18 RP analytical column. During the sample loading step, the flow passing through the SCX trap column was directed to waste so that a large volume of sample could be loaded at a high flow rate. The peptides bound on the SCX trap column were then eluted onto the RP analytical column by a high salt buffer, followed by RP chromatographic separation of the peptides at a nanoliter flow rate. Higher separation performance was achieved with the system using an SCX trap column than with the system using a C18 trap column. The high proteomic coverage of this approach was demonstrated in the analysis of tryptic digests of BSA and yeast cell lysate. In addition, the system was applied to two-dimensional separation of a tryptic digest of the human hepatocellular carcinoma cell line SMMC-7721 for large-scale proteome analysis. The system was fully automated and required minimal changes to the current µLC-MS/MS system. It represents a promising platform for routine proteome analysis.

  9. Proteome Characterization of Leaves in Common Bean

    PubMed Central

    Robison, Faith M.; Heuberger, Adam L.; Brick, Mark A.; Prenni, Jessica E.

    2015-01-01

    Dry edible bean (Phaseolus vulgaris L.) is a globally relevant food crop. The bean genome was recently sequenced and annotated allowing for proteomics investigations aimed at characterization of leaf phenotypes important to agriculture. The objective of this study was to utilize a shotgun proteomics approach to characterize the leaf proteome and to identify protein abundance differences between two bean lines with known variation in their physiological resistance to biotic stresses. Overall, 640 proteins were confidently identified. Among these are proteins known to be involved in a variety of molecular functions including oxidoreductase activity, binding peroxidase activity, and hydrolase activity. Twenty nine proteins were found to significantly vary in abundance (p-value < 0.05) between the two bean lines, including proteins associated with biotic stress. To our knowledge, this work represents the first large scale shotgun proteomic analysis of beans and our results lay the groundwork for future studies designed to investigate the molecular mechanisms involved in pathogen resistance. PMID:28248269

  10. Large-scale inference of protein tissue origin in gram-positive sepsis plasma using quantitative targeted proteomics

    PubMed Central

    Malmström, Erik; Kilsgård, Ola; Hauri, Simon; Smeds, Emanuel; Herwald, Heiko; Malmström, Lars; Malmström, Johan

    2016-01-01

    The plasma proteome is highly dynamic and variable, composed of proteins derived from surrounding tissues and cells. To investigate the complex processes that control the composition of the plasma proteome, we developed a mass spectrometry-based proteomics strategy to infer the origin of proteins detected in murine plasma. The strategy relies on the construction of a comprehensive protein tissue atlas from cells and highly vascularized organs using shotgun mass spectrometry. The protein tissue atlas was transformed to a spectral library for highly reproducible quantification of tissue-specific proteins directly in plasma using SWATH-like data-independent mass spectrometry analysis. We show that the method can determine drastic changes of tissue-specific protein profiles in blood plasma from mouse animal models with sepsis. The strategy can be extended to several other species advancing our understanding of the complex processes that contribute to the plasma proteome dynamics. PMID:26732734

  11. CPTC and KIST Join Efforts to Solve Complex Proteomic Issues | Office of Cancer Clinical Proteomics Research

    Cancer.gov

    The National Cancer Institute's (NCI) Clinical Proteomic Technologies for Cancer (CPTC) initiative at the National Institutes of Health has entered into a memorandum of understanding (MOU) with the Korea Institute of Science and Technology (KIST). This MOU promotes proteomic technology optimization and standards implementation in large-scale international programs.

  12. Large scale systematic proteomic quantification from non-metastatic to metastatic colorectal cancer

    NASA Astrophysics Data System (ADS)

    Yin, Xuefei; Zhang, Yang; Guo, Shaowen; Jin, Hong; Wang, Wenhai; Yang, Pengyuan

    2015-07-01

    A large-scale, systematic proteomic quantification of formalin-fixed, paraffin-embedded (FFPE) colorectal cancer tissues from stage I to stage IIIC was performed. With the label-free method, 1017 proteins were identified, of which 338 showed quantitative changes; with the iTRAQ method, 341 of 6294 proteins were quantified with significant expression changes. According to gene ontology (GO) annotation and ingenuity pathway analysis (IPA), the expression of migration-related proteins increased, while that of proteins for binding and adhesion decreased, during colorectal cancer development. We focused on integrin alpha 5 (ITA5) of the integrin family, which was consistent with the metastasis-related pathway. The expression level of ITA5 decreased in metastatic tissues, and the result was further verified by Western blotting. Two other cell migration-related proteins, vitronectin (VTN) and actin-related protein (ARP3), were also shown to be up-regulated by both mass spectrometry (MS)-based quantification and Western blotting. To date, our results represent one of the largest datasets in colorectal cancer proteomics research. Our strategy reveals a disease-driven omics pattern for metastatic colorectal cancer.

  13. Comparison of Collisional and Electron-Based Dissociation Modes for Middle-Down Analysis of Multiply Glycosylated Peptides

    NASA Astrophysics Data System (ADS)

    Khatri, Kshitij; Pu, Yi; Klein, Joshua A.; Wei, Juan; Costello, Catherine E.; Lin, Cheng; Zaia, Joseph

    2018-04-01

    Analysis of singly glycosylated peptides has evolved to a point where large-scale LC-MS analyses can be performed at almost the same scale as proteomics experiments. While collisionally activated dissociation (CAD) remains the mainstay of bottom-up analyses, it performs poorly for the middle-down analysis of multiply glycosylated peptides. With improvements in instrumentation, electron-activated dissociation (ExD) modes are becoming increasingly prevalent for proteomics experiments and for the analysis of fragile modifications such as glycosylation. While these methods have been applied for glycopeptide analysis in isolated studies, an organized effort to compare their efficiencies, particularly for analysis of multiply glycosylated peptides (termed here middle-down glycoproteomics), has not been made. We therefore compared the performance of different ExD modes for middle-down glycopeptide analyses. We identified key features among the different dissociation modes and show that increased electron energy and supplemental activation provide the most useful data for middle-down glycopeptide analysis.

  14. Large-scale label-free quantitative proteomics of the pea aphid-Buchnera symbiosis.

    PubMed

    Poliakov, Anton; Russell, Calum W; Ponnala, Lalit; Hoops, Harold J; Sun, Qi; Douglas, Angela E; van Wijk, Klaas J

    2011-06-01

    Many insects are nutritionally dependent on symbiotic microorganisms that have tiny genomes and are housed in specialized host cells called bacteriocytes. The obligate symbiosis between the pea aphid Acyrthosiphon pisum and the γ-proteobacterium Buchnera aphidicola (only 584 predicted proteins) is particularly amenable for molecular analysis because the genomes of both partners have been sequenced. To better define the symbiotic relationship between this aphid and Buchnera, we used large-scale, high accuracy tandem mass spectrometry (nanoLC-LTQ-Orbitrap) to identify aphid and Buchnera proteins in the whole aphid body, purified bacteriocytes, isolated Buchnera cells and the residual bacteriocyte fraction. More than 1900 aphid and 400 Buchnera proteins were identified. All enzymes in amino acid metabolism annotated in the Buchnera genome were detected, reflecting the high (68%) coverage of the proteome and supporting the core function of Buchnera in the aphid symbiosis. Transporters mediating the transport of predicted metabolites were present in the bacteriocyte. Label-free spectral counting combined with hierarchical clustering allowed us to define the quantitative distribution of a subset of these proteins across both symbiotic partners, yielding no evidence for the selective transfer of protein among the partners in either direction. This is the first quantitative proteome analysis of bacteriocyte symbiosis, providing a wealth of information about molecular function of both the host cell and bacterial symbiont.

  15. Identification of Maturation-Specific Proteins by Single-Cell Proteomics of Human Oocytes

    PubMed Central

    Virant-Klun, Irma; Leicht, Stefan; Hughes, Christopher; Krijgsveld, Jeroen

    2016-01-01

    Oocytes undergo a range of complex processes via oogenesis, maturation, fertilization, and early embryonic development, eventually giving rise to a fully functioning organism. To understand proteome composition and diversity during maturation of human oocytes, here we have addressed crucial aspects of oocyte collection and proteome analysis, resulting in the first proteome and secretome maps of human oocytes. Starting from 100 oocytes collected via a novel serum-free hanging drop culture system, we identified 2,154 proteins, whose functions indicate that oocytes are largely resting cells with a proteome that is tailored for homeostasis, cellular attachment, and interaction with its environment via secretory factors. In addition, we have identified 158 oocyte-enriched proteins (such as ECAT1, PIWIL3, NLRP7) not observed in high-coverage proteomics studies of other human cell lines or tissues. Exploiting SP3, a novel technology for proteomic sample preparation using magnetic beads, we scaled down proteome analysis to single cells. Despite the low protein content of only ∼100 ng per cell, we consistently identified ∼450 proteins from individual oocytes. When comparing individual oocytes at the germinal vesicle (GV) and metaphase II (MII) stage, we found that the Tudor and KH domain-containing protein (TDRKH) is preferentially expressed in immature oocytes, while Wee2, PCNA, and DNMT1 were enriched in mature cells, collectively indicating that maintenance of genome integrity is crucial during oocyte maturation. This study demonstrates that an innovative proteomics workflow facilitates analysis of single human oocytes to investigate human oocyte biology and preimplantation development. The approach presented here paves the way for quantitative proteomics in other quantity-limited tissues and cell types. Data associated with this study are available via ProteomeXchange with identifier PXD004142. PMID:27215607

  16. Identification of Maturation-Specific Proteins by Single-Cell Proteomics of Human Oocytes.

    PubMed

    Virant-Klun, Irma; Leicht, Stefan; Hughes, Christopher; Krijgsveld, Jeroen

    2016-08-01

    Oocytes undergo a range of complex processes via oogenesis, maturation, fertilization, and early embryonic development, eventually giving rise to a fully functioning organism. To understand proteome composition and diversity during maturation of human oocytes, here we have addressed crucial aspects of oocyte collection and proteome analysis, resulting in the first proteome and secretome maps of human oocytes. Starting from 100 oocytes collected via a novel serum-free hanging drop culture system, we identified 2,154 proteins, whose functions indicate that oocytes are largely resting cells with a proteome that is tailored for homeostasis, cellular attachment, and interaction with its environment via secretory factors. In addition, we have identified 158 oocyte-enriched proteins (such as ECAT1, PIWIL3, NLRP7) not observed in high-coverage proteomics studies of other human cell lines or tissues. Exploiting SP3, a novel technology for proteomic sample preparation using magnetic beads, we scaled down proteome analysis to single cells. Despite the low protein content of only ∼100 ng per cell, we consistently identified ∼450 proteins from individual oocytes. When comparing individual oocytes at the germinal vesicle (GV) and metaphase II (MII) stage, we found that the Tudor and KH domain-containing protein (TDRKH) is preferentially expressed in immature oocytes, while Wee2, PCNA, and DNMT1 were enriched in mature cells, collectively indicating that maintenance of genome integrity is crucial during oocyte maturation. This study demonstrates that an innovative proteomics workflow facilitates analysis of single human oocytes to investigate human oocyte biology and preimplantation development. The approach presented here paves the way for quantitative proteomics in other quantity-limited tissues and cell types. Data associated with this study are available via ProteomeXchange with identifier PXD004142. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.

  17. Top-Down Characterization of the Post-Translationally Modified Intact Periplasmic Proteome from the Bacterium Novosphingobium aromaticivorans

    DOE PAGES

    Wu, Si; Brown, Roslyn N.; Payne, Samuel H.; ...

    2013-01-01

    The periplasm of Gram-negative bacteria is a dynamic and physiologically important subcellular compartment where the constant exposure to potential environmental insults amplifies the need for proper protein folding and modifications. Top-down proteomics analysis of the periplasmic fraction at the intact protein level provides unrestricted characterization and annotation of the periplasmic proteome, including the post-translational modifications (PTMs) on these proteins. Here, we used single-dimension ultra-high pressure liquid chromatography coupled with the Fourier transform mass spectrometry (FTMS) to investigate the intact periplasmic proteome of Novosphingobium aromaticivorans. Our top-down analysis provided the confident identification of 55 proteins in the periplasm and characterized their PTMs including signal peptide removal, N-terminal methionine excision, acetylation, glutathionylation, pyroglutamate, and disulfide bond formation. This study provides the first experimental evidence for the expression and periplasmic localization of many hypothetical and uncharacterized proteins and the first unrestrictive, large-scale data on PTMs in the bacterial periplasm.

  18. MAPU: Max-Planck Unified database of organellar, cellular, tissue and body fluid proteomes.

    PubMed

    Zhang, Yanling; Zhang, Yong; Adachi, Jun; Olsen, Jesper V; Shi, Rong; de Souza, Gustavo; Pasini, Erica; Foster, Leonard J; Macek, Boris; Zougman, Alexandre; Kumar, Chanchal; Wisniewski, Jacek R; Jun, Wang; Mann, Matthias

    2007-01-01

    Mass spectrometry (MS)-based proteomics has become a powerful technology to map the protein composition of organelles, cell types and tissues. In our department, a large-scale effort to map these proteomes is complemented by the Max-Planck Unified (MAPU) proteome database. MAPU contains several body fluid proteomes, including plasma, urine, and cerebrospinal fluid. Cell lines have been mapped to a depth of several thousand proteins and the red blood cell proteome has also been analyzed in depth. The liver proteome is represented with 3200 proteins. By employing high resolution MS and stringent validation criteria, false positive identification rates in MAPU are lower than 1:1000. Thus MAPU datasets can serve as reference proteomes in biomarker discovery. MAPU contains the peptides identifying each protein, measured masses, scores and intensities and is freely available at http://www.mapuproteome.com using a clickable interface of cell or body parts. Proteome data can be queried across proteomes by protein name, accession number, sequence similarity, peptide sequence and annotation information. More than 4500 mouse and 2500 human proteins have already been identified in at least one proteome. Basic annotation information and links to other public databases are provided in MAPU and we plan to add further analysis tools.

  19. Next-Generation Proteomics and Its Application to Clinical Breast Cancer Research.

    PubMed

    Mardamshina, Mariya; Geiger, Tamar

    2017-10-01

    Proteomics technology aims to map the protein landscapes of biological samples, and it can be applied to a variety of samples, including cells, tissues, and body fluids. Because the proteins are the main functional molecules in the cells, their levels reflect much more accurately the cellular phenotype and the regulatory processes within them than gene levels, mutations, and even mRNA levels. With the advancement in the technology, it is possible now to obtain comprehensive views of the biological systems and to study large patient cohorts in a streamlined manner. In this review we discuss the technological advancements in mass spectrometry-based proteomics, which allow analysis of breast cancer tissue samples, leading to the first large-scale breast cancer proteomics studies. Furthermore, we discuss the technological developments in blood-based biomarker discovery, which provide the basis for future development of assays for routine clinical use. Although these are only the first steps in implementation of proteomics into the clinic, extensive collaborative work between these worlds will undoubtedly lead to major discoveries and advances in clinical practice. Copyright © 2017 American Society for Investigative Pathology. Published by Elsevier Inc. All rights reserved.

  20. Transformative Impact of Proteomics on Cardiovascular Health and Disease: A Scientific Statement From the American Heart Association.

    PubMed

    Lindsey, Merry L; Mayr, Manuel; Gomes, Aldrin V; Delles, Christian; Arrell, D Kent; Murphy, Anne M; Lange, Richard A; Costello, Catherine E; Jin, Yu-Fang; Laskowitz, Daniel T; Sam, Flora; Terzic, Andre; Van Eyk, Jennifer; Srinivas, Pothur R

    2015-09-01

    The year 2014 marked the 20th anniversary of the coining of the term proteomics. The purpose of this scientific statement is to summarize advances over this period that have catalyzed our capacity to address the experimental, translational, and clinical implications of proteomics as applied to cardiovascular health and disease and to evaluate the current status of the field. Key successes that have energized the field are delineated; opportunities for proteomics to drive basic science research, facilitate clinical translation, and establish diagnostic and therapeutic healthcare algorithms are discussed; and challenges that remain to be solved before proteomic technologies can be readily translated from scientific discoveries to meaningful advances in cardiovascular care are addressed. Proteomics is the result of disruptive technologies, namely, mass spectrometry and database searching, which drove protein analysis from 1 protein at a time to protein mixture analyses that enable large-scale analysis of proteins and facilitate paradigm shifts in biological concepts that address important clinical questions. Over the past 20 years, the field of proteomics has matured, yet it is still developing rapidly. The scope of this statement will extend beyond the reaches of a typical review article and offer guidance on the use of next-generation proteomics for future scientific discovery in the basic research laboratory and clinical settings. © 2015 American Heart Association, Inc.

  1. Intact mass detection, interpretation, and visualization to automate Top-Down proteomics on a large scale

    PubMed Central

    Durbin, Kenneth R.; Tran, John C.; Zamdborg, Leonid; Sweet, Steve M. M.; Catherman, Adam D.; Lee, Ji Eun; Li, Mingxi; Kellie, John F.; Kelleher, Neil L.

    2011-01-01

    Applying high-throughput Top-Down MS to an entire proteome requires a yet-to-be-established model for data processing. Since Top-Down is becoming possible on a large scale, we report our latest software pipeline dedicated to capturing the full value of intact protein data in automated fashion. For intact mass detection, we combine algorithms for processing MS1 data from both isotopically resolved (FT) and charge-state resolved (ion trap) LC-MS data, which are then linked to their fragment ions for database searching using ProSight. Automated determination of human keratin and tubulin isoforms is one result. Optimized for the intricacies of whole proteins, new software modules visualize proteome-scale data based on the LC retention time and intensity of intact masses and enable selective detection of PTMs to automatically screen for acetylation, phosphorylation, and methylation. Software functionality was demonstrated using comparative LC-MS data from yeast strains in addition to human cells undergoing chemical stress. We further these advances as a key aspect of realizing Top-Down MS on a proteomic scale. PMID:20848673

  2. Current trends in quantitative proteomics - an update.

    PubMed

    Li, H; Han, J; Pan, J; Liu, T; Parker, C E; Borchers, C H

    2017-05-01

    Proteins can provide insights into biological processes at the functional level, so they are very promising biomarker candidates. The quantification of proteins in biological samples has been routinely used for the diagnosis of diseases and monitoring the treatment. Although large-scale protein quantification in complex samples is still a challenging task, a great amount of effort has been made to advance the technologies that enable quantitative proteomics. Seven years ago, in 2009, we wrote an article about the current trends in quantitative proteomics. In writing this current paper, we realized that, today, we have an even wider selection of potential tools for quantitative proteomics. These tools include new derivatization reagents, novel sampling formats, new types of analyzers and scanning techniques, and recently developed software to assist in assay development and data analysis. In this review article, we will discuss these innovative methods, and their current and potential applications in proteomics. Copyright © 2017 John Wiley & Sons, Ltd.

  3. A multi-protease, multi-dissociation, bottom-up-to-top-down proteomic view of the Loxosceles intermedia venom

    PubMed Central

    Trevisan-Silva, Dilza; Bednaski, Aline V.; Fischer, Juliana S.G.; Veiga, Silvio S.; Bandeira, Nuno; Guthals, Adrian; Marchini, Fabricio K.; Leprevost, Felipe V.; Barbosa, Valmir C.; Senff-Ribeiro, Andrea; Carvalho, Paulo C.

    2017-01-01

    Venoms are a rich source for the discovery of molecules with biotechnological applications, but their analysis is challenging even for state-of-the-art proteomics. Here we report on a large-scale proteomic assessment of the venom of Loxosceles intermedia, the so-called brown spider. Venom was extracted from 200 spiders and fractioned into two aliquots relative to a 10 kDa cutoff mass. Each of these was further fractioned and digested with trypsin (4 h), trypsin (18 h), pepsin (18 h), and chymotrypsin (18 h), then analyzed by MudPIT on an LTQ-Orbitrap XL ETD mass spectrometer fragmenting precursors by CID, HCD, and ETD. Aliquots of undigested samples were also analyzed. Our experimental design allowed us to apply spectral networks, thus enabling us to obtain meta-contig assemblies, and consequently de novo sequencing of practically complete proteins, culminating in a deep proteome assessment of the venom. Data are available via ProteomeXchange, with identifier PXD005523. PMID:28696408

  4. MALDI versus ESI: The Impact of the Ion Source on Peptide Identification.

    PubMed

    Nadler, Wiebke Maria; Waidelich, Dietmar; Kerner, Alexander; Hanke, Sabrina; Berg, Regina; Trumpp, Andreas; Rösli, Christoph

    2017-03-03

    For mass spectrometry-based proteomic analyses, electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI) are the commonly used ionization techniques. To investigate the influence of the ion source on peptide detection in large-scale proteomics, an optimized GeLC/MS workflow was developed and applied either with ESI/MS or with MALDI/MS for the proteomic analysis of different human cell lines of pancreatic origin. Statistical analysis of the resulting data set with more than 72 000 peptides emphasized the complementary character of the two methods, as the percentage of peptides identified with both approaches was as low as 39%. Significant differences between the resulting peptide sets were observed with respect to amino acid composition, charge-related parameters, hydrophobicity, and modifications of the detected peptides and could be linked to factors governing the respective ion yields in ESI and MALDI.
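
    The complementarity reported above comes down to a set comparison of the peptides identified with each ion source. The sketch below assumes the shared percentage is taken relative to the union of both peptide sets, which the abstract does not specify; the peptide sequences are placeholders.

```python
def overlap_summary(esi_peptides, maldi_peptides):
    """Summarize how two peptide identification sets overlap.

    The shared percentage is computed against the union of both sets,
    which is an assumption about how such a figure would be defined.
    """
    esi, maldi = set(esi_peptides), set(maldi_peptides)
    union = esi | maldi
    if not union:
        raise ValueError("both peptide sets are empty")
    shared = esi & maldi
    return {
        "shared_pct": 100.0 * len(shared) / len(union),
        "esi_only": len(esi - maldi),
        "maldi_only": len(maldi - esi),
    }


# Placeholder sequences standing in for real identifications.
esi = ["AAK", "CCR", "DDK"]
maldi = ["CCR", "DDK", "EER", "FFK"]
summary = overlap_summary(esi, maldi)
```

    The source-specific subsets (`esi - maldi` and `maldi - esi`) are the groups whose amino acid composition, charge and hydrophobicity would then be compared.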

  5. Advancing Cell Biology Through Proteomics in Space and Time (PROSPECTS)*

    PubMed Central

    Lamond, Angus I.; Uhlen, Mathias; Horning, Stevan; Makarov, Alexander; Robinson, Carol V.; Serrano, Luis; Hartl, F. Ulrich; Baumeister, Wolfgang; Werenskiold, Anne Katrin; Andersen, Jens S.; Vorm, Ole; Linial, Michal; Aebersold, Ruedi; Mann, Matthias

    2012-01-01

    The term “proteomics” encompasses the large-scale detection and analysis of proteins and their post-translational modifications. Driven by major improvements in mass spectrometric instrumentation, methodology, and data analysis, the proteomics field has burgeoned in recent years. It now provides a range of sensitive and quantitative approaches for measuring protein structures and dynamics that promise to revolutionize our understanding of cell biology and molecular mechanisms in both human cells and model organisms. The Proteomics Specification in Time and Space (PROSPECTS) Network is a unique EU-funded project that brings together leading European research groups, spanning from instrumentation to biomedicine, in a collaborative five-year initiative to develop new methods and applications for the functional analysis of cellular proteins. This special issue of Molecular and Cellular Proteomics presents 16 research papers reporting major recent progress by the PROSPECTS groups, including improvements to the resolution and sensitivity of the Orbitrap family of mass spectrometers, systematic detection of proteins using highly characterized antibody collections, and new methods for absolute as well as relative quantification of protein levels. Manuscripts in this issue exemplify approaches for performing quantitative measurements of cell proteomes and for studying their dynamic responses to perturbation, both during normal cellular responses and in disease mechanisms. Here we present a perspective on how the proteomics field is moving beyond simply identifying proteins with high sensitivity toward providing a powerful and versatile set of assay systems for characterizing proteome dynamics and thereby creating a new “third generation” proteomics strategy that offers an indispensable tool for cell biology and molecular medicine. PMID:22311636

  6. TRIC: an automated alignment strategy for reproducible protein quantification in targeted proteomics

    PubMed Central

    Röst, Hannes L.; Liu, Yansheng; D’Agostino, Giuseppe; Zanella, Matteo; Navarro, Pedro; Rosenberger, George; Collins, Ben C.; Gillet, Ludovic; Testa, Giuseppe; Malmström, Lars; Aebersold, Ruedi

    2016-01-01

    Large scale, quantitative proteomic studies have become essential for the analysis of clinical cohorts, large perturbation experiments and systems biology studies. While next-generation mass spectrometric techniques such as SWATH-MS have substantially increased throughput and reproducibility, ensuring consistent quantification of thousands of peptide analytes across multiple LC-MS/MS runs remains a challenging and laborious manual process. To produce highly consistent and quantitatively accurate proteomics data matrices in an automated fashion, we have developed the TRIC software which utilizes fragment ion data to perform cross-run alignment, consistent peak-picking and quantification for high throughput targeted proteomics. TRIC uses a graph-based alignment strategy based on non-linear retention time correction to integrate peak elution information from all LC-MS/MS runs acquired in a study. When compared to state-of-the-art SWATH-MS data analysis, the algorithm was able to reduce the identification error by more than 3-fold at constant recall, while correcting for highly non-linear chromatographic effects. On a pulsed-SILAC experiment performed on human induced pluripotent stem (iPS) cells, TRIC was able to automatically align and quantify thousands of light and heavy isotopic peak groups and substantially increased the quantitative completeness and biological information in the data, providing insights into protein dynamics of iPS cells. Overall, this study demonstrates the importance of consistent quantification in highly challenging experimental setups, and proposes an algorithm to automate this task, constituting the last missing piece in a pipeline for automated analysis of massively parallel targeted proteomics datasets. PMID:27479329

  7. Nanoliter-Scale Oil-Air-Droplet Chip-Based Single Cell Proteomic Analysis.

    PubMed

    Li, Zi-Yi; Huang, Min; Wang, Xiu-Kun; Zhu, Ying; Li, Jin-Song; Wong, Catherine C L; Fang, Qun

    2018-04-17

    Single cell proteomic analysis provides crucial information on cellular heterogeneity in biological systems. Herein, we describe a nanoliter-scale oil-air-droplet (OAD) chip for achieving multistep complex sample pretreatment and injection for single cell proteomic analysis in the shotgun mode. By using miniaturized stationary droplet microreaction and manipulation techniques, our system allows all sample pretreatment and injection procedures to be performed in a nanoliter-scale droplet with minimum sample loss and a high sample injection efficiency (>99%), thus substantially increasing the analytical sensitivity for single cell samples. We applied the present system in the proteomic analysis of 100 ± 10, 50 ± 5, 10, and 1 HeLa cell(s), and protein IDs of 1360, 612, 192, and 51 were identified, respectively. The OAD chip-based system was further applied in single mouse oocyte analysis, with 355 protein IDs identified at the single oocyte level, which demonstrated its special advantages of high enrichment of sequence coverage, hydrophobic proteins, and enzymatic digestion efficiency over the traditional in-tube system.

  8. Stable isotope dimethyl labelling for quantitative proteomics and beyond

    PubMed Central

    Hsu, Jue-Liang; Chen, Shu-Hui

    2016-01-01

    Stable-isotope reductive dimethylation, a cost-effective, simple, robust, reliable and easy-to-multiplex labelling method, is widely applied to quantitative proteomics using liquid chromatography-mass spectrometry. This review focuses on biological applications of stable-isotope dimethyl labelling for a large-scale comparative analysis of protein expression and post-translational modifications based on its unique properties of the labelling chemistry. Some other applications of the labelling method for sample preparation and mass spectrometry-based protein identification and characterization are also summarized. This article is part of the themed issue ‘Quantitative mass spectrometry’. PMID:27644970

  9. Explore, Visualize, and Analyze Functional Cancer Proteomic Data Using the Cancer Proteome Atlas.

    Cancer.gov

    Reverse-phase protein arrays (RPPA) represent a powerful functional proteomic approach to elucidate cancer-related molecular mechanisms and to develop novel cancer therapies. To facilitate community-based investigation of the large-scale protein expression data generated by this platform, we have developed a user-friendly, open-access bioinformatic resource, The Cancer Proteome Atlas (TCPA, http://tcpaportal.org), which contains two separate web applications.

  10. QC-ART: A tool for real-time quality control assessment of mass spectrometry-based proteomics data.

    PubMed

    Stanfill, Bryan A; Nakayasu, Ernesto S; Bramer, Lisa M; Thompson, Allison M; Ansong, Charles K; Clauss, Therese; Gritsenko, Marina A; Monroe, Matthew E; Moore, Ronald J; Orton, Daniel J; Piehowski, Paul D; Schepmoes, Athena A; Smith, Richard D; Webb-Robertson, Bobbie-Jo; Metz, Thomas O

    2018-04-17

    Liquid chromatography-mass spectrometry (LC-MS)-based proteomics studies of large sample cohorts can easily require from months to years to complete. Acquiring consistent, high-quality data in such large-scale studies is challenging because of normal variations in instrumentation performance over time, as well as artifacts introduced by the samples themselves, such as those due to collection, storage and processing. Existing quality control methods for proteomics data primarily focus on post-hoc analysis to remove low-quality data that would degrade downstream statistics; they are not designed to evaluate the data in near real-time, which would allow for interventions as soon as deviations in data quality are detected. In addition to flagging analyses that demonstrate outlier behavior, evaluating how the data structure changes over time can aid in understanding typical instrument performance or in identifying issues such as a degradation in data quality due to the need for instrument cleaning and/or re-calibration. To address this gap for proteomics, we developed Quality Control Analysis in Real-Time (QC-ART), a tool for evaluating data as they are acquired in order to dynamically flag potential issues with instrument performance or sample quality. QC-ART has accuracy similar to that of standard post-hoc analysis methods with the additional benefit of real-time analysis. We demonstrate the utility and performance of QC-ART in identifying deviations in data quality due to both instrument and sample issues in near real-time for LC-MS-based plasma proteomics analyses of a sample subset of The Environmental Determinants of Diabetes in the Young cohort. We also present a case where QC-ART facilitated the identification of oxidative modifications, which are often underappreciated in proteomic experiments. Published under license by The American Society for Biochemistry and Molecular Biology, Inc.

  11. Functional Module Search in Protein Networks based on Semantic Similarity Improves the Analysis of Proteomics Data

    PubMed Central

    Boyanova, Desislava; Nilla, Santosh; Klau, Gunnar W.; Dandekar, Thomas; Müller, Tobias; Dittrich, Marcus

    2014-01-01

    The continuously evolving field of proteomics produces increasing amounts of data while improving the quality of protein identifications. Although quantitative measurements are becoming more popular, many proteomic studies are still based on non-quantitative methods for protein identification. These studies result in potentially large sets of identified proteins, where the biological interpretation of proteins can be challenging. Systems biology develops innovative network-based methods, which allow an integrated analysis of these data. Here we present a novel approach, which combines prior knowledge of protein-protein interactions (PPI) with proteomics data using functional similarity measurements of interacting proteins. This integrated network analysis exactly identifies network modules with a maximal consistent functional similarity reflecting biological processes of the investigated cells. We validated our approach on small (H9N2 virus-infected gastric cells) and large (blood constituents) proteomic data sets. Using this novel algorithm, we identified characteristic functional modules in virus-infected cells, comprising key signaling proteins (e.g. the stress-related kinase RAF1), and demonstrate that this method allows a module-based functional characterization of cell types. Analysis of a large proteome data set of blood constituents resulted in clear separation of blood cells according to their developmental origin. A detailed investigation of the T-cell proteome further illustrates how the algorithm partitions large networks into functional subnetworks each representing specific cellular functions. These results demonstrate that the integrated network approach not only allows a detailed analysis of proteome networks but also yields a functional decomposition of complex proteomic data sets and thereby provides deeper insights into the underlying cellular processes of the investigated system. PMID:24807868

  12. A Review: Proteomics in Retinal Artery Occlusion, Retinal Vein Occlusion, Diabetic Retinopathy and Acquired Macular Disorders.

    PubMed

    Cehofski, Lasse Jørgensen; Honoré, Bent; Vorum, Henrik

    2017-04-28

    Retinal artery occlusion (RAO), retinal vein occlusion (RVO), diabetic retinopathy (DR) and age-related macular degeneration (AMD) are frequent ocular diseases with potentially sight-threatening outcomes. In the present review we discuss major findings of proteomic studies of RAO, RVO, DR and AMD, including an overview of ocular proteome changes associated with anti-vascular endothelial growth factor (VEGF) treatments. Despite the severe outcomes of RAO, the proteome of the disease remains largely unstudied. There is also limited knowledge about the proteome of RVO, but proteomic studies suggest that RVO is associated with remodeling of the extracellular matrix and adhesion processes. Proteomic studies of DR have resulted in the identification of potential therapeutic targets such as carbonic anhydrase-I. Proliferative diabetic retinopathy is the most intensively studied stage of DR. Proteomic studies have established VEGF, pigment epithelium-derived factor (PEDF) and complement components as key factors associated with AMD. The aim of this review is to highlight the major milestones in proteomics in RAO, RVO, DR and AMD. Through large-scale protein analyses, proteomics is bringing new important insights into these complex pathological conditions.

  13. PTMscape: an open source tool to predict generic post-translational modifications and map modification crosstalk in protein domains and biological processes.

    PubMed

    Li, Ginny X H; Vogel, Christine; Choi, Hyungwon

    2018-06-07

    While tandem mass spectrometry can detect post-translational modifications (PTM) at the proteome scale, reported PTM sites are often incomplete and include false positives. Computational approaches can complement these datasets by additional predictions, but most available tools use prediction models pre-trained for a single PTM type by the developers, and it remains a difficult task to perform large-scale batch prediction for multiple PTMs with flexible user control, including the choice of training data. We developed an R package called PTMscape which predicts PTM sites across the proteome based on a unified and comprehensive set of descriptors of the physico-chemical microenvironment of modified sites, with additional downstream analysis modules to test enrichment of individual or pairs of PTMs in protein domains. PTMscape is flexible in the ability to process any major modifications, such as phosphorylation and ubiquitination, while achieving sensitivity and specificity comparable to single-PTM methods and outperforming other multi-PTM tools. Applying this framework, we expanded proteome-wide coverage of five major PTMs affecting different residues by prediction, especially for lysine and arginine modifications. Using a combination of experimentally acquired sites (PSP) and newly predicted sites, we discovered that crosstalk among multiple PTMs occurs more frequently than by random chance in key protein domains such as histone, protein kinase, and RNA recognition motifs, spanning various biological processes such as RNA processing, DNA damage response, signal transduction, and regulation of cell cycle. These results provide a proteome-scale analysis of crosstalk among major PTMs and can be easily extended to other types of PTM.

  14. Highly multiplexed targeted proteomics using precise control of peptide retention time.

    PubMed

    Gallien, Sebastien; Peterman, Scott; Kiyonami, Reiko; Souady, Jamal; Duriez, Elodie; Schoen, Alan; Domon, Bruno

    2012-04-01

    Large-scale proteomics applications using SRM analysis on triple quadrupole mass spectrometers present new challenges to LC-MS/MS experimental design. Despite the automation of building large-scale LC-SRM methods, the increased numbers of targeted peptides can compromise the balance between sensitivity and selectivity. To facilitate large target numbers, time-scheduled SRM transition acquisition is performed. Previously published results have demonstrated that incorporation of a well-characterized set of synthetic peptides enabled chromatographic characterization of the elution profile for most endogenous peptides. We have extended this application of peptide trainer kits not only to build SRM methods but also to facilitate real-time elution profile characterization that enables automated adjustment of the scheduled detection windows. Incorporation of dynamic retention time adjustments better facilitates targeted assays lasting several days without the need for constant supervision. This paper provides an overview of how the dynamic retention correction approach identifies and corrects for commonly observed LC variations. This adjustment dramatically improves robustness in targeted discovery experiments as well as routine quantification experiments. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  15. Design and analysis issues in quantitative proteomics studies.

    PubMed

    Karp, Natasha A; Lilley, Kathryn S

    2007-09-01

    Quantitative proteomics is the comparison of distinct proteomes which enables the identification of protein species which exhibit changes in expression or post-translational state in response to a given stimulus. Many different quantitative techniques are being utilized and generate large datasets. Independent of the technique used, these large datasets need robust data analysis to ensure valid conclusions are drawn from such studies. Approaches to address the problems that arise with large datasets are discussed to give insight into the types of statistical analyses of data appropriate for the various experimental strategies that can be employed by quantitative proteomic studies. This review also highlights the importance of employing a robust experimental design and highlights various issues surrounding the design of experiments. The concepts and examples discussed within will show how robust design and analysis will lead to confident results that will ensure quantitative proteomics delivers.

  16. Proteomics and Systems Biology: Current and Future Applications in the Nutritional Sciences

    PubMed Central

    Moore, J. Bernadette; Weeks, Mark E.

    2011-01-01

    In the last decade, advances in genomics, proteomics, and metabolomics have yielded large-scale datasets that have driven an interest in global analyses, with the objective of understanding biological systems as a whole. Systems biology integrates computational modeling and experimental biology to predict and characterize the dynamic properties of biological systems, which are viewed as complex signaling networks. Whereas the systems analysis of disease-perturbed networks holds promise for identification of drug targets for therapy, equally the identified critical network nodes may be targeted through nutritional intervention in either a preventative or therapeutic fashion. As such, in the context of the nutritional sciences, it is envisioned that systems analysis of normal and nutrient-perturbed signaling networks in combination with knowledge of underlying genetic polymorphisms will lead to a future in which the health of individuals will be improved through predictive and preventative nutrition. Although high-throughput transcriptomic microarray data were initially most readily available and amenable to systems analysis, recent technological and methodological advances in MS have contributed to a linear increase in proteomic investigations. It is now commonplace for combined proteomic technologies to generate complex, multi-faceted datasets, and these will be the keystone of future systems biology research. This review will define systems biology, outline current proteomic methodologies, highlight successful applications of proteomics in nutrition research, and discuss the challenges for future applications of systems biology approaches in the nutritional sciences. PMID:22332076

  17. Affordable proteomics: the two-hybrid systems.

    PubMed

    Gillespie, Marc

    2003-06-01

    Numerous proteomic methodologies exist, but most require a heavy investment in expertise and technology. This puts these approaches out of reach for many laboratories and small companies, rarely allowing proteomics to be used as a pilot approach for biomarker or target identification. Two proteomic approaches, 2D gel electrophoresis and the two-hybrid systems, are currently available to most researchers. The two-hybrid systems, though accommodating to large-scale experiments, were originally designed as practical screens, that by comparison to current proteomics tools were small-scale, affordable and technically feasible. The screens rapidly generated data, identifying protein interactions that were previously uncharacterized. The foundation for a two-hybrid proteomic investigation can be purchased as separate kits from a number of companies. The true power of the technique lies not in its affordability, but rather in its portability. The two-hybrid system puts proteomics back into laboratories where the output of the screens can be evaluated by researchers with experience in the particular fields of basic research, cancer biology, toxicology or drug development.

  18. From protein-protein interactions to protein co-expression networks: a new perspective to evaluate large-scale proteomic data.

    PubMed

    Vella, Danila; Zoppis, Italo; Mauri, Giancarlo; Mauri, Pierluigi; Di Silvestre, Dario

    2017-12-01

    The reductionist approach of dissecting biological systems into their constituents has been successful in the first stage of molecular biology to elucidate the chemical basis of several biological processes. This knowledge helped biologists to understand the complexity of biological systems, evidencing that most biological functions do not arise from individual molecules and that the emergent properties of biological systems cannot be explained or predicted by investigating individual molecules without taking into consideration their relations. Thanks to the improvement of current -omics technologies and the increasing understanding of molecular relationships, ever more studies are evaluating biological systems through approaches based on graph theory. Genomic and proteomic data are often combined with protein-protein interaction (PPI) networks whose structure is routinely analyzed by algorithms and tools to characterize hubs/bottlenecks and topological, functional, and disease modules. On the other hand, co-expression networks represent a complementary procedure that gives the opportunity to perform system-level evaluation, including for organisms that lack information on PPIs. Based on these premises, we introduce the reader to PPI and co-expression networks, including aspects of reconstruction and analysis. In particular, the new idea to evaluate large-scale proteomic data by means of co-expression networks will be discussed, presenting some examples of application. Their use to infer biological knowledge will be shown, and special attention will be devoted to topological and module analysis.

  19. Metaproteomics as a Complementary Approach to Gut Microbiota in Health and Disease

    NASA Astrophysics Data System (ADS)

    Petriz, Bernardo A.; Franco, Octávio L.

    2017-01-01

    Classic studies on phylotype profiling are limited to the identification of microbial constituents, where information is lacking about the molecular interaction of these bacterial communities with the host genome and the possible outcomes in host biology. A range of OMICs approaches have provided great progress linking the microbiota to health and disease. However, the investigation of this context through proteomic mass spectrometry-based tools is still being improved. Therefore, metaproteomics or community proteogenomics has emerged as a complementary approach to metagenomic data, as a field in proteomics aiming to perform large-scale characterization of proteins from environmental microbiota such as the human gut. The advances in molecular separation methods coupled with mass spectrometry (e.g. LC-MS/MS) and proteome bioinformatics have been fundamental in these novel large-scale metaproteomic studies, which have further been performed in a wide range of samples including soil, plant and human environments. Metaproteomic studies will make major progress if a comprehensive database covering the genes and expressed proteins from all gut microbial species is developed. To this end, we here present some of the main limitations of metaproteomic studies in complex microbiota environments such as the gut, also addressing the up-to-date pipelines in sample preparation prior to fractionation/separation and mass spectrometry analysis. In addition, a novel approach to the limitations of metagenomic databases is also discussed. Finally, prospects are addressed regarding the application of metaproteomic analysis using a unified host-microbiome gene database and other meta-OMICs platforms.

  20. A Scalable Approach for Protein False Discovery Rate Estimation in Large Proteomic Data Sets.

    PubMed

    Savitski, Mikhail M; Wilhelm, Mathias; Hahne, Hannes; Kuster, Bernhard; Bantscheff, Marcus

    2015-09-01

    Calculating the number of confidently identified proteins and estimating false discovery rate (FDR) is a challenge when analyzing very large proteomic data sets such as entire human proteomes. Biological and technical heterogeneity in proteomic experiments further add to the challenge and there are strong differences in opinion regarding the conceptual validity of a protein FDR and no consensus regarding the methodology for protein FDR determination. There are also limitations inherent to the widely used classic target-decoy strategy that particularly show when analyzing very large data sets and that lead to a strong over-representation of decoy identifications. In this study, we investigated the merits of the classic, as well as a novel target-decoy-based protein FDR estimation approach, taking advantage of a heterogeneous data collection comprised of ∼19,000 LC-MS/MS runs deposited in ProteomicsDB (https://www.proteomicsdb.org). The "picked" protein FDR approach treats target and decoy sequences of the same protein as a pair rather than as individual entities and chooses either the target or the decoy sequence depending on which receives the highest score. We investigated the performance of this approach in combination with q-value based peptide scoring to normalize sample-, instrument-, and search engine-specific differences. The "picked" target-decoy strategy performed best when protein scoring was based on the best peptide q-value for each protein yielding a stable number of true positive protein identifications over a wide range of q-value thresholds. We show that this simple and unbiased strategy eliminates a conceptual issue in the commonly used "classic" protein FDR approach that causes overprediction of false-positive protein identification in large data sets. The approach scales from small to very large data sets without losing performance, consistently increases the number of true-positive protein identifications and is readily implemented in proteomics analysis software. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
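
    The pairing rule described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `ProteinScore` record, the "higher score is better" convention, the tie handling (first entry seen is kept), and the decoys-over-targets FDR estimate are all assumptions made here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProteinScore:
    """Hypothetical record: one score for a target or decoy protein entry."""
    protein_id: str   # shared by a target and its decoy counterpart
    is_decoy: bool
    score: float      # assumed convention: higher is better


def pick_winners(scores):
    """For each target/decoy pair keep only the higher-scoring member.

    Ties keep whichever entry was seen first (an arbitrary choice).
    """
    best = {}
    for s in scores:
        current = best.get(s.protein_id)
        if current is None or s.score > current.score:
            best[s.protein_id] = s
    return list(best.values())


def picked_fdr(scores):
    """Rank the picked winners by score and estimate FDR at each rank.

    FDR is estimated as (#decoys so far) / (#targets so far), a common
    target-decoy estimate that is assumed here, not taken from the abstract.
    """
    ranked = sorted(pick_winners(scores), key=lambda s: s.score, reverse=True)
    rows, targets, decoys = [], 0, 0
    for s in ranked:
        if s.is_decoy:
            decoys += 1
        else:
            targets += 1
        rows.append((s, decoys / max(targets, 1)))
    return rows
```

    Because each pair contributes at most one entry to the ranked list, a decoy can only count against the estimate when it outscores its own target, which is the property the abstract credits with avoiding decoy over-representation in very large data sets.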

  1. A Scalable Approach for Protein False Discovery Rate Estimation in Large Proteomic Data Sets

    PubMed Central

    Savitski, Mikhail M.; Wilhelm, Mathias; Hahne, Hannes; Kuster, Bernhard; Bantscheff, Marcus

    2015-01-01

    Calculating the number of confidently identified proteins and estimating false discovery rate (FDR) is a challenge when analyzing very large proteomic data sets such as entire human proteomes. Biological and technical heterogeneity in proteomic experiments further add to the challenge and there are strong differences in opinion regarding the conceptual validity of a protein FDR and no consensus regarding the methodology for protein FDR determination. There are also limitations inherent to the widely used classic target–decoy strategy that particularly show when analyzing very large data sets and that lead to a strong over-representation of decoy identifications. In this study, we investigated the merits of the classic, as well as a novel target–decoy-based protein FDR estimation approach, taking advantage of a heterogeneous data collection comprised of ∼19,000 LC-MS/MS runs deposited in ProteomicsDB (https://www.proteomicsdb.org). The “picked” protein FDR approach treats target and decoy sequences of the same protein as a pair rather than as individual entities and chooses either the target or the decoy sequence depending on which receives the highest score. We investigated the performance of this approach in combination with q-value based peptide scoring to normalize sample-, instrument-, and search engine-specific differences. The “picked” target–decoy strategy performed best when protein scoring was based on the best peptide q-value for each protein yielding a stable number of true positive protein identifications over a wide range of q-value thresholds. We show that this simple and unbiased strategy eliminates a conceptual issue in the commonly used “classic” protein FDR approach that causes overprediction of false-positive protein identification in large data sets. The approach scales from small to very large data sets without losing performance, consistently increases the number of true-positive protein identifications and is readily implemented in proteomics analysis software. PMID:25987413

  2. pGlyco 2.0 enables precision N-glycoproteomics with comprehensive quality control and one-step mass spectrometry for intact glycopeptide identification.

    PubMed

    Liu, Ming-Qi; Zeng, Wen-Feng; Fang, Pan; Cao, Wei-Qian; Liu, Chao; Yan, Guo-Quan; Zhang, Yang; Peng, Chao; Wu, Jian-Qiang; Zhang, Xiao-Jin; Tu, Hui-Jun; Chi, Hao; Sun, Rui-Xiang; Cao, Yong; Dong, Meng-Qiu; Jiang, Bi-Yun; Huang, Jiang-Ming; Shen, Hua-Li; Wong, Catherine C L; He, Si-Min; Yang, Peng-Yuan

    2017-09-05

    The precise and large-scale identification of intact glycopeptides is a critical step in glycoproteomics. Owing to the complexity of glycosylation, the current overall throughput, data quality and accessibility of intact glycopeptide identification lag behind those in routine proteomic analyses. Here, we propose a workflow for the precise high-throughput identification of intact N-glycopeptides at the proteome scale using stepped-energy fragmentation and a dedicated search engine. pGlyco 2.0 conducts comprehensive quality control including false discovery rate evaluation at all three levels of matches to glycans, peptides and glycopeptides, improving the current level of accuracy of intact glycopeptide identification. The N-glycoproteome of samples metabolically labeled with 15N/13C was analyzed quantitatively and utilized to validate the glycopeptide identification, which could be used as a novel benchmark pipeline to compare different search engines. Finally, we report a large-scale glycoproteome dataset consisting of 10,009 distinct site-specific N-glycans on 1988 glycosylation sites from 955 glycoproteins in five mouse tissues. Protein glycosylation is a heterogeneous post-translational modification that generates greater proteomic diversity that is difficult to analyze. Here the authors describe pGlyco 2.0, a workflow for the precise one-step identification of intact N-glycopeptides at the proteome scale.

  3. Recognizing millions of consistently unidentified spectra across hundreds of shotgun proteomics datasets

    PubMed Central

    Griss, Johannes; Perez-Riverol, Yasset; Lewis, Steve; Tabb, David L.; Dianes, José A.; del-Toro, Noemi; Rurik, Marc; Walzer, Mathias W.; Kohlbacher, Oliver; Hermjakob, Henning; Wang, Rui; Vizcaíno, Juan Antonio

    2016-01-01

    Mass spectrometry (MS) is the main technology used in proteomics approaches. However, on average 75% of spectra analysed in an MS experiment remain unidentified. We propose to use spectrum clustering at a large scale to shed light on these unidentified spectra. PRoteomics IDEntifications database (PRIDE) Archive is one of the largest MS proteomics public data repositories worldwide. By clustering all tandem MS spectra publicly available in PRIDE Archive, coming from hundreds of datasets, we were able to consistently characterize three distinct groups of spectra: 1) incorrectly identified spectra, 2) spectra correctly identified but below the set scoring threshold, and 3) truly unidentified spectra. Using a multitude of complementary analysis approaches, we were able to identify less than 20% of the consistently unidentified spectra. The complete spectrum clustering results are available through the new version of the PRIDE Cluster resource (http://www.ebi.ac.uk/pride/cluster). This resource is intended, among other aims, to encourage and simplify further investigation into these unidentified spectra. PMID:27493588

  4. Recognizing millions of consistently unidentified spectra across hundreds of shotgun proteomics datasets.

    PubMed

    Griss, Johannes; Perez-Riverol, Yasset; Lewis, Steve; Tabb, David L; Dianes, José A; Del-Toro, Noemi; Rurik, Marc; Walzer, Mathias W; Kohlbacher, Oliver; Hermjakob, Henning; Wang, Rui; Vizcaíno, Juan Antonio

    2016-08-01

    Mass spectrometry (MS) is the main technology used in proteomics approaches. However, on average 75% of spectra analysed in an MS experiment remain unidentified. We propose to use spectrum clustering at a large-scale to shed a light on these unidentified spectra. PRoteomics IDEntifications database (PRIDE) Archive is one of the largest MS proteomics public data repositories worldwide. By clustering all tandem MS spectra publicly available in PRIDE Archive, coming from hundreds of datasets, we were able to consistently characterize three distinct groups of spectra: 1) incorrectly identified spectra, 2) spectra correctly identified but below the set scoring threshold, and 3) truly unidentified spectra. Using a multitude of complementary analysis approaches, we were able to identify less than 20% of the consistently unidentified spectra. The complete spectrum clustering results are available through the new version of the PRIDE Cluster resource (http://www.ebi.ac.uk/pride/cluster). This resource is intended, among other aims, to encourage and simplify further investigation into these unidentified spectra.

  5. A Review: Proteomics in Retinal Artery Occlusion, Retinal Vein Occlusion, Diabetic Retinopathy and Acquired Macular Disorders

    PubMed Central

    Cehofski, Lasse Jørgensen; Honoré, Bent; Vorum, Henrik

    2017-01-01

    Retinal artery occlusion (RAO), retinal vein occlusion (RVO), diabetic retinopathy (DR) and age-related macular degeneration (AMD) are frequent ocular diseases with potentially sight-threatening outcomes. In the present review we discuss major findings of proteomic studies of RAO, RVO, DR and AMD, including an overview of ocular proteome changes associated with anti-vascular endothelial growth factor (VEGF) treatments. Despite the severe outcomes of RAO, the proteome of the disease remains largely unstudied. There is also limited knowledge about the proteome of RVO, but proteomic studies suggest that RVO is associated with remodeling of the extracellular matrix and adhesion processes. Proteomic studies of DR have resulted in the identification of potential therapeutic targets such as carbonic anhydrase-I. Proliferative diabetic retinopathy is the most intensively studied stage of DR. Proteomic studies have established VEGF, pigment epithelium-derived factor (PEDF) and complement components as key factors associated with AMD. The aim of this review is to highlight the major milestones in proteomics in RAO, RVO, DR and AMD. Through large-scale protein analyses, proteomics is bringing new important insights into these complex pathological conditions. PMID:28452939

  6. Content Is King: Databases Preserve the Collective Information of Science.

    PubMed

    Yates, John R

    2018-04-01

    Databases store sequence information experimentally gathered to create resources that further science. In the last 20 years databases have become critical components of fields like proteomics where they provide the basis for large-scale and high-throughput proteomic informatics. Amos Bairoch, winner of the Association of Biomolecular Resource Facilities Frederick Sanger Award, has created some of the important databases proteomic research depends upon for accurate interpretation of data.

  7. The peripheral blood proteome signature of idiopathic pulmonary fibrosis is distinct from normal and is associated with novel immunological processes.

    PubMed

    O'Dwyer, David N; Norman, Katy C; Xia, Meng; Huang, Yong; Gurczynski, Stephen J; Ashley, Shanna L; White, Eric S; Flaherty, Kevin R; Martinez, Fernando J; Murray, Susan; Noth, Imre; Arnold, Kelly B; Moore, Bethany B

    2017-04-25

    Idiopathic pulmonary fibrosis (IPF) is a progressive and fatal interstitial pneumonia. The disease pathophysiology is poorly understood and the etiology remains unclear. Recent advances have generated new therapies and improved knowledge of the natural history of IPF. These gains have been brokered by advances in technology and improved insight into the role of various genes in mediating disease, but gene expression and protein levels do not always correlate. Thus, in this paper we apply a novel large scale high throughput aptamer approach to identify more than 1100 proteins in the peripheral blood of well-characterized IPF patients and normal volunteers. We use systems biology approaches to identify a unique IPF proteome signature and give insight into biological processes driving IPF. We found IPF plasma to be altered and enriched for proteins involved in defense response, wound healing and protein phosphorylation when compared to normal human plasma. Analysis also revealed a minimal protein signature that differentiated IPF patients from normal controls, which may allow for accurate diagnosis of IPF based on easily-accessible peripheral blood. This report introduces large scale unbiased protein discovery analysis to IPF and describes distinct biological processes that further inform disease biology.

  8. Recent advances in methods for the analysis of protein O-glycosylation at proteome level.

    PubMed

    You, Xin; Qin, Hongqiang; Ye, Mingliang

    2018-01-01

    O-Glycosylation, which refers to the glycosylation of the hydroxyl group of side chains of Serine/Threonine/Tyrosine residues, is one of the most common post-translational modifications. Compared with N-linked glycosylation, O-glycosylation is less explored because of its complex structure and relatively low abundance. Recently, O-glycosylation has drawn more and more attention for its various functions in many sophisticated biological processes. To obtain a deep understanding of O-glycosylation, many efforts have been devoted to develop effective strategies to analyze the two most abundant types of O-glycosylation, i.e. O-N-acetylgalactosamine and O-N-acetylglucosamine glycosylation. In this review, we summarize the proteomics workflows to analyze these two types of O-glycosylation. For the large-scale analysis of mucin-type glycosylation, the glycan simplification strategies including the "SimpleCell" technology were introduced. A variety of enrichment methods including lectin affinity chromatography, hydrophilic interaction chromatography, hydrazide chemistry, and chemoenzymatic method were introduced for the proteomics analysis of O-N-acetylgalactosamine and O-N-acetylglucosamine glycosylation. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  9. How many proteins can be identified in a 2DE gel spot within an analysis of a complex human cancer tissue proteome?

    PubMed

    Zhan, Xianquan; Yang, Haiyan; Peng, Fang; Li, Jianglin; Mu, Yun; Long, Ying; Cheng, Tingting; Huang, Yuda; Li, Zhao; Lu, Miaolong; Li, Na; Li, Maoyu; Liu, Jianping; Jungblut, Peter R

    2018-04-01

    Two-dimensional gel electrophoresis (2DE) in proteomics is traditionally assumed to contain only one or two proteins in each 2DE spot. However, 2DE resolution is being complemented by the rapid development of high sensitivity mass spectrometers. Here we compared MALDI-MS, LC-Q-TOF MS and LC-Orbitrap Velos MS for the identification of proteins within one spot. With LC-Orbitrap Velos MS each Coomassie Blue-stained 2DE spot contained an average of at least 42 and 63 proteins/spot in an analysis of a human glioblastoma proteome and a human pituitary adenoma proteome, respectively, if a single gel spot was analyzed. If a pool of three matched gel spots was analyzed this number further increased up to an average of 230 and 118 proteins/spot for glioblastoma and pituitary adenoma proteome, respectively. Multiple proteins per spot confirm the necessity of isotopic labeling in large-scale quantification of different protein species in a proteome. Furthermore, a protein abundance analysis revealed that most of the identified proteins in each analyzed 2DE spot were low-abundance proteins. Many proteins were present in several of the analyzed spots, showing the ability of 2DE-MS to separate at the protein species level. Therefore, 2DE coupled with high-sensitivity LC-MS has a clearly higher sensitivity than expected until now to detect, identify and quantify low-abundance proteins in a complex human proteome with an estimated resolution of about 500 000 protein species. This clearly exceeds the resolution power of bottom-up LC-MS investigations. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  10. A Community Standard Format for the Representation of Protein Affinity Reagents*

    PubMed Central

    Gloriam, David E.; Orchard, Sandra; Bertinetti, Daniela; Björling, Erik; Bongcam-Rudloff, Erik; Borrebaeck, Carl A. K.; Bourbeillon, Julie; Bradbury, Andrew R. M.; de Daruvar, Antoine; Dübel, Stefan; Frank, Ronald; Gibson, Toby J.; Gold, Larry; Haslam, Niall; Herberg, Friedrich W.; Hiltke, Tara; Hoheisel, Jörg D.; Kerrien, Samuel; Koegl, Manfred; Konthur, Zoltán; Korn, Bernhard; Landegren, Ulf; Montecchi-Palazzi, Luisa; Palcy, Sandrine; Rodriguez, Henry; Schweinsberg, Sonja; Sievert, Volker; Stoevesandt, Oda; Taussig, Michael J.; Ueffing, Marius; Uhlén, Mathias; van der Maarel, Silvère; Wingren, Christer; Woollard, Peter; Sherman, David J.; Hermjakob, Henning

    2010-01-01

    Protein affinity reagents (PARs), most commonly antibodies, are essential reagents for protein characterization in basic research, biotechnology, and diagnostics as well as the fastest growing class of therapeutics. Large numbers of PARs are available commercially; however, their quality is often uncertain. In addition, currently available PARs cover only a fraction of the human proteome, and their cost is prohibitive for proteome scale applications. This situation has triggered several initiatives involving large scale generation and validation of antibodies, for example the Swedish Human Protein Atlas and the German Antibody Factory. Antibodies targeting specific subproteomes are being pursued by members of Human Proteome Organisation (plasma and liver proteome projects) and the United States National Cancer Institute (cancer-associated antigens). ProteomeBinders, a European consortium, aims to set up a resource of consistently quality-controlled protein-binding reagents for the whole human proteome. An ultimate PAR database resource would allow consumers to visit one on-line warehouse and find all available affinity reagents from different providers together with documentation that facilitates easy comparison of their cost and quality. However, in contrast to, for example, nucleotide databases among which data are synchronized between the major data providers, current PAR producers, quality control centers, and commercial companies all use incompatible formats, hindering data exchange. Here we propose Proteomics Standards Initiative (PSI)-PAR as a global community standard format for the representation and exchange of protein affinity reagent data. The PSI-PAR format is maintained by the Human Proteome Organisation PSI and was developed within the context of ProteomeBinders by building on a mature proteomics standard format, PSI-molecular interaction, which is a widely accepted and established community standard for molecular interaction data. 
    Further information and documentation are available on the PSI-PAR web site. PMID:19674966

  11. Quantitative Proteomics Reveals Temporal Proteomic Changes in Signaling Pathways during BV2 Mouse Microglial Cell Activation.

    PubMed

    Woo, Jongmin; Han, Dohyun; Wang, Joseph Injae; Park, Joonho; Kim, Hyunsoo; Kim, Youngsoo

    2017-09-01

    The development of systematic proteomic quantification techniques in systems biology research has enabled one to perform an in-depth analysis of cellular systems. We have developed a systematic proteomic approach that encompasses the spectrum from global to targeted analysis on a single platform. We have applied this technique to an activated microglia cell system to examine changes in the intracellular and extracellular proteomes. Microglia become activated when their homeostatic microenvironment is disrupted. There are varying degrees of microglial activation, and we chose to focus on the proinflammatory reactive state that is induced by exposure to such stimuli as lipopolysaccharide (LPS) and interferon-gamma (IFN-γ). Using an improved shotgun proteomics approach, we identified 5497 proteins in the whole-cell proteome and 4938 proteins in the secretome that were associated with the activation of BV2 mouse microglia by LPS or IFN-γ. Of the differentially expressed proteins in stimulated microglia, we classified pathways that were related to immune-inflammatory responses and metabolism. Our label-free parallel reaction monitoring (PRM) approach made it possible to comprehensively measure the hyper-multiplex quantitative value of each protein by high-resolution mass spectrometry. Over 450 peptides that corresponded to pathway proteins and direct or indirect interactors via the STRING database were quantified by label-free PRM in a single run. Moreover, we performed a longitudinal quantification of secreted proteins during microglial activation, in which neurotoxic molecules that mediate neuronal cell loss in the brain are released. These data suggest that latent pathways that are associated with neurodegenerative diseases can be discovered by constructing and analyzing a pathway network model of proteins. Furthermore, this systematic quantification platform has tremendous potential for applications in large-scale targeted analyses. 
    The proteomics data for discovery and label-free PRM analysis have been deposited to the ProteomeXchange Consortium with identifiers and , respectively.

  12. Assessing signal-to-noise in quantitative proteomics: multivariate statistical analysis in DIGE experiments.

    PubMed

    Friedman, David B

    2012-01-01

    All quantitative proteomics experiments measure variation between samples. When performing large-scale experiments that involve multiple conditions or treatments, the experimental design should include the appropriate number of individual biological replicates from each condition to enable the distinction between a relevant biological signal from technical noise. Multivariate statistical analyses, such as principal component analysis (PCA), provide a global perspective on experimental variation, thereby enabling the assessment of whether the variation describes the expected biological signal or the unanticipated technical/biological noise inherent in the system. Examples will be shown from high-resolution multivariable DIGE experiments where PCA was instrumental in demonstrating biologically significant variation as well as sample outliers, fouled samples, and overriding technical variation that would not be readily observed using standard univariate tests.

  13. Molecular phenotype of zebrafish ovarian follicle by serial analysis of gene expression and proteomic profiling, and comparison with the transcriptomes of other animals

    PubMed Central

    Knoll-Gellida, Anja; André, Michèle; Gattegno, Tamar; Forgue, Jean; Admon, Arie; Babin, Patrick J

    2006-01-01

    Background The ability of an oocyte to develop into a viable embryo depends on the accumulation of specific maternal information and molecules, such as RNAs and proteins. A serial analysis of gene expression (SAGE) was carried out in parallel with proteomic analysis on fully-grown ovarian follicles from zebrafish (Danio rerio). The data obtained were compared with ovary/follicle/egg molecular phenotypes of other animals, published or available in public sequence databases. Results Sequencing of 27,486 SAGE tags identified 11,399 different ones, including 3,329 tags with an occurrence superior to one. Fifty-eight genes were expressed at over 0.15% of the total population and represented 17.34% of the mRNA population identified. The three most expressed transcripts were a rhamnose-binding lectin, beta-actin 2, and a transcribed locus similar to the H2B histone family. Comparison with the large-scale expressed sequence tags sequencing approach revealed highly expressed transcripts that were not previously known to be expressed at high levels in fish ovaries, like the short-sized polarized metallothionein 2 transcript. A higher sensitivity for the detection of transcripts with a characterized maternal genetic contribution was also demonstrated compared to large-scale sequencing of cDNA libraries. Ferritin heavy polypeptide 1, heat shock protein 90-beta, lactate dehydrogenase B4, beta-actin isoforms, tubulin beta 2, ATP synthase subunit 9, together with 40 S ribosomal protein S27a, were common highly-expressed transcripts of vertebrate ovary/unfertilized egg. Comparison of transcriptome and proteome data revealed that transcript levels provide little predictive value with respect to the extent of protein abundance. All the proteins identified by proteomic analysis of fully-grown zebrafish follicles had at least one transcript counterpart, with two exceptions: eosinophil chemotactic cytokine and nothepsin. 
    Conclusion This study provides a complete sequence data set of maternal mRNA stored in zebrafish germ cells at the end of oogenesis. This catalogue contains highly-expressed transcripts that are part of a vertebrate ovarian expressed gene signature. Comparison of transcriptome and proteome data identified downregulated transcripts or proteins potentially incorporated in the oocyte by endocytosis. The molecular phenotype described provides groundwork for future experimental approaches aimed at identifying functionally important stored maternal transcripts and proteins involved in oogenesis and early stages of embryo development. PMID:16526958
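
    The SAGE tag counts reported in the Results can be turned into a few derived figures. The sketch below only restates the arithmetic under stated assumptions: that the 0.15% cutoff and the 17.34% share are both relative to the 27,486 sequenced tags, and that "occurrence superior to one" means distinct tags seen at least twice. The function name sage_summary is hypothetical.

```python
def sage_summary(total_tags: int, distinct_tags: int, multi_copy_tags: int,
                 cutoff_fraction: float, top_share: float) -> dict:
    """Derive simple figures from SAGE tag counts (illustrative helper)."""
    return {
        # Distinct tags observed exactly once.
        "singleton_tags": distinct_tags - multi_copy_tags,
        # Tag count corresponding to the abundance cutoff (assumed to be
        # a fraction of all sequenced tags).
        "cutoff_tag_count": cutoff_fraction * total_tags,
        # Approximate number of sequenced tags covered by the top genes.
        "top_gene_tag_count": top_share * total_tags,
    }


# Figures quoted in the abstract; their interpretation is an assumption.
summary = sage_summary(total_tags=27_486, distinct_tags=11_399,
                       multi_copy_tags=3_329,
                       cutoff_fraction=0.0015, top_share=0.1734)
```

    Under these assumptions, a gene expressed at over 0.15% of the population would be represented by roughly 41 or more tags.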

  14. Proteomics Analysis of the Causative Agent of Typhoid Fever

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ansong, Charles; Yoon, Hyunjin; Norbeck, Angela D.

    2008-02-01

    Typhoid fever is a potentially fatal disease caused by the bacterial pathogen Salmonella enterica serovar Typhi (S. typhi). S. typhi infection is a complex process that involves numerous bacterially-encoded virulence determinants, and these are thought to confer both stringent human host specificity and a high mortality rate. In the present study we used a liquid chromatography-mass spectrometry (LC-MS) based proteomics strategy to investigate the proteome of logarithmic, stationary phase, and low pH/low magnesium (MgM) S. typhi cultures. This represents the first large scale comprehensive characterization of the S. typhi proteome. Our analysis identified a total of 2066 S. typhi proteins. In an effort to identify putative S. typhi-specific virulence factors, we then compared our S. typhi results to those obtained in a previously published study of the S. typhimurium proteome under similar conditions (Adkins J.N. et al (2006) Mol Cell Prot). Comparative proteomic analysis of S. typhi (strain Ty2) and S. typhimurium (strains LT2 and 14028) revealed a subset of highly expressed proteins unique to S. typhi that were exclusively detected under conditions that mimic the infective state in macrophage cells. These proteins included CdtB, HlyE, and a conserved protein encoded by t1476. The differential expression of selected proteins was confirmed by Western blot analysis. Taken together with the current literature, our observations suggest that this subset of proteins may play a role in S. typhi pathogenesis and human host specificity. In addition, we observed products of the biotin (bio) operon displayed a higher abundance in the more virulent strains S. typhi-Ty2 and S. typhimurium-14028 compared to the virulence attenuated S. typhimurium strain LT2, suggesting bio proteins may contribute to Salmonella pathogenesis.

  15. Single-cell-type quantitative proteomic and ionomic analysis of epidermal bladder cells from the halophyte model plant Mesembryanthemum crystallinum to identify salt-responsive proteins.

    PubMed

    Barkla, Bronwyn J; Vera-Estrella, Rosario; Raymond, Carolyn

    2016-05-10

    Epidermal bladder cells (EBC) are large single-celled, specialized, and modified trichomes found on the aerial parts of the halophyte Mesembryanthemum crystallinum. Recent development of a simple but high throughput technique to extract the contents from these cells has provided an opportunity to conduct detailed single-cell-type analyses of their molecular characteristics at high resolution to gain insight into the role of these cells in the salt tolerance of the plant. In this study, we carry out large-scale complementary quantitative proteomic studies using both a label (DIGE) and label-free (GeLC-MS) approach to identify salt-responsive proteins in the EBC extract. Additionally we perform an ionomics analysis (ICP-MS) to follow changes in the amounts of 27 different elements. Using these methods, we were able to identify 54 proteins and nine elements that showed statistically significant changes in the EBC from salt-treated plants. GO enrichment analysis identified a large number of transport proteins but also proteins involved in photosynthesis, primary metabolism and Crassulacean acid metabolism (CAM). Validation of results by western blot, confocal microscopy and enzyme analysis helped to strengthen findings and further our understanding into the role of these specialized cells. As expected EBC accumulated large quantities of sodium, however, the most abundant element was chloride suggesting the sequestration of this ion into the EBC vacuole is just as important for salt tolerance. This single-cell type omics approach shows that epidermal bladder cells of M. crystallinum are metabolically active modified trichomes, with primary metabolism supporting cell growth, ion accumulation, compatible solute synthesis and CAM. Data are available via ProteomeXchange with identifier PXD004045.

  16. Background | Office of Cancer Clinical Proteomics Research

    Cancer.gov

    The term "proteomics" refers to a large-scale comprehensive study of a specific proteome resulting from its genome, including abundances of proteins, their variations and modifications, and interacting partners and networks in order to understand cellular processes involved.  Similarly, “Cancer proteomics” refers to comprehensive analyses of proteins and their derivatives translated from a specific cancer genome using a human biospecimen or a preclinical model (e.g., cultured cell or animal model).

  17. Large-scale identification of target proteins of a glycosyltransferase isozyme by Lectin-IGOT-LC/MS, an LC/MS-based glycoproteomic approach

    PubMed Central

    Sugahara, Daisuke; Kaji, Hiroyuki; Sugihara, Kazushi; Asano, Masahide; Narimatsu, Hisashi

    2012-01-01

    Model organisms containing deletion or mutation in a glycosyltransferase-gene exhibit various physiological abnormalities, suggesting that specific glycan motifs on certain proteins play important roles in vivo. Identification of the target proteins of glycosyltransferase isozymes is the key to understand the roles of glycans. Here, we demonstrated the proteome-scale identification of the target proteins specific for a glycosyltransferase isozyme, β1,4-galactosyltransferase-I (β4GalT-I). Although β4GalT-I is the most characterized glycosyltransferase, its distinctive contribution to β1,4-galactosylation has been hardly described so far. We identified a large number of candidates for the target proteins specific to β4GalT-I by comparative analysis of β4GalT-I-deleted and wild-type mice using the LC/MS-based technique with the isotope-coded glycosylation site-specific tagging (IGOT) of lectin-captured N-glycopeptides. Our approach to identify the target proteins in a proteome-scale offers common features and trends in the target proteins, which facilitate understanding of the mechanism that controls assembly of a particular glycan motif on specific proteins. PMID:23002422

  18. A Proteomics Approach to the Protein Normalization Problem: Selection of Unvarying Proteins for MS-Based Proteomics and Western Blotting.

    PubMed

    Wiśniewski, Jacek R; Mann, Matthias

    2016-07-01

    Proteomics and other protein-based analysis methods such as Western blotting all face the challenge of discriminating changes in the levels of proteins of interest from inadvertent changes in the amount loaded for analysis. Mass-spectrometry-based proteomics can now estimate the relative and absolute amounts of thousands of proteins across diverse biological systems. We reasoned that this new technology could prove useful for selection of very stably expressed proteins that could serve as better loading controls than those traditionally employed. Large-scale proteomic analyses of SDS lysates of cultured cells and tissues revealed deglycase DJ-1 as the protein with the lowest variability in abundance among different cell types in human, mouse, and amphibian cells. The protein constitutes 0.069 ± 0.017% of total cellular protein and occurs at a specific concentration of 34.6 ± 8.7 pmol/mg of total protein. Since DJ-1 is ubiquitous and therefore easily detectable with several peptides, it can be helpful in normalization of proteomic data sets. In addition, DJ-1 appears to be an advantageous loading control for Western blot that is superior to those commonly used, allowing comparisons between tissues and cells originating from evolutionarily distant vertebrate species. Notably, this is not possible by the detection and quantitation of housekeeping proteins, which are often used in the Western blot technique. The approach introduced here can be applied to select the most appropriate loading controls for MS-based proteomics or Western blotting in any biological system.
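
    The two DJ-1 figures quoted above (0.069% of total protein by mass and 34.6 pmol/mg) can be related by a unit conversion. The sketch below is a consistency check only, not the authors' calculation: the molar mass of about 19,891 Da for human DJ-1 is an assumption not stated in the abstract, and the function name mass_percent_to_pmol_per_mg is hypothetical.

```python
# Assumed molar mass of human DJ-1 in g/mol (not given in the abstract).
DJ1_MOLAR_MASS_DA = 19_891.0


def mass_percent_to_pmol_per_mg(mass_percent: float, molar_mass_da: float) -> float:
    """Convert a mass fraction (% of total protein) to pmol per mg total protein."""
    # Grams of the protein contained in 1 mg (1e-3 g) of total protein.
    grams_per_mg_total = (mass_percent / 100.0) * 1e-3
    # Moles = mass / molar mass; 1 mol = 1e12 pmol.
    return grams_per_mg_total / molar_mass_da * 1e12


dj1_pmol_per_mg = mass_percent_to_pmol_per_mg(0.069, DJ1_MOLAR_MASS_DA)
```

    With the assumed molar mass, the result is close to 34.7 pmol/mg, in line with the reported 34.6 ± 8.7 pmol/mg.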

  19. A Matter of Time: Faster Percolator Analysis via Efficient SVM Learning for Large-Scale Proteomics.

    PubMed

    Halloran, John T; Rocke, David M

    2018-05-04

    Percolator is an important tool for greatly improving the results of a database search and subsequent downstream analysis. Using support vector machines (SVMs), Percolator recalibrates peptide-spectrum matches based on the learned decision boundary between targets and decoys. To improve analysis time for large-scale data sets, we update Percolator's SVM learning engine through software and algorithmic optimizations rather than heuristic approaches that necessitate the careful study of their impact on learned parameters across different search settings and data sets. We show that by optimizing Percolator's original learning algorithm, l2-SVM-MFN, large-scale SVM learning requires nearly only a third of the original runtime. Furthermore, we show that by employing the widely used Trust Region Newton (TRON) algorithm instead of l2-SVM-MFN, large-scale Percolator SVM learning is reduced to nearly only a fifth of the original runtime. Importantly, these speedups only affect the speed at which Percolator converges to a global solution and do not alter recalibration performance. The upgraded versions of both l2-SVM-MFN and TRON are optimized within the Percolator codebase for multithreaded and single-thread use and are available under Apache license at bitbucket.org/jthalloran/percolator_upgrade.

  20. Advances in Proteomics Data Analysis and Display Using an Accurate Mass and Time Tag Approach

    PubMed Central

    Zimmer, Jennifer S.D.; Monroe, Matthew E.; Qian, Wei-Jun; Smith, Richard D.

    2007-01-01

    Proteomics has recently demonstrated utility in understanding cellular processes on the molecular level as a component of systems biology approaches and for identifying potential biomarkers of various disease states. The large amount of data generated by utilizing high efficiency (e.g., chromatographic) separations coupled to high mass accuracy mass spectrometry for high-throughput proteomics analyses presents challenges related to data processing, analysis, and display. This review focuses on recent advances in nanoLC-FTICR-MS-based proteomics approaches and the accompanying data processing tools that have been developed to display and interpret the large volumes of data being produced. PMID:16429408

  1. The emergence of top-down proteomics in clinical research

    PubMed Central

    2013-01-01

    Proteomic technology has advanced steadily since the development of 'soft-ionization' techniques for mass-spectrometry-based molecular identification more than two decades ago. Now, the large-scale analysis of proteins (proteomics) is a mainstay of biological research and clinical translation, with researchers seeking molecular diagnostics, as well as protein-based markers for personalized medicine. Proteomic strategies using the protease trypsin (known as bottom-up proteomics) were the first to be developed and optimized and form the dominant approach at present. However, researchers are now beginning to understand the limitations of bottom-up techniques, namely the inability to characterize and quantify intact protein molecules from a complex mixture of digested peptides. To overcome these limitations, several laboratories are taking a whole-protein-based approach, in which intact protein molecules are the analytical targets for characterization and quantification. We discuss these top-down techniques and how they have been applied to clinical research and are likely to be applied in the near future. Given the recent improvements in mass-spectrometry-based proteomics and stronger cooperation between researchers, clinicians and statisticians, both peptide-based (bottom-up) strategies and whole-protein-based (top-down) strategies are set to complement each other and help researchers and clinicians better understand and detect complex disease phenotypes. PMID:23806018

  2. An object model and database for functional genomics.

    PubMed

    Jones, Andrew; Hunt, Ela; Wastling, Jonathan M; Pizarro, Angel; Stoeckert, Christian J

    2004-07-10

    Large-scale functional genomics analysis is now feasible and presents significant challenges in data analysis, storage and querying. Data standards are required to enable the development of public data repositories and to improve data sharing. There is an established data format for microarrays (microarray gene expression markup language, MAGE-ML) and a draft standard for proteomics (PEDRo). We believe that all types of functional genomics experiments should be annotated in a consistent manner, and we hope to open up new ways of comparing multiple datasets used in functional genomics. We have created a functional genomics experiment object model (FGE-OM), developed from the microarray model, MAGE-OM and two models for proteomics, PEDRo and our own model (Gla-PSI-Glasgow Proposal for the Proteomics Standards Initiative). FGE-OM comprises three namespaces representing (i) the parts of the model common to all functional genomics experiments; (ii) microarray-specific components; and (iii) proteomics-specific components. We believe that FGE-OM should initiate discussion about the contents and structure of the next version of MAGE and the future of proteomics standards. A prototype database called RNA And Protein Abundance Database (RAPAD), based on FGE-OM, has been implemented and populated with data from microbial pathogenesis. FGE-OM and the RAPAD schema are available from http://www.gusdb.org/fge.html, along with a set of more detailed diagrams. RAPAD can be accessed by registration at the site.

  3. Activity-based protein profiling for biochemical pathway discovery in cancer

    PubMed Central

    Nomura, Daniel K.; Dix, Melissa M.; Cravatt, Benjamin F.

    2011-01-01

    Large-scale profiling methods have uncovered numerous gene and protein expression changes that correlate with tumorigenesis. However, determining the relevance of these expression changes and which biochemical pathways they affect has been hindered by our incomplete understanding of the proteome and its myriad functions and modes of regulation. Activity-based profiling platforms enable both the discovery of cancer-relevant enzymes and selective pharmacological probes to perturb and characterize these proteins in tumour cells. When integrated with other large-scale profiling methods, activity-based proteomics can provide insight into the metabolic and signalling pathways that support cancer pathogenesis and illuminate new strategies for disease diagnosis and treatment. PMID:20703252

  4. Application of Large-Scale Aptamer-Based Proteomic Profiling to Planned Myocardial Infarctions.

    PubMed

    Jacob, Jaison; Ngo, Debby; Finkel, Nancy; Pitts, Rebecca; Gleim, Scott; Benson, Mark D; Keyes, Michelle J; Farrell, Laurie A; Morgan, Thomas; Jennings, Lori L; Gerszten, Robert E

    2018-03-20

    Emerging proteomic technologies using novel affinity-based reagents allow for efficient multiplexing with high-sample throughput. To identify early biomarkers of myocardial injury, we recently applied an aptamer-based proteomic profiling platform that measures 1129 proteins to samples from patients undergoing septal alcohol ablation for hypertrophic cardiomyopathy, a human model of planned myocardial injury. Here, we examined the scalability of this approach using a markedly expanded platform to study a far broader range of human proteins in the context of myocardial injury. We applied a highly multiplexed, expanded proteomic technique that uses single-stranded DNA aptamers to assay 4783 human proteins (4137 distinct human gene targets) to derivation and validation cohorts of planned myocardial injury, individuals with spontaneous myocardial infarction, and at-risk controls. We found 376 target proteins that significantly changed in the blood after planned myocardial injury in a derivation cohort (n=20; P <1.05E-05, 1-way repeated measures analysis of variance, Bonferroni threshold). Two hundred forty-seven of these proteins were validated in an independent planned myocardial injury cohort (n=15; P <1.33E-04, 1-way repeated measures analysis of variance); >90% were directionally consistent and reached nominal significance in the validation cohort. Among the validated proteins that were increased within 1 hour after planned myocardial injury, 29 were also elevated in patients with spontaneous myocardial infarction (n=63; P <6.17E-04). Many of the novel markers identified in our study are intracellular proteins not previously identified in the peripheral circulation or have functional roles relevant to myocardial injury. 
    For example, the cardiac LIM protein, cysteine- and glycine-rich protein 3, is thought to mediate cardiac mechanotransduction and stress responses, whereas the mitochondrial ATP synthase F0 subunit component is a vasoactive peptide on its release from cells. Last, we performed aptamer-affinity enrichment coupled with mass spectrometry to technically verify aptamer specificity for a subset of the new biomarkers. Our results demonstrate the feasibility of large-scale aptamer multiplexing at a level that has not previously been reported and with sample throughput that greatly exceeds other existing proteomic methods. The expanded aptamer-based proteomic platform provides a unique opportunity for biomarker and pathway discovery after myocardial injury. © 2017 American Heart Association, Inc.
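
    The significance thresholds quoted in this abstract can be checked with a few lines of arithmetic. The sketch below assumes a family-wise alpha of 0.05, which the abstract does not state; under that assumption the derivation threshold matches 0.05/4783 and the validation threshold matches 0.05/376. The test count of 81 for the spontaneous myocardial infarction comparison is an inference from the reported threshold, not a number given in the abstract. The function name is hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# alpha = 0.05 is an assumption. 4783 and 376 are stated in the abstract;
# 81 is inferred from the reported threshold of 6.17E-04.
for label, n_tests in [
    ("derivation cohort", 4783),
    ("validation cohort", 376),
    ("spontaneous MI (inferred test count)", 81),
]:
    print(f"{label}: P < {bonferroni_threshold(0.05, n_tests):.2E}")
```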

  5. Evolutionary conservation of the polyproline II conformation surrounding intrinsically disordered phosphorylation sites.

    PubMed

    Elam, W Austin; Schrank, Travis P; Campagnolo, Andrew J; Hilser, Vincent J

    2013-04-01

    Intrinsically disordered (ID) proteins function in the absence of a unique stable structure and appear to challenge the classic structure-function paradigm. The extent to which ID proteins take advantage of subtle conformational biases to perform functions, and whether signals for such a mechanism can be identified in proteome-wide studies, is not well understood. Of particular interest is the polyproline II (PII) conformation, suggested to be highly populated in unfolded proteins. We experimentally determine a complete calorimetric propensity scale for the PII conformation. Projection of the scale into representative eukaryotic proteomes reveals significant PII bias in regions coding for ID proteins. Importantly, enrichment of PII in ID proteins, or protein segments, is also captured by other PII scales, indicating that this enrichment is robustly encoded and universally detectable regardless of the method of PII propensity determination. Gene ontology (GO) terms obtained using our PII scale and other scales demonstrate a consensus for molecular functions performed by high PII proteins across the proteome. Perhaps the most striking result of the GO analysis is conserved enrichment (P < 10^-8) of phosphorylation sites in high PII regions found by all PII scales. Subsequent conformational analysis reveals a phosphorylation-dependent modulation of PII, suggestive of a conserved "tunability" within these regions. In summary, the application of an experimentally determined polyproline II (PII) propensity scale to proteome-wide sequence analysis and gene ontology reveals an enrichment of PII bias near disordered phosphorylation sites that is conserved throughout eukaryotes. Copyright © 2013 The Protein Society.

  6. Large-Scale Proteome Comparative Analysis of Developing Rhizomes of the Ancient Vascular Plant Equisetum hyemale

    PubMed Central

    Balbuena, Tiago Santana; He, Ruifeng; Salvato, Fernanda; Gang, David R.; Thelen, Jay J.

    2012-01-01

    Horsetail (Equisetum hyemale) is a widespread vascular plant species, whose reproduction is mainly dependent on the growth and development of the rhizomes. Due to its key evolutionary position, the identification of factors that could be involved in the existence of the rhizomatous trait may contribute to a better understanding of the role of this underground organ for the successful propagation of this and other plant species. In the present work, we characterized the proteome of E. hyemale rhizomes using a GeLC-MS spectral-counting proteomics strategy. A total of 1,911 and 1,860 non-redundant proteins were identified in the rhizome apical tip and elongation zone, respectively. Rhizome-characteristic proteins were determined by comparisons of the developing rhizome tissues to developing roots. A total of 87 proteins were found to be up-regulated in both horsetail rhizome tissues in relation to developing roots. Hierarchical clustering indicated a vast dynamic range in the regulation of the 87 characteristic proteins and revealed, based on the regulation profile, the existence of nine major protein groups. Gene ontology analyses suggested an over-representation of the terms involved in macromolecular and protein biosynthetic processes, gene expression, and nucleotide and protein binding functions. Spatial difference analysis between the rhizome apical tip and the elongation zone revealed that only eight proteins were up-regulated in the apical tip including RNA-binding proteins and an acyl carrier protein, as well as a KH domain protein and a T-complex subunit; while only seven proteins were up-regulated in the elongation zone including phosphomannomutase, galactomannan galactosyltransferase, endoglucanase 10 and 25, and mannose-1-phosphate guanyltransferase subunits alpha and beta. This is the first large-scale characterization of the proteome of a plant rhizome. Implications of the findings were discussed in relation to other underground organs and related species. PMID:22740841

  7. Rescuing discarded spectra: Full comprehensive analysis of a minimal proteome.

    PubMed

    Lluch-Senar, Maria; Mancuso, Francesco M; Climente-González, Héctor; Peña-Paz, Marcia I; Sabido, Eduard; Serrano, Luis

    2016-02-01

    A common problem encountered when performing large-scale MS proteome analysis is the loss of information due to the high percentage of unassigned spectra. To determine the causes behind this loss we have analyzed the proteome of one of the smallest living bacteria that can be grown axenically, Mycoplasma pneumoniae (729 ORFs). The proteome of M. pneumoniae cells, grown in defined media, was analyzed by MS. An initial search with both Mascot and a species-specific NCBInr database with common contaminants (NCBImpn) resulted in around 79% of the acquired spectra not having an assignment. The percentage of non-assigned spectra was reduced to 27% after re-analysis of the data with the PEAKS software, thereby increasing the proteome coverage of M. pneumoniae from the initial 60% to over 76%. Nonetheless, 33,413 spectra with assigned amino acid sequences could not be mapped to any NCBInr database protein sequence. Approximately 1% of these unassigned peptides corresponded to PTMs and 4% to M. pneumoniae protein variants (deamidation and translation inaccuracies). The most abundant peptide sequence variants (Phe-Tyr and Ala-Ser) could be explained by alterations in the editing capacity of the corresponding tRNA synthases. About another 1% of the peptides not associated with any protein had repetitions of the same aromatic/hydrophobic amino acid at the N-terminus, or had Arg/Lys at the C-terminus. Thus, in a model system, we have maximized the number of assigned spectra to 73% (51,453 out of the 70,040 initial acquired spectra). All MS data have been deposited in the ProteomeXchange with identifier PXD002779 (http://proteomecentral.proteomexchange.org/dataset/PXD002779). © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
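
    The spectrum counts reported above are internally consistent, which a short calculation makes explicit. This is only a bookkeeping sketch using the two counts given in the abstract (51,453 assigned out of 70,040 acquired); the helper name is hypothetical.

```python
def percent(part: int, total: int) -> float:
    """Share of `part` in `total`, expressed as a percentage."""
    return 100.0 * part / total


acquired = 70_040   # initial acquired spectra, from the abstract
assigned = 51_453   # spectra assigned after re-analysis, from the abstract
unassigned = acquired - assigned

print(f"assigned:   {percent(assigned, acquired):.1f}%")
print(f"unassigned: {percent(unassigned, acquired):.1f}% ({unassigned} spectra)")
```

    Rounded to whole percentages, these give the 73% assigned and 27% non-assigned figures quoted in the abstract.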

  8. Identification of Phosphorylated Proteins on a Global Scale.

    PubMed

    Iliuk, Anton

    2018-05-31

    Liquid chromatography (LC) coupled with tandem mass spectrometry (MS/MS) has enabled researchers to analyze complex biological samples with unprecedented depth. It facilitates the identification and quantification of modifications within thousands of proteins in a single large-scale proteomic experiment. Analysis of phosphorylation, one of the most common and important post-translational modifications, has particularly benefited from such progress in the field. Here, detailed protocols are provided for a few well-regarded, common sample preparation methods for an effective phosphoproteomic experiment. © 2018 by John Wiley & Sons, Inc.

  9. CPTC and NIST-sponsored Yeast Reference Material Now Publicly Available | Office of Cancer Clinical Proteomics Research

    Cancer.gov

    The yeast protein extract (RM8323) developed by National Institute of Standards and Technology (NIST) under the auspices of NCI's CPTC initiative is currently available to the public at https://www-s.nist.gov/srmors/view_detail.cfm?srm=8323. The yeast proteome offers researchers a unique biological reference material. RM8323 is the most extensively characterized complex biological proteome and the only one associated with several large-scale studies to estimate protein abundance across a wide concentration range.

  10. Guidelines for reporting quantitative mass spectrometry based experiments in proteomics.

    PubMed

    Martínez-Bartolomé, Salvador; Deutsch, Eric W; Binz, Pierre-Alain; Jones, Andrew R; Eisenacher, Martin; Mayer, Gerhard; Campos, Alex; Canals, Francesc; Bech-Serra, Joan-Josep; Carrascal, Montserrat; Gay, Marina; Paradela, Alberto; Navajas, Rosana; Marcilla, Miguel; Hernáez, María Luisa; Gutiérrez-Blázquez, María Dolores; Velarde, Luis Felipe Clemente; Aloria, Kerman; Beaskoetxea, Jabier; Medina-Aunon, J Alberto; Albar, Juan P

    2013-12-16

    Mass spectrometry is already a well-established protein identification tool, and recent methodological and technological developments have also made possible the extraction of quantitative data on protein abundance in large-scale studies. Several strategies for absolute and relative quantitative proteomics and the statistical assessment of quantifications are possible, each involving specific measurements and, therefore, different data analysis workflows. The guidelines for Mass Spectrometry Quantification allow the description of a wide range of quantitative approaches, including labeled and label-free techniques and also targeted approaches such as Selected Reaction Monitoring (SRM). The HUPO Proteomics Standards Initiative (HUPO-PSI) has invested considerable efforts to improve the standardization of proteomics data handling, representation and sharing through the development of data standards, reporting guidelines, controlled vocabularies and tooling. In this manuscript, we describe a key output from the HUPO-PSI, namely the MIAPE Quant guidelines, which have been developed in parallel with the corresponding data exchange format mzQuantML [1]. The MIAPE Quant guidelines describe the HUPO-PSI proposal concerning the minimum information to be reported when a quantitative data set, derived from mass spectrometry (MS), is submitted to a database or as supplementary information to a journal. The guidelines have been developed with input from a broad spectrum of stakeholders in the proteomics field to represent a true consensus view of the most important data types and metadata required for a quantitative experiment to be analyzed critically or a data analysis pipeline to be reproduced. It is anticipated that they will influence or be directly adopted as part of journal guidelines for publication and by public proteomics databases and thus may have an impact on proteomics laboratories across the world. This article is part of a Special Issue entitled: Standardization and Quality Control. Copyright © 2013 Elsevier B.V. All rights reserved.

  11. Science, marketing and wishful thinking in quantitative proteomics.

    PubMed

    Hackett, Murray

    2008-11-01

    In a recent editorial (J. Proteome Res. 2007, 6, 1633) and elsewhere questions have been raised regarding the lack of attention paid to good analytical practice with respect to the reporting of quantitative results in proteomics. Using those comments as a starting point, several issues are discussed that relate to the challenges involved in achieving adequate sampling with MS-based methods in order to generate valid data for large-scale studies. The discussion touches on the relationships that connect sampling depth and the power to detect protein abundance change, conflict of interest, and strategies to overcome bureaucratic obstacles that impede the use of peer-to-peer technologies for transfer and storage of large data files generated in such experiments.

  12. Evolution of complete proteomes: guanine-cytosine pressure, phylogeny and environmental influences blend the proteomic architecture

    PubMed Central

    2013-01-01

    Background: Guanine-cytosine (GC) composition is an important feature of genomes. Likewise, amino acid composition is a distinct, but less valued, feature of proteomes. A major concern is that it is not clear what valuable information can be acquired from amino acid composition data. To address this concern, in-depth analyses of the amino acid composition of the complete proteomes from 63 archaea, 270 bacteria, and 128 eukaryotes were performed. Results: Principal component analysis of the amino acid matrices showed that the main contributors to proteomic architecture were genomic GC variation, phylogeny, and environmental influences. GC pressure drove positive selection on Ala, Arg, Gly, Pro, Trp, and Val, and adverse selection on Asn, Lys, Ile, Phe, and Tyr. The physico-chemical framework of the complete proteomes withstood GC pressure by frequency complementation of GC-dependent amino acid pairs with similar physico-chemical properties. Gln, His, Ser, and Val were responsible for phylogeny and their constituted components could differentiate archaea, bacteria, and eukaryotes. Environmental niche was also a significant factor in determining proteomic architecture, especially for archaea for which the main amino acids were Cys, Leu, and Thr. In archaea, hyperthermophiles, acidophiles, mesophiles, psychrophiles, and halophiles gathered successively along the environment-based principal component. Concordance between proteomic architecture and the genetic code was also related closely to genomic GC content, phylogeny, and lifestyles. Conclusions: Large-scale analyses of the complete proteomes of a wide range of organisms suggested that amino acid composition retained the trace of GC variation, phylogeny, and environmental influences during evolution. The findings from this study will help in the development of a global understanding of proteome evolution, and even biological evolution. PMID:24088322

  13. Cloud-based solution to identify statistically significant MS peaks differentiating sample categories.

    PubMed

    Ji, Jun; Ling, Jeffrey; Jiang, Helen; Wen, Qiaojun; Whitin, John C; Tian, Lu; Cohen, Harvey J; Ling, Xuefeng B

    2013-03-23

    Mass spectrometry (MS) has evolved to become the primary high-throughput tool for proteomics-based biomarker discovery. Multiple challenges in protein MS data analysis nevertheless remain: management of large-scale, complex data sets; MS peak identification and indexing; and high-dimensional differential peak analysis together with the false discovery rate (FDR) derived from the concurrent statistical tests. "Turnkey" solutions are needed for biomarker investigations to rapidly process MS data sets and identify statistically significant peaks for subsequent validation. Here we present an efficient and effective solution that gives experimental biologists easy access to "cloud" computing capabilities for analyzing MS data. The web portal can be accessed at http://transmed.stanford.edu/ssa/. The web application supports online uploading and analysis of large-scale MS data through a simple user interface. This bioinformatic tool will facilitate the discovery of potential protein biomarkers using MS.
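
    The abstract does not say which FDR procedure the portal uses. As a minimal illustration of how an FDR criterion can select significant peaks from many concurrent tests, the sketch below applies the standard Benjamini-Hochberg step-up procedure; that choice, the function name, and the example p-values are all assumptions made for illustration and are not taken from the tool.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], fdr: float = 0.05) -> List[bool]:
    """Flag each p-value as significant (True) or not, in input order,
    using the Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Everything ranked at or below the cutoff is declared significant.
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant


# Hypothetical per-peak p-values from a differential test between two groups.
peak_p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(peak_p_values))
```

    With these made-up values only the first two peaks pass at an FDR of 0.05, even though five of them have p-values below 0.05 individually.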

  14. Nano-LC FTICR tandem mass spectrometry for top-down proteomics: routine baseline unit mass resolution of whole cell lysate proteins up to 72 kDa.

    PubMed

    Tipton, Jeremiah D; Tran, John C; Catherman, Adam D; Ahlf, Dorothy R; Durbin, Kenneth R; Lee, Ji Eun; Kellie, John F; Kelleher, Neil L; Hendrickson, Christopher L; Marshall, Alan G

    2012-03-06

    Current high-throughput top-down proteomic platforms provide routine identification of proteins less than 25 kDa with 4-D separations. This short communication reports the application of technological developments over the past few years that improve protein identification and characterization for masses greater than 25 kDa. Advances in separation science have allowed increased numbers of proteins to be identified, especially by nanoliquid chromatography (nLC) prior to mass spectrometry (MS) analysis. Further, a goal of high-throughput top-down proteomics is to extend the mass range for routine nLC MS analysis up to 80 kDa because gene sequence analysis predicts that ~70% of the human proteome is transcribed to be less than 80 kDa. Normally, large proteins greater than 50 kDa are identified and characterized by top-down proteomics through fraction collection and direct infusion at relatively low throughput. Further, other MS-based techniques provide top-down protein characterization, however at low resolution for intact mass measurement. Here, we present analysis of standard (up to 78 kDa) and whole cell lysate proteins by Fourier transform ion cyclotron resonance mass spectrometry (nLC electrospray ionization (ESI) FTICR MS). The separation platform reduced the complexity of the protein matrix so that, at 14.5 T, proteins from whole cell lysate up to 72 kDa are baseline mass resolved on a nano-LC chromatographic time scale. Further, the results document routine identification of proteins at improved throughput based on accurate mass measurement (less than 10 ppm mass error) of precursor and fragment ions for proteins up to 50 kDa.
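
    To give a sense of scale for the "less than 10 ppm mass error" criterion, the relative mass error in parts per million is (measured - theoretical) / theoretical × 10^6. The sketch below uses that standard definition; the function names and the example masses are hypothetical and are not taken from the paper.

```python
def mass_error_ppm(measured: float, theoretical: float) -> float:
    """Relative mass error in parts per million (ppm)."""
    return (measured - theoretical) / theoretical * 1e6


def tolerance_da(mass: float, ppm: float) -> float:
    """Absolute mass window in daltons corresponding to a ppm tolerance."""
    return mass * ppm / 1e6


# For a 50 kDa (50,000 Da) protein, a 10 ppm tolerance spans 0.5 Da.
print(tolerance_da(50_000.0, 10.0))
# A hypothetical measurement 0.25 Da above the theoretical mass is about 5 ppm off.
print(mass_error_ppm(50_000.25, 50_000.0))
```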

  15. Systematic Analysis of Compositional Order of Proteins Reveals New Characteristics of Biological Functions and a Universal Correlate of Macroevolution

    PubMed Central

    Persi, Erez; Horn, David

    2013-01-01

    We present a novel analysis of compositional order (CO) based on the occurrence of Frequent amino-acid Triplets (FTs) that appear much more than random in protein sequences. The method captures all types of proteomic compositional order including single amino-acid runs, tandem repeats, periodic structure of motifs and otherwise low complexity amino-acid regions. We introduce new order measures, distinguishing between ‘regularity’, ‘periodicity’ and ‘vocabulary’, to quantify these phenomena and to facilitate the identification of evolutionary effects. Detailed analysis of representative species across the tree-of-life demonstrates that CO proteins exhibit numerous functional enrichments, including a wide repertoire of particular patterns of dependencies on regularity and periodicity. Comparison between human and mouse proteomes further reveals the interplay of CO with evolutionary trends, such as faster substitution rate in mouse leading to decrease of periodicity, while innovation along the human lineage leads to larger regularity. Large-scale analysis of 94 proteomes leads to systematic ordering of all major taxonomic groups according to FT-vocabulary size. This is measured by the count of Different Frequent Triplets (DFT) in proteomes. The latter provides a clear hierarchical delineation of vertebrates, invertebrates, plants, fungi and prokaryotes, with thermophiles showing the lowest level of FT-vocabulary. Among eukaryotes, this ordering correlates with phylogenetic proximity. Interestingly, in all kingdoms CO accumulation in the proteome has universal characteristics. We suggest that CO is a genomic-information correlate of both macroevolution and various protein functions. The results indicate a mechanism of genomic ‘innovation’ at the peptide level, involved in protein elongation, shaped in a universal manner by mutational and selective forces. PMID:24278003

  16. Age distribution of human gene families shows significant roles of both large- and small-scale duplications in vertebrate evolution.

    PubMed

    Gu, Xun; Wang, Yufeng; Gu, Jianying

    2002-06-01

    The classical (two-round) hypothesis of vertebrate genome duplication proposes two successive whole-genome duplication(s) (polyploidizations) predating the origin of fishes, a view now being seriously challenged. As the debate largely concerns the relative merits of the 'big-bang mode' theory (large-scale duplication) and the 'continuous mode' theory (constant creation by small-scale duplications), we tested whether a significant proportion of paralogous genes in the contemporary human genome was indeed generated in the early stage of vertebrate evolution. After an extensive search of major databases, we dated 1,739 gene duplication events from the phylogenetic analysis of 749 vertebrate gene families. We found a pattern characterized by two waves (I, II) and an ancient component. Wave I represents a recent gene family expansion by tandem or segmental duplications, whereas wave II, a rapid paralogous gene increase in the early stage of vertebrate evolution, supports the idea of genome duplication(s) (the big-bang mode). Further analysis indicated that large- and small-scale gene duplications both make a significant contribution during the early stage of vertebrate evolution to build the current hierarchy of the human proteome.

  17. Proteomics and circadian rhythms: It’s all about signaling!

    PubMed Central

    Mauvoisin, Daniel; Dayon, Loïc; Gachon, Frédéric; Kussmann, Martin

    2014-01-01

    Proteomic technologies using mass spectrometry (MS) offer new perspectives in circadian biology, in particular the possibility to study posttranslational modifications (PTMs). To date, only very few studies have been carried out to decipher the rhythmicity of protein expression in mammals with large-scale proteomics. Although signaling has been shown to be of high relevance, comprehensive characterization studies of PTMs are even rarer. This review aims at describing the current landscape of circadian proteomics and the opportunities and challenges appearing on the horizon. Emphasis is given to signaling processes for their role in metabolic health as regulated by circadian clocks and environmental factors. Those signaling processes are expected to be better and more deeply characterized in the coming years with proteomics. PMID:25103677

  18. Quantitative proteomics reveals a role of JAZ7 in plant defense response to Pseudomonas syringae DC3000.

    PubMed

    Zhang, Tong; Meng, Li; Kong, Wenwen; Yin, Zepeng; Wang, Yang; Schneider, Jacqueline D; Chen, Sixue

    2018-03-20

    Jasmonate ZIM-domain (JAZ) proteins are key transcriptional repressors regulating various biological processes. Although many studies have studied JAZ proteins by genetic and biochemical analyses, little is known about JAZ7-associated global protein networks and how JAZ7 contributes to bacterial pathogen defense. In this study, we aim to fill this knowledge gap by conducting unbiased large-scale quantitative proteomics using tandem mass tags (TMT). We compared the proteomes of a JAZ7 knock-out line, a JAZ7 overexpression line, as well as the wild type Arabidopsis plants in the presence and absence of Pseudomonas syringae DC3000 infection. Both pairwise comparison and multi-factor analysis of variance reveal that differential proteins are enriched in biological processes such as primary and secondary metabolism, redox regulation, and response to stress. The differential regulation in these pathways may account for the alterations in plant size, redox homeostasis and accumulation of glucosinolates. In addition, possible interplay between genotype and environment is suggested as the abundance of seven proteins is influenced by the interaction of the two factors. Collectively, we demonstrate a role of JAZ7 in pathogen defense and provide a list of proteins that are uniquely responsive to genetic disruption, pathogen infection, or the interaction between genotypes and environmental factors. We report proteomic changes as a result of genetic perturbation of JAZ7, and the contribution of JAZ7 in plant immunity. Specifically, the similarity between the proteomes of a JAZ7 knockout mutant and the wild type plants confirmed the functional redundancy of JAZs. In contrast, JAZ7 overexpression plants were much different, and proteomic analysis of the JAZ7 overexpression plants under Pst DC3000 infection revealed that JAZ7 may regulate plant immunity via ROS modulation, energy balance and glucosinolate biosynthesis. 
    Multivariate analysis for this two-factor proteomics experiment suggests that protein abundance is determined by genotype, environment and the interaction between them. Copyright © 2018 Elsevier B.V. All rights reserved.

  19. Advances in targeted proteomics and applications to biomedical research

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shi, Tujin; Song, Ehwang; Nie, Song

    Targeted proteomics has emerged as a powerful protein quantification tool in systems biology and biomedical research, and increasingly in clinical applications. The most widely used targeted proteomics approach, selected reaction monitoring (SRM), also known as multiple reaction monitoring (MRM), can be used for quantification of cellular signaling networks and preclinical verification of candidate protein biomarkers. As an extension to our previous review on advances in SRM sensitivity (Shi et al., Proteomics, 12, 1074–1092, 2012), herein we review recent advances in the method and technology for further enhancing SRM sensitivity (from 2012 to present) and highlight its broad biomedical applications in human bodily fluids, tissue and cell lines. Furthermore, we also review two recently introduced targeted proteomics approaches, parallel reaction monitoring (PRM) and data-independent acquisition (DIA) with targeted data extraction on fast scanning high-resolution accurate-mass (HR/AM) instruments. Such HR/AM targeted quantification, which monitors all target product ions, effectively addresses SRM limitations in specificity and multiplexing; however, compared to SRM, PRM and DIA are still in their infancy, with a limited number of applications. Thus, for HR/AM targeted quantification we focus our discussion on method development, data processing and analysis, and its advantages and limitations in targeted proteomics. Finally, general perspectives on the potential of achieving both high sensitivity and high sample throughput for large-scale quantification of hundreds of target proteins are discussed.

  20. Potential for proteomic approaches in determining efficacy biomarkers following administration of fish oils rich in omega-3 fatty acids: application in pancreatic cancers.

    PubMed

    Runau, Franscois; Arshad, Ali; Isherwood, John; Norris, Leonie; Howells, Lynne; Metcalfe, Matthew; Dennison, Ashley

    2015-06-01

    Pancreatic cancer is a disease with a significantly poor prognosis. Despite modern advances in other medical, surgical, and oncologic therapy, the outcome from pancreatic cancer has improved little over the last 40 years. To improve the management of this difficult disease, trials investigating the use of dietary and parenteral fish oils rich in omega-3 (ω-3) fatty acids, exhibiting proven anti-inflammatory and anticarcinogenic properties, have revealed favorable results in pancreatic cancers. Proteomics is the large-scale study of proteins that attempts to characterize the complete set of proteins encoded by the genome of an organism and that, with the use of sensitive mass spectrometric-based techniques, has allowed high-throughput analysis of the proteome to aid identification of putative biomarkers pertinent to given disease states. These biomarkers provide useful insight into potentially discovering new markers for early detection or elucidating the efficacy of treatment on pancreatic cancers. Here, our review identifies potential proteomic-based biomarkers in pancreatic cancer relating to apoptosis, cell proliferation, angiogenesis, and metabolic regulation in clinical studies. We also reviewed proteomic biomarkers from the administration of ω-3 fatty acids that act on similar anticarcinogenic pathways as above and reflect that proteomic studies on the effect of ω-3 fatty acids in pancreatic cancer will yield favorable results. © 2015 American Society for Parenteral and Enteral Nutrition.

  1. Proteomic insights into floral biology.

    PubMed

    Li, Xiaobai; Jackson, Aaron; Xie, Ming; Wu, Dianxing; Tsai, Wen-Chieh; Zhang, Sheng

    2016-08-01

    The flower is the most important biological structure for ensuring angiosperm reproductive success. Not only does the flower contain critical reproductive organs, but the wide variation in morphology, color, and scent has evolved to entice specialized pollinators, and arguably mankind in many cases, to ensure the successful propagation of its species. Recent proteomic approaches have identified protein candidates related to these flower traits, which has shed light on a number of previously unknown mechanisms underlying these traits. This review article provides a comprehensive overview of the latest advances in proteomic research in floral biology according to the order of flower structure, from corolla to male and female reproductive organs. It summarizes mainstream proteomic methods for plant research and recent improvements in two-dimensional gel electrophoresis and gel-free workflows for both peptide-level and protein-level analysis. The recent advances in sequencing technologies provide a new paradigm for the ever-increasing genome and transcriptome information on many organisms. It is now possible to integrate genomic and transcriptomic data with proteomic results for large-scale protein characterization, so that a global understanding of the complex molecular networks in flower biology can be readily achieved. This article is part of a Special Issue entitled: Plant Proteomics--a bridge between fundamental processes and crop production, edited by Dr. Hans-Peter Mock. Copyright © 2016 Elsevier B.V. All rights reserved.

  2. Use of proteomic methods in the analysis of human body fluids in Alzheimer research.

    PubMed

    Zürbig, Petra; Jahn, Holger

    2012-12-01

    Proteomics is the study of the entire population of proteins and peptides in an organism or a part of it, such as a cell, tissue, or fluids like cerebrospinal fluid, plasma, serum, urine, or saliva. It is widely assumed that changes in the composition of the proteome may reflect disease states and provide clues to its origin, eventually leading to targets for new treatments. The ability to perform large-scale proteomic studies now is based jointly on recent advances in our analytical methods. Separation techniques like CE and 2DE have developed and matured. Detection methods like MS have also improved greatly in the last 5 years. These developments have also driven the fields of bioinformatics, needed to deal with the increased data production and systems biology. All these developing methods offer specific advantages but also come with certain limitations. This review describes the different proteomic methods used in the field, their limitations, and their possible pitfalls. Based on a literature search in PubMed, we identified 112 studies that applied proteomic techniques to identify biomarkers for Alzheimer disease. This review describes the results of these studies on proteome changes in human body fluids of Alzheimer patients reviewing the most important studies. We extracted a list of 366 proteins and peptides that were identified by these studies as potential targets in Alzheimer research. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  3. Proteomic analysis reveals O-GlcNAc modification on proteins with key regulatory functions in Arabidopsis.

    PubMed

    Xu, Shou-Ling; Chalkley, Robert J; Maynard, Jason C; Wang, Wenfei; Ni, Weimin; Jiang, Xiaoyue; Shin, Kihye; Cheng, Ling; Savage, Dasha; Hühmer, Andreas F R; Burlingame, Alma L; Wang, Zhi-Yong

    2017-02-21

    Genetic studies have shown essential functions of O-linked N-acetylglucosamine (O-GlcNAc) modification in plants. However, the proteins and sites subject to this posttranslational modification are largely unknown. Here, we report a large-scale proteomic identification of O-GlcNAc-modified proteins and sites in the model plant Arabidopsis thaliana. Using lectin weak affinity chromatography to enrich modified peptides, followed by mass spectrometry, we identified 971 O-GlcNAc-modified peptides belonging to 262 proteins. The modified proteins are involved in cellular regulatory processes, including transcription, translation, epigenetic gene regulation, and signal transduction. Many proteins have functions in developmental and physiological processes specific to plants, such as hormone responses and flower development. Mass spectrometric analysis of phosphopeptides from the same samples showed that a large number of peptides could be modified by either O-GlcNAcylation or phosphorylation, but cooccurrence of the two modifications in the same peptide molecule was rare. Our study generates a snapshot of the O-GlcNAc modification landscape in plants, indicating functions in many cellular regulation pathways and providing a powerful resource for further dissecting these functions at the molecular level.

  4. PSEA-Quant: a protein set enrichment analysis on label-free and label-based protein quantification data.

    PubMed

    Lavallée-Adam, Mathieu; Rauniyar, Navin; McClatchy, Daniel B; Yates, John R

    2014-12-05

    The majority of large-scale proteomics quantification methods yield long lists of quantified proteins that are often difficult to interpret and poorly reproduced. Computational approaches are required to analyze such intricate quantitative proteomics data sets. We propose a statistical approach to computationally identify protein sets (e.g., Gene Ontology (GO) terms) that are significantly enriched with abundant proteins with reproducible quantification measurements across a set of replicates. To this end, we developed PSEA-Quant, a protein set enrichment analysis algorithm for label-free and label-based protein quantification data sets. It offers an alternative approach to classic GO analyses, models protein annotation biases, and allows the analysis of samples originating from a single condition, unlike analogous approaches such as GSEA and PSEA. We demonstrate that PSEA-Quant produces results complementary to GO analyses. We also show that PSEA-Quant provides valuable information about the biological processes involved in cystic fibrosis using label-free protein quantification of a cell line expressing a CFTR mutant. Finally, PSEA-Quant highlights the differences in the mechanisms taking place in the human, rat, and mouse brain frontal cortices based on tandem mass tag quantification. Our approach, which is available online, will thus improve the analysis of proteomics quantification data sets by providing meaningful biological insights.
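
    The idea of testing whether a protein set is enriched in abundant, reproducibly quantified proteins can be illustrated with a minimal Python sketch. This is not the PSEA-Quant statistic, which the abstract does not specify; the scoring rule (mean abundance penalized by the coefficient of variation), the permutation test, and all function and parameter names are assumptions made for illustration only.

```python
import random
import statistics


def protein_score(replicates):
    """Reward high mean abundance and penalize variation across replicates."""
    mean = statistics.mean(replicates)
    cv = statistics.pstdev(replicates) / mean if mean > 0 else 0.0
    return mean / (1.0 + cv)


def enrichment_p_value(abundances, protein_set, n_permutations=2000, seed=0):
    """Empirical p-value that a set's mean score exceeds that of random
    sets of the same size drawn from all quantified proteins."""
    scores = {p: protein_score(r) for p, r in abundances.items()}
    members = [p for p in protein_set if p in scores]
    if not members:
        raise ValueError("no set member was quantified")
    observed = statistics.mean(scores[p] for p in members)
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    all_proteins = sorted(scores)
    hits = 0
    for _ in range(n_permutations):
        sample = rng.sample(all_proteins, len(members))
        if statistics.mean(scores[p] for p in sample) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_permutations + 1)
```

    Because the score combines abundance and reproducibility within one condition, a sketch like this needs only replicate measurements of a single sample type, which mirrors the single-condition use case the abstract mentions.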

  5. PSEA-Quant: A Protein Set Enrichment Analysis on Label-Free and Label-Based Protein Quantification Data

    PubMed Central

    2015-01-01

    The majority of large-scale proteomics quantification methods yield long lists of quantified proteins that are often difficult to interpret and poorly reproduced. Computational approaches are required to analyze such intricate quantitative proteomics data sets. We propose a statistical approach to computationally identify protein sets (e.g., Gene Ontology (GO) terms) that are significantly enriched with abundant proteins with reproducible quantification measurements across a set of replicates. To this end, we developed PSEA-Quant, a protein set enrichment analysis algorithm for label-free and label-based protein quantification data sets. It offers an alternative approach to classic GO analyses, models protein annotation biases, and allows the analysis of samples originating from a single condition, unlike analogous approaches such as GSEA and PSEA. We demonstrate that PSEA-Quant produces results complementary to GO analyses. We also show that PSEA-Quant provides valuable information about the biological processes involved in cystic fibrosis using label-free protein quantification of a cell line expressing a CFTR mutant. Finally, PSEA-Quant highlights the differences in the mechanisms taking place in the human, rat, and mouse brain frontal cortices based on tandem mass tag quantification. Our approach, which is available online, will thus improve the analysis of proteomics quantification data sets by providing meaningful biological insights. PMID:25177766

  6. TRDistiller: a rapid filter for enrichment of sequence datasets with proteins containing tandem repeats.

    PubMed

    Richard, François D; Kajava, Andrey V

    2014-06-01

    The dramatic growth of sequencing data evokes an urgent need to improve bioinformatics tools for large-scale proteome analysis. Over the last two decades, the foremost efforts of computer scientists were devoted to proteins with aperiodic sequences having globular 3D structures. However, a large portion of proteins contain periodic sequences representing arrays of repeats that are directly adjacent to each other (so called tandem repeats or TRs). These proteins frequently fold into elongated fibrous structures carrying different fundamental functions. Algorithms specific to the analysis of these regions are urgently required since the conventional approaches developed for globular domains have had limited success when applied to the TR regions. The protein TRs are frequently not perfect, containing a number of mutations, and some of them cannot be easily identified. To detect such "hidden" repeats several algorithms have been developed. However, the most sensitive among them are time-consuming and, therefore, inappropriate for large scale proteome analysis. To speed up the TR detection we developed a rapid filter that is based on the comparison of composition and order of short strings in the adjacent sequence motifs. Tests show that our filter discards up to 22.5% of proteins which are known to be without TRs while keeping almost all (99.2%) TR-containing sequences. Thus, we are able to decrease the size of the initial sequence dataset enriching it with TR-containing proteins which allows a faster subsequent TR detection by other methods. The program is available upon request. Copyright © 2014 Elsevier Inc. All rights reserved.
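
    The filtering idea described above, comparing the composition and order of short strings in adjacent sequence segments, can be illustrated with a minimal Python sketch. This is not the TRDistiller algorithm, whose details the abstract does not give; the function names, window length, k-mer size, and threshold are all hypothetical choices.

```python
def kmers(segment, k):
    """Return the ordered list of overlapping k-mers in a segment."""
    return [segment[i:i + k] for i in range(len(segment) - k + 1)]


def adjacent_similarity(seq, start, window, k):
    """Fraction of k-mer positions that match between two adjacent windows.

    The windows [start, start+window) and [start+window, start+2*window)
    are compared position by position, so both the composition and the
    order of the short strings matter.
    """
    left = kmers(seq[start:start + window], k)
    right = kmers(seq[start + window:start + 2 * window], k)
    if not left or len(left) != len(right):
        return 0.0
    matches = sum(1 for a, b in zip(left, right) if a == b)
    return matches / len(left)


def may_contain_tandem_repeat(seq, window=6, k=2, threshold=0.6):
    """Hypothetical pre-filter: keep a sequence if any pair of adjacent
    windows shares at least `threshold` of its ordered k-mers."""
    last_start = len(seq) - 2 * window
    return any(
        adjacent_similarity(seq, s, window, k) >= threshold
        for s in range(last_start + 1)
    )
```

    A cheap test of this kind would only discard sequences before a slower, more sensitive repeat detector is run; a single fixed window length, as assumed here, would miss repeats whose unit length does not divide the window.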

  7. Quantitative proteomic analysis reveals posttranslational responses to aneuploidy in yeast

    PubMed Central

    Dephoure, Noah; Hwang, Sunyoung; O'Sullivan, Ciara; Dodgson, Stacie E; Gygi, Steven P; Amon, Angelika; Torres, Eduardo M

    2014-01-01

    Aneuploidy causes severe developmental defects and is a near universal feature of tumor cells. Despite its profound effects, the cellular processes affected by aneuploidy are not well characterized. Here, we examined the consequences of aneuploidy on the proteome of aneuploid budding yeast strains. We show that although protein levels largely scale with gene copy number, subunits of multi-protein complexes are notable exceptions. Posttranslational mechanisms attenuate their expression when their encoding genes are in excess. Our proteomic analyses further revealed a novel aneuploidy-associated protein expression signature characteristic of altered metabolism and redox homeostasis. Indeed aneuploid cells harbor increased levels of reactive oxygen species (ROS). Interestingly, increased protein turnover attenuates ROS levels and this novel aneuploidy-associated signature and improves the fitness of most aneuploid strains. Our results show that aneuploidy causes alterations in metabolism and redox homeostasis. Cells respond to these alterations through both transcriptional and posttranscriptional mechanisms. DOI: http://dx.doi.org/10.7554/eLife.03023.001 PMID:25073701

  8. Predicting protein-protein interactions on a proteome scale by matching evolutionary and structural similarities at interfaces using PRISM.

    PubMed

    Tuncbag, Nurcan; Gursoy, Attila; Nussinov, Ruth; Keskin, Ozlem

    2011-08-11

    Prediction of protein-protein interactions at the structural level on the proteome scale is important because it allows prediction of protein function, helps drug discovery and takes steps toward genome-wide structural systems biology. We provide a protocol (termed PRISM, protein interactions by structural matching) for large-scale prediction of protein-protein interactions and assembly of protein complex structures. The method consists of two components: rigid-body structural comparisons of target proteins to known template protein-protein interfaces and flexible refinement using a docking energy function. The PRISM rationale follows our observation that globally different protein structures can interact via similar architectural motifs. PRISM predicts binding residues by using structural similarity and evolutionary conservation of putative binding residue 'hot spots'. Ultimately, PRISM could help to construct cellular pathways and functional, proteome-scale annotation. PRISM is implemented in Python and runs in a UNIX environment. The program accepts Protein Data Bank-formatted protein structures and is available at http://prism.ccbb.ku.edu.tr/prism_protocol/.

  9. Free Flow Zonal Electrophoresis for Fractionation of Plant Membrane Compartments Prior to Proteomic Analysis.

    PubMed

    Barkla, Bronwyn J

    2018-01-01

    Free flow zonal electrophoresis (FFZE) is a versatile, reproducible, and potentially high-throughput technique for the separation of plant organelles and membranes by differences in membrane surface charge. It offers considerable benefits over traditional fractionation techniques, such as density gradient centrifugation and two-phase partitioning, as it is relatively fast, sample recovery is high, and the method provides unparalleled sample purity. It has been used to successfully purify chloroplasts and mitochondria from plants but also, to obtain highly pure fractions of plasma membrane, tonoplast, ER, Golgi, and thylakoid membranes. Application of the technique can significantly improve protein coverage in large-scale proteomics studies by decreasing sample complexity. Here, we describe the method for the fractionation of plant cellular membranes from leaves by FFZE.

  10. Proteome Analysis of Liver Cells Expressing a Full-Length Hepatitis C Virus (HCV) Replicon and Biopsy Specimens of Posttransplantation Liver from HCV-Infected Patients

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jacobs, Jon M.; Diamond, Deborah L.; Chan, Eric Y.

    2005-06-01

    The development of a reproducible model system for the study of Hepatitis C virus (HCV) infection has the potential to significantly enhance the study of virus-host interactions and provide future direction for modeling the pathogenesis of HCV. While there are studies describing global gene expression changes associated with HCV infection, changes in the proteome have not been characterized. We report the first large scale proteome analysis of the highly permissive Huh-7.5 cell line containing a full length HCV replicon. We detected > 4,400 proteins in this cell line, including HCV replicon proteins, using multidimensional liquid chromatographic (LC) separations coupled to mass spectrometry (MS). The set of Huh-7.5 proteins confidently identified is, to our knowledge, the most comprehensive yet reported for a human cell line. Consistent with the literature, a comparison of Huh-7.5 cells (+) and (-) the HCV replicon identified expression changes of proteins involved in lipid metabolism. We extended these analyses to liver biopsy material from HCV-infected patients where > 1,500 proteins were detected from 2 µg protein lysate using the Huh-7.5 protein database and the accurate mass and time (AMT) tag strategy. These findings demonstrate the utility of multidimensional proteome analysis of the HCV replicon model system for assisting the determination of proteins/pathways affected by HCV infection. Our ability to extend these analyses to the highly complex proteome of small liver biopsies with limiting protein yields offers the unique opportunity to begin evaluating the clinical significance of protein expression changes associated with HCV infection.

  11. A Proteomic Analysis of Eccrine Sweat: Implications for the Discovery of Schizophrenia Biomarker Proteins

    PubMed Central

    Raiszadeh, Michelle M.; Ross, Mark M.; Russo, Paul S.; Schaepper, Mary Ann H.; Zhou, Weidong; Deng, Jianghong; Ng, Daniel; Dickson, April; Dickson, Cindy; Strom, Monica; Osorio, Carolina; Soeprono, Thomas; Wulfkuhle, Julia D.; Kabbani, Nadine; Petricoin, Emanuel F.; Liotta, Lance A.; Kirsch, Wolff M.

    2012-01-01

    Liquid chromatography tandem mass spectrometry (LC-MS/MS) and multiple reaction monitoring mass spectrometry (MRM-MS) proteomics analyses were performed on eccrine sweat of healthy controls, and the results were compared with those from individuals diagnosed with schizophrenia (SZ). This is the first large scale study of the sweat proteome. First, we performed LC-MS/MS on pooled SZ samples and pooled control samples for global proteomics analysis. Results revealed a high abundance of diverse proteins and peptides in eccrine sweat. Most of the proteins identified from sweat samples were found to be different than the most abundant proteins from serum, which indicates that eccrine sweat is not simply a plasma transudate, and may thereby be a source of unique disease-associated biomolecules. A second independent set of patient and control sweat samples were analyzed by LC-MS/MS and spectral counting to determine qualitative protein differential abundances between the control and disease groups. Differential abundances of selected proteins, initially determined by spectral counting, were verified by MRM-MS analyses. Seventeen proteins showed a differential abundance of approximately two-fold or greater between the SZ pooled sample and the control pooled sample. This study demonstrates the utility of LC-MS/MS and MRM-MS as a viable strategy for the discovery and verification of potential sweat protein disease biomarkers. PMID:22256890
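
    The spectral-counting comparison described above, flagging proteins whose abundance differs about two-fold or more between pooled samples, can be sketched in Python as a ratio of normalized spectral counts. This is not the authors' pipeline; the protein names, counts, pseudocount, and function names are hypothetical.

```python
def normalized_counts(counts):
    """Scale each protein's spectral count by the sample's total count."""
    total = sum(counts.values())
    return {protein: c / total for protein, c in counts.items()}


def fold_changes(case, control, pseudocount=0.5):
    """Case/control ratio of normalized spectral counts per protein.

    The pseudocount (an assumption of this sketch) avoids division by
    zero for proteins that are missing from one sample.
    """
    proteins = set(case) | set(control)
    case_adj = {p: case.get(p, 0) + pseudocount for p in proteins}
    ctrl_adj = {p: control.get(p, 0) + pseudocount for p in proteins}
    case_norm = normalized_counts(case_adj)
    ctrl_norm = normalized_counts(ctrl_adj)
    return {p: case_norm[p] / ctrl_norm[p] for p in proteins}


def differential(case, control, min_fold=2.0):
    """Proteins whose abundance differs by at least `min_fold` either way."""
    fc = fold_changes(case, control)
    return sorted(p for p, r in fc.items() if r >= min_fold or r <= 1 / min_fold)


# Hypothetical spectral counts for two pooled samples.
example_case = {"protein_A": 40, "protein_B": 10, "protein_C": 50}
example_control = {"protein_A": 10, "protein_B": 10, "protein_C": 80}
print(differential(example_case, example_control))
```

    Candidates selected this way are only provisional; as in the study, they would still need verification by a targeted method such as MRM-MS.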

  12. First Large-Scale Proteogenomic Study of Breast Cancer Provides Insight into Potential Therapeutic Targets | Office of Cancer Clinical Proteomics Research

    Cancer.gov

    News Release: May 25, 2016 — Building on data from The Cancer Genome Atlas (TCGA) project, a multi-institutional team of scientists has completed the first large-scale “proteogenomic” study of breast cancer, linking DNA mutations to protein signaling and helping pinpoint the genes that drive cancer.

  13. Capillary nano-immunoassays: advancing quantitative proteomics analysis, biomarker assessment, and molecular diagnostics.

    PubMed

    Chen, Jin-Qiu; Wakefield, Lalage M; Goldstein, David J

    2015-06-06

    There is an emerging demand for the use of molecular profiling to facilitate biomarker identification and development, and to stratify patients for more efficient treatment decisions with reduced adverse effects. In the past decade, great strides have been made to advance genomic, transcriptomic and proteomic approaches to address these demands. While there has been much progress with these large scale approaches, profiling at the protein level still faces challenges due to limitations in clinical sample size, poor reproducibility, unreliable quantitation, and lack of assay robustness. A novel automated capillary nano-immunoassay (CNIA) technology has been developed. This technology offers precise and accurate measurement of proteins and their post-translational modifications using either charge-based or size-based separation formats. The system not only uses ultralow nanogram levels of protein but also allows multi-analyte analysis using a parallel single-analyte format for increased sensitivity and specificity. The high sensitivity and excellent reproducibility of this technology make it particularly powerful for analysis of clinical samples. Furthermore, the system can distinguish and detect specific protein post-translational modifications that conventional Western blot and other immunoassays cannot easily capture. This review will summarize and evaluate the latest progress to optimize the CNIA system for comprehensive, quantitative protein and signaling event characterization. It will also discuss how the technology has been successfully applied in both discovery research and clinical studies, for signaling pathway dissection, proteomic biomarker assessment, targeted treatment evaluation and quantitative proteomic analysis. Lastly, a comparison of this novel system with other conventional immuno-assay platforms is performed.

  14. Proteomic analysis in type 2 diabetes patients before and after a very low calorie diet reveals potential disease state and intervention specific biomarkers.

    PubMed

    Sleddering, Maria A; Markvoort, Albert J; Dharuri, Harish K; Jeyakar, Skhandhan; Snel, Marieke; Juhasz, Peter; Lynch, Moira; Hines, Wade; Li, Xiaohong; Jazet, Ingrid M; Adourian, Aram; Hilbers, Peter A J; Smit, Johannes W A; Van Dijk, Ko Willems

    2014-01-01

    Very low calorie diets (VLCD) with and without exercise programs lead to major metabolic improvements in obese type 2 diabetes patients. The mechanisms underlying these improvements have so far not been elucidated fully. To further investigate the mechanisms of a VLCD with or without exercise and to uncover possible biomarkers associated with these interventions, blood samples were collected from 27 obese type 2 diabetes patients before and after a 16-week VLCD (Modifast ∼450 kcal/day). Thirteen of these patients followed an exercise program in addition to the VLCD. Plasma was obtained from 27 lean and 27 obese controls as well. Proteomic analysis was performed using mass spectrometry (MS) and targeted multiple reaction monitoring (MRM) and a large scale isobaric tags for relative and absolute quantitation (iTRAQ) approach. After the 16-week VLCD, there was a significant decrease in body weight and HbA1c in all patients, without differences between the two intervention groups. Targeted MRM analysis revealed differences in several proteins, which could be divided in diabetes-associated (fibrinogen, transthyretin), obesity-associated (complement C3), and diet-associated markers (apolipoproteins, especially apolipoprotein A-IV). To further investigate the effects of exercise, large scale iTRAQ analysis was performed. However, no proteins were found showing an exercise effect. Thus, in this study, specific proteins were found to be differentially expressed in type 2 diabetes patients versus controls and before and after a VLCD. These proteins are potential disease state and intervention specific biomarkers. Controlled-Trials.com ISRCTN76920690.

  15. Proteomic Analysis in Type 2 Diabetes Patients before and after a Very Low Calorie Diet Reveals Potential Disease State and Intervention Specific Biomarkers

    PubMed Central

    Dharuri, Harish K.; Jeyakar, Skhandhan; Snel, Marieke; Juhasz, Peter; Lynch, Moira; Hines, Wade; Li, Xiaohong; Jazet, Ingrid M.; Adourian, Aram; Hilbers, Peter A. J.; Smit, Johannes W. A.; Van Dijk, Ko Willems

    2014-01-01

    Very low calorie diets (VLCD) with and without exercise programs lead to major metabolic improvements in obese type 2 diabetes patients. The mechanisms underlying these improvements have so far not been elucidated fully. To further investigate the mechanisms of a VLCD with or without exercise and to uncover possible biomarkers associated with these interventions, blood samples were collected from 27 obese type 2 diabetes patients before and after a 16-week VLCD (Modifast ∼450 kcal/day). Thirteen of these patients followed an exercise program in addition to the VLCD. Plasma was obtained from 27 lean and 27 obese controls as well. Proteomic analysis was performed using mass spectrometry (MS) and targeted multiple reaction monitoring (MRM) and a large scale isobaric tags for relative and absolute quantitation (iTRAQ) approach. After the 16-week VLCD, there was a significant decrease in body weight and HbA1c in all patients, without differences between the two intervention groups. Targeted MRM analysis revealed differences in several proteins, which could be divided in diabetes-associated (fibrinogen, transthyretin), obesity-associated (complement C3), and diet-associated markers (apolipoproteins, especially apolipoprotein A-IV). To further investigate the effects of exercise, large scale iTRAQ analysis was performed. However, no proteins were found showing an exercise effect. Thus, in this study, specific proteins were found to be differentially expressed in type 2 diabetes patients versus controls and before and after a VLCD. These proteins are potential disease state and intervention specific biomarkers. Trial Registration: Controlled-Trials.com ISRCTN76920690. PMID:25415563

  16. GProX, a user-friendly platform for bioinformatics analysis and visualization of quantitative proteomics data.

    PubMed

    Rigbolt, Kristoffer T G; Vanselow, Jens T; Blagoev, Blagoy

    2011-08-01

    Recent technological advances have made it possible to identify and quantify thousands of proteins in a single proteomics experiment. As a result of these developments, the analysis of data has become the bottleneck of proteomics experiments. To provide the proteomics community with a user-friendly platform for comprehensive analysis, inspection and visualization of quantitative proteomics data we developed the Graphical Proteomics Data Explorer (GProX). The program requires no special bioinformatics training, as all functions of GProX are accessible within its graphical user-friendly interface which will be intuitive to most users. Basic features facilitate the uncomplicated management and organization of large data sets and complex experimental setups as well as the inspection and graphical plotting of quantitative data. These are complemented by readily available high-level analysis options such as database querying, clustering based on abundance ratios, feature enrichment tests for e.g. GO terms and pathway analysis tools. A number of plotting options for visualization of quantitative proteomics data are available and most analysis functions in GProX create customizable high quality graphical displays in both vector and bitmap formats. The generic import requirements allow data originating from essentially all mass spectrometry platforms, quantitation strategies and software to be analyzed in the program. GProX represents a powerful approach to proteomics data analysis providing proteomics experimenters with a toolbox for bioinformatics analysis of quantitative proteomics data. The program is released as open-source and can be freely downloaded from the project webpage at http://gprox.sourceforge.net.

  17. GProX, a User-Friendly Platform for Bioinformatics Analysis and Visualization of Quantitative Proteomics Data

    PubMed Central

    Rigbolt, Kristoffer T. G.; Vanselow, Jens T.; Blagoev, Blagoy

    2011-01-01

    Recent technological advances have made it possible to identify and quantify thousands of proteins in a single proteomics experiment. As a result of these developments, the analysis of data has become the bottleneck of proteomics experiments. To provide the proteomics community with a user-friendly platform for comprehensive analysis, inspection and visualization of quantitative proteomics data we developed the Graphical Proteomics Data Explorer (GProX). The program requires no special bioinformatics training, as all functions of GProX are accessible within its graphical user-friendly interface which will be intuitive to most users. Basic features facilitate the uncomplicated management and organization of large data sets and complex experimental setups as well as the inspection and graphical plotting of quantitative data. These are complemented by readily available high-level analysis options such as database querying, clustering based on abundance ratios, feature enrichment tests for e.g. GO terms and pathway analysis tools. A number of plotting options for visualization of quantitative proteomics data are available and most analysis functions in GProX create customizable high quality graphical displays in both vector and bitmap formats. The generic import requirements allow data originating from essentially all mass spectrometry platforms, quantitation strategies and software to be analyzed in the program. GProX represents a powerful approach to proteomics data analysis providing proteomics experimenters with a toolbox for bioinformatics analysis of quantitative proteomics data. The program is released as open-source and can be freely downloaded from the project webpage at http://gprox.sourceforge.net. PMID:21602510

  18. Alternative Splicing May Not Be the Key to Proteome Complexity.

    PubMed

    Tress, Michael L; Abascal, Federico; Valencia, Alfonso

    2017-02-01

    Alternative splicing is commonly believed to be a major source of cellular protein diversity. However, although many thousands of alternatively spliced transcripts are routinely detected in RNA-seq studies, reliable large-scale mass spectrometry-based proteomics analyses identify only a small fraction of annotated alternative isoforms. The clearest finding from proteomics experiments is that most human genes have a single main protein isoform, while those alternative isoforms that are identified tend to be the most biologically plausible: those with the most cross-species conservation and those that do not compromise functional domains. Indeed, most alternative exons do not seem to be under selective pressure, suggesting that a large majority of predicted alternative transcripts may not even be translated into proteins. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.

  19. Advanced proteomic liquid chromatography

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xie, Fang; Smith, Richard D.; Shen, Yufeng

    2012-10-26

    Liquid chromatography coupled with mass spectrometry is the predominant platform used to analyze proteomics samples consisting of large numbers of proteins and their proteolytic products (e.g., truncated polypeptides) and spanning a wide range of relative concentrations. This review provides an overview of advanced capillary liquid chromatography techniques and methodologies that greatly improve separation resolving power and proteomics analysis coverage, sensitivity, and throughput.

  20. The HUPO proteomics standards initiative--overcoming the fragmentation of proteomics data.

    PubMed

    Hermjakob, Henning

    2006-09-01

    Proteomics is a key field of modern biomolecular research, with many small and large scale efforts producing a wealth of proteomics data. However, the vast majority of this data is never exploited to its full potential. Even in publicly funded projects, often the raw data generated in a specific context is analysed, conclusions are drawn and published, but little attention is paid to systematic documentation, archiving, and public access to the data supporting the scientific results. It is often difficult to validate the results stated in a particular publication, and even simple global questions like "In which cellular contexts has my protein of interest been observed?" can currently not be answered with realistic effort, due to a lack of standardised reporting and collection of proteomics data. The Proteomics Standards Initiative (PSI), a work group of the Human Proteome Organisation (HUPO), defines community standards for data representation in proteomics to facilitate systematic data capture, comparison, exchange and verification. In this article we provide an overview of PSI organisational structure, activities, and current results, as well as ways to get involved in the broad-based, open PSI process.

  1. Identification of cypermethrin induced protein changes in green algae by iTRAQ quantitative proteomics.

    PubMed

    Gao, Yan; Lim, Teck Kwang; Lin, Qingsong; Li, Sam Fong Yau

    2016-04-29

    Cypermethrin (CYP) is one of the most widely used pesticides, applied on a large scale for agricultural and domestic purposes, and its residue often seriously affects aquatic systems. Environmental pollutant-induced protein changes in organisms could be detected by proteomics, leading to discovery of potential biomarkers and understanding of mode of action. While proteomics investigations of CYP stress in some animal models have been well studied, few reports about the effects of exposure to CYP on the algal proteome have been published. To determine the effect of CYP on algae, the impact of various dosages (0.001 μg/L, 0.01 μg/L and 1 μg/L) of CYP on the green alga Chlorella vulgaris for 24 h and 96 h was investigated by using the iTRAQ quantitative proteomics technique. A total of 162 and 198 proteins were significantly altered after CYP exposure for 24 h and 96 h, respectively. Overview of iTRAQ results indicated that the influence of CYP on algae protein might be dosage-dependent. Functional analysis of differentially expressed proteins showed that CYP could induce protein alterations related to photosynthesis, stress responses and carbohydrate metabolism. This study provides a comprehensive view of the complex mode of action of algae under CYP stress and highlights several potential biomarkers for further investigation of pesticide-exposed plants and algae. Copyright © 2016 Elsevier B.V. All rights reserved.

  2. Advances in targeted proteomics and applications to biomedical research

    PubMed Central

    Shi, Tujin; Song, Ehwang; Nie, Song; Rodland, Karin D.; Liu, Tao; Qian, Wei-Jun; Smith, Richard D.

    2016-01-01

    Targeted proteomics has emerged as a powerful protein quantification tool in systems biology, biomedical research, and increasingly for clinical applications. The most widely used targeted proteomics approach, selected reaction monitoring (SRM), also known as multiple reaction monitoring (MRM), can be used for quantification of cellular signaling networks and preclinical verification of candidate protein biomarkers. As an extension to our previous review on advances in SRM sensitivity, herein we review recent advances in the method and technology for further enhancing SRM sensitivity (from 2012 to present), and highlight its broad biomedical applications in human bodily fluids, tissue and cell lines. Furthermore, we also review two recently introduced targeted proteomics approaches, parallel reaction monitoring (PRM) and data-independent acquisition (DIA) with targeted data extraction on fast scanning high-resolution accurate-mass (HR/AM) instruments. Such HR/AM targeted quantification, which monitors all target product ions, effectively addresses SRM limitations in specificity and multiplexing; however, compared to SRM, PRM and DIA are still in their infancy with a limited number of applications. Thus, for HR/AM targeted quantification we focus our discussion on method development, data processing and analysis, and its advantages and limitations in targeted proteomics. Finally, general perspectives on the potential of achieving both high sensitivity and high sample throughput for large-scale quantification of hundreds of target proteins are discussed. PMID:27302376

  3. MEERCAT: Multiplexed Efficient Cell Free Expression of Recombinant QconCATs For Large Scale Absolute Proteome Quantification

    PubMed Central

    Takemori, Nobuaki; Takemori, Ayako; Tanaka, Yuki; Endo, Yaeta; Hurst, Jane L.; Gómez-Baena, Guadalupe; Harman, Victoria M.; Beynon, Robert J.

    2017-01-01

    A major challenge in proteomics is the absolute accurate quantification of large numbers of proteins. QconCATs, artificial proteins that are concatenations of multiple standard peptides, are well established as an efficient means to generate standards for proteome quantification. Previously, QconCATs have been expressed in bacteria, but we now describe QconCAT expression in a robust, cell-free system. The new expression approach rescues QconCATs that previously were unable to be expressed in bacteria and can reduce the incidence of proteolytic damage to QconCATs. Moreover, it is possible to cosynthesize QconCATs in a highly-multiplexed translation reaction, coexpressing tens or hundreds of QconCATs simultaneously. By obviating bacterial culture and through the gain of high level multiplexing, it is now possible to generate tens of thousands of standard peptides in a matter of weeks, rendering absolute quantification of a complex proteome highly achievable in a reproducible, broadly deployable system. PMID:29055021

  4. Evaluation of a Genome-Scale In Silico Metabolic Model for Geobacter metallireducens Using Proteomic Data from a Field Biostimulation Experiment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fang, Yilin; Wilkins, Michael J.; Yabusaki, Steven B.

    2012-12-12

    Biomass and shotgun global proteomics data that reflected relative protein abundances from samples collected during the 2008 experiment at the U.S. Department of Energy Integrated Field-Scale Subsurface Research Challenge site in Rifle, Colorado, provided an unprecedented opportunity to validate a genome-scale metabolic model of Geobacter metallireducens and assess its performance with respect to prediction of metal reduction, biomass yield, and growth rate under dynamic field conditions. Reconstructed from annotated genomic sequence, biochemical, and physiological data, the constraint-based in silico model of G. metallireducens relates an annotated genome sequence to the physiological functions with 697 reactions controlled by 747 enzyme-coding genes. Proteomic analysis showed that 180 of the 637 G. metallireducens proteins detected during the 2008 experiment were associated with specific metabolic reactions in the in silico model. When the field-calibrated Fe(III) terminal electron acceptor process reaction in a reactive transport model for the field experiments was replaced with the genome-scale model, the model predicted that the largest metabolic fluxes through the in silico model reactions generally correspond to the highest abundances of proteins that catalyze those reactions. Central metabolism predicted by the model agrees well with protein abundance profiles inferred from proteomic analysis. Model discrepancies with the proteomic data, such as the relatively low fluxes through amino acid transport and metabolism, revealed pathways or flux constraints in the in silico model that could be updated to more accurately predict metabolic processes that occur in the subsurface environment.

  5. Workflow based framework for life science informatics.

    PubMed

    Tiwari, Abhishek; Sekhar, Arvind K T

    2007-10-01

    Workflow technology is a generic mechanism to integrate diverse types of available resources (databases, servers, software applications and different services) which facilitate knowledge exchange within traditionally divergent fields such as molecular biology, clinical research, computational science, physics, chemistry and statistics. Researchers can easily incorporate and access diverse, distributed tools and data to develop their own research protocols for scientific analysis. Application of workflow technology has been reported in areas like drug discovery, genomics, large-scale gene expression analysis, proteomics, and system biology. In this article, we have discussed the existing workflow systems and the trends in applications of workflow based systems.

  6. Advanced proteomic liquid chromatography

    PubMed Central

    Xie, Fang; Smith, Richard D.; Shen, Yufeng

    2012-01-01

    Liquid chromatography coupled with mass spectrometry is the predominant platform used to analyze proteomics samples consisting of large numbers of proteins and their proteolytic products (e.g., truncated polypeptides) and spanning a wide range of relative concentrations. This review provides an overview of advanced capillary liquid chromatography techniques and methodologies that greatly improve separation resolving power and proteomics analysis coverage, sensitivity, and throughput. PMID:22840822

  7. The Use of Proteomic Tools to Address Challenges Faced in Clonal Propagation of Tropical Crops through Somatic Embryogenesis.

    PubMed

    Chin, Chiew Foan; Tan, Hooi Sin

    2018-05-04

    In many tropical countries with agriculture as the mainstay of the economy, tropical crops are commonly cultivated at the plantation scale. The successful establishment of crop plantations depends on the availability of a large quantity of elite seedling plants. Many plantation companies establish plant tissue culture laboratories to supply planting materials for their plantations, and one of the most common applications of plant tissue culture is the mass propagation of true-to-type elite seedlings. However, problems encountered in tissue culture technology prevent its applications from being widely adopted. Proteomics can be a powerful tool for the analysis of cultures and for understanding the biological processes that take place at the cellular and molecular levels in order to address these problems. This mini review presents the tissue culture technologies commonly used in the propagation of tropical crops. It provides an outline of some of the genes and proteins isolated that are associated with somatic embryogenesis and the use of proteomic technology in analysing tissue culture samples and processes in tropical crops.

  8. Automation, parallelism, and robotics for proteomics.

    PubMed

    Alterovitz, Gil; Liu, Jonathan; Chow, Jijun; Ramoni, Marco F

    2006-07-01

    The speed of the human genome project (Lander, E. S., Linton, L. M., Birren, B., Nusbaum, C. et al., Nature 2001, 409, 860-921) was made possible, in part, by developments in automation of sequencing technologies. Before these technologies, sequencing was a laborious, expensive, and personnel-intensive task. Similarly, automation and robotics are changing the field of proteomics today. Proteomics is defined as the effort to understand and characterize proteins in the categories of structure, function and interaction (Englbrecht, C. C., Facius, A., Comb. Chem. High Throughput Screen. 2005, 8, 705-715). As such, this field nicely lends itself to automation technologies since these methods often require large economies of scale in order to achieve cost and time-saving benefits. This article describes some of the technologies and methods being applied in proteomics in order to facilitate automation within the field as well as in linking proteomics-based information with other related research areas.

  9. A Method for Label-Free, Differential Top-Down Proteomics.

    PubMed

    Ntai, Ioanna; Toby, Timothy K; LeDuc, Richard D; Kelleher, Neil L

    2016-01-01

    Biomarker discovery in translational research has relied heavily on labeled and label-free quantitative bottom-up proteomics. Here, we describe a new approach to biomarker studies that utilizes high-throughput top-down proteomics and is the first to offer whole protein characterization and relative quantitation within the same experiment. Using yeast as a model, we report procedures for a label-free approach to quantify the relative abundance of intact proteins ranging from 0 to 30 kDa in two different states. In this chapter, we describe the integrated methodology for the large-scale profiling and quantitation of the intact proteome by liquid chromatography-mass spectrometry (LC-MS) without the need for metabolic or chemical labeling. This recent advance for quantitative top-down proteomics is best implemented with a robust and highly controlled sample preparation workflow before data acquisition on a high-resolution mass spectrometer, and the application of a hierarchical linear statistical model to account for the multiple levels of variance contained in quantitative proteomic comparisons of samples for basic and clinical research.

  10. Recent advances in stable isotope labeling based techniques for proteome relative quantification.

    PubMed

    Zhou, Yuan; Shan, Yichu; Zhang, Lihua; Zhang, Yukui

    2014-10-24

    The large-scale relative quantification of all proteins expressed in biological samples under different states is of great importance for discovering proteins with important biological functions, as well as for screening disease-related biomarkers and drug targets. Therefore, the accurate quantification of proteins at the proteome level has become one of the key issues in protein science. Herein, the recent advances in stable isotope labeling based techniques for proteome relative quantification are reviewed from the aspects of metabolic labeling, chemical labeling and enzyme-catalyzed labeling. Furthermore, future research directions in this field are discussed. Copyright © 2014 Elsevier B.V. All rights reserved.

  11. High throughput profile-profile based fold recognition for the entire human proteome.

    PubMed

    McGuffin, Liam J; Smith, Richard T; Bryson, Kevin; Sørensen, Søren-Aksel; Jones, David T

    2006-06-07

    In order to maintain the most comprehensive structural annotation databases we must carry out regular updates for each proteome using the latest profile-profile fold recognition methods. The ability to carry out these updates on demand is necessary to keep pace with the regular updates of sequence and structure databases. Providing the highest quality structural models requires the most intensive profile-profile fold recognition methods running with the very latest available sequence databases and fold libraries. However, running these methods on such a regular basis for every sequenced proteome requires large amounts of processing power. In this paper we describe and benchmark the JYDE (Job Yield Distribution Environment) system, which is a meta-scheduler designed to work above cluster schedulers, such as Sun Grid Engine (SGE) or Condor. We demonstrate the ability of JYDE to distribute the load of genomic-scale fold recognition across multiple independent Grid domains. We use the most recent profile-profile version of our mGenTHREADER software in order to annotate the latest version of the Human proteome against the latest sequence and structure databases in as short a time as possible. We show that our JYDE system is able to scale to large numbers of intensive fold recognition jobs running across several independent computer clusters. Using our JYDE system we have been able to annotate 99.9% of the protein sequences within the Human proteome in less than 24 hours, by harnessing over 500 CPUs from 3 independent Grid domains. This study clearly demonstrates the feasibility of carrying out on demand high quality structural annotations for the proteomes of major eukaryotic organisms. Specifically, we have shown that it is now possible to provide complete regular updates of profile-profile based fold recognition models for entire eukaryotic proteomes, through the use of Grid middleware such as JYDE.
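
    The load distribution described in this record can be illustrated with a minimal sketch. It is not the JYDE implementation, whose scheduling policy the abstract does not specify; it assumes a simple greedy rule (send each job to the domain with the fewest queued jobs per CPU), and the `Domain` class, the domain names and the CPU counts are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    """One independent cluster (hypothetical stand-in for a Grid domain)."""
    name: str
    cpus: int
    jobs: list = field(default_factory=list)

    def load(self) -> float:
        # Queued jobs per CPU; a lower value means more spare capacity.
        return len(self.jobs) / self.cpus


def distribute(job_ids, domains):
    """Greedy rule (an assumption): give each job to the least-loaded domain."""
    for job in job_ids:
        target = min(domains, key=lambda d: d.load())
        target.jobs.append(job)
    return {d.name: len(d.jobs) for d in domains}


# Hypothetical setup: 500 CPUs split over 3 domains, 1000 fold-recognition jobs.
domains = [Domain("A", 300), Domain("B", 150), Domain("C", 50)]
counts = distribute(range(1000), domains)
print(counts)  # {'A': 600, 'B': 300, 'C': 100}
```

    Under this rule the jobs end up split in proportion to each domain's CPU count. A real meta-scheduler would also have to handle job failures, heterogeneous run times and the queue state of the underlying cluster schedulers.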

  12. Spermatogenesis in mammals: proteomic insights.

    PubMed

    Chocu, Sophie; Calvel, Pierre; Rolland, Antoine D; Pineau, Charles

    2012-08-01

    Spermatogenesis is a highly sophisticated process involved in the transmission of genetic heritage. It includes halving ploidy, repackaging of the chromatin for transport, and the equipment of developing spermatids and eventually spermatozoa with the advanced apparatus (e.g., tightly packed mitochondrial sheath in the mid piece, elongation of the tail, reduction of cytoplasmic volume) to elicit motility once they reach the epididymis. Mammalian spermatogenesis is divided into three phases. In the first the primitive germ cells or spermatogonia undergo a series of mitotic divisions. In the second the spermatocytes undergo two consecutive divisions in meiosis to produce haploid spermatids. In the third the spermatids differentiate into spermatozoa in a process called spermiogenesis. Paracrine, autocrine, juxtacrine, and endocrine pathways all contribute to the regulation of the process. The array of structural elements and chemical factors modulating somatic and germ cell activity is such that the network linking the various cellular activities during spermatogenesis is unimaginably complex. Over the past two decades, advances in genomics have greatly improved our knowledge of spermatogenesis, by identifying numerous genes essential for the development of functional male gametes. Large-scale analyses of testicular function have deepened our insight into normal and pathological spermatogenesis. Progress in genome sequencing and microarray technology has been exploited for genome-wide expression studies, leading to the identification of hundreds of genes differentially expressed within the testis. However, although proteomics has now come of age, the proteomics-based investigation of spermatogenesis remains in its infancy. Here, we review the state-of-the-art of large-scale proteomic analyses of spermatogenesis, from germ cell development during sex determination to spermatogenesis in the adult. Indeed, a few laboratories have undertaken differential protein profiling expression studies and/or systematic analyses of testicular proteomes in entire organs or isolated cells from various species. We consider the pros and cons of proteomics for studying the testicular germ cell gene expression program. Finally, we address the use of protein datasets, through integrative genomics (i.e., combining genomics, transcriptomics, and proteomics), bioinformatics, and modelling.

  13. A genome-wide structure-based survey of nucleotide binding proteins in M. tuberculosis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bhagavat, Raghu; Kim, Heung-Bok; Kim, Chang-Yub

    Nucleoside tri-phosphates (NTP) form an important class of small molecule ligands that participate in, and are essential to a large number of biological processes. Here, we seek to identify the NTP binding proteome (NTPome) in M. tuberculosis (M.tb), a deadly pathogen. Identifying the NTPome is useful not only for gaining functional insights of the individual proteins but also for identifying useful drug targets. From an earlier study, we had structural models of M.tb at a proteome scale from which a set of 13,858 small molecule binding pockets were identified. We use a set of NTP binding sub-structural motifs derived from a previous study and scan the M.tb pocketome, and find that 1,768 proteins or 43% of the proteome can theoretically bind NTP ligands. Using an experimental proteomics approach involving dye-ligand affinity chromatography, we confirm NTP binding to 47 different proteins, of which 4 are hypothetical proteins. Our analysis also provides the precise list of binding site residues in each case, and the probable ligand binding pose. In conclusion, as the list includes a number of known and potential drug targets, the identification of NTP binding can directly facilitate structure-based drug design of these targets.

  14. A genome-wide structure-based survey of nucleotide binding proteins in M. tuberculosis

    DOE PAGES

    Bhagavat, Raghu; Kim, Heung-Bok; Kim, Chang-Yub; ...

    2017-10-02

    Nucleoside tri-phosphates (NTP) form an important class of small molecule ligands that participate in, and are essential to a large number of biological processes. Here, we seek to identify the NTP binding proteome (NTPome) in M. tuberculosis (M.tb), a deadly pathogen. Identifying the NTPome is useful not only for gaining functional insights of the individual proteins but also for identifying useful drug targets. From an earlier study, we had structural models of M.tb at a proteome scale from which a set of 13,858 small molecule binding pockets were identified. We use a set of NTP binding sub-structural motifs derived from a previous study and scan the M.tb pocketome, and find that 1,768 proteins or 43% of the proteome can theoretically bind NTP ligands. Using an experimental proteomics approach involving dye-ligand affinity chromatography, we confirm NTP binding to 47 different proteins, of which 4 are hypothetical proteins. Our analysis also provides the precise list of binding site residues in each case, and the probable ligand binding pose. In conclusion, as the list includes a number of known and potential drug targets, the identification of NTP binding can directly facilitate structure-based drug design of these targets.

  15. The online Tabloid Proteome: an annotated database of protein associations

    PubMed Central

    Turan, Demet; Tavernier, Jan

    2018-01-01

    A complete knowledge of the proteome can only be attained by determining the associations between proteins, along with the nature of these associations (e.g. physical contact in protein–protein interactions, participation in complex formation or different roles in the same pathway). Despite extensive efforts in elucidating direct protein interactions, our knowledge on the complete spectrum of protein associations remains limited. We therefore developed a new approach that detects protein associations from identifications obtained after re-processing of large-scale, public mass spectrometry-based proteomics data. Our approach infers protein association based on the co-occurrence of proteins across many different proteomics experiments, and provides information that is almost completely complementary to traditional direct protein interaction studies. We here present a web interface to query and explore the associations derived from this method, called the online Tabloid Proteome. The online Tabloid Proteome also integrates biological knowledge from several existing resources to annotate our derived protein associations. The online Tabloid Proteome is freely available through a user-friendly web interface, which provides intuitive navigation and data exploration options for the user at http://iomics.ugent.be/tabloidproteome. PMID:29040688
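
    The idea of inferring associations from co-occurrence across experiments can be sketched as follows. The abstract does not state which association measure the authors use, so this example assumes a Jaccard index over the sets of experiments in which each protein was identified; the function name, the `min_shared` threshold and the toy data are all hypothetical.

```python
from itertools import combinations


def cooccurrence_scores(experiments, min_shared=2):
    """Score protein pairs by Jaccard overlap of the experiments they appear in.

    experiments: dict mapping experiment id -> set of identified protein ids.
    Returns {(protein_a, protein_b): jaccard} for pairs identified together
    in at least `min_shared` experiments.
    """
    # Invert the mapping: protein -> set of experiments where it was identified.
    seen_in = {}
    for exp_id, proteins in experiments.items():
        for protein in proteins:
            seen_in.setdefault(protein, set()).add(exp_id)

    scores = {}
    for a, b in combinations(sorted(seen_in), 2):
        shared = seen_in[a] & seen_in[b]
        if len(shared) >= min_shared:
            union = seen_in[a] | seen_in[b]
            scores[(a, b)] = len(shared) / len(union)
    return scores


# Toy data, invented for illustration only.
experiments = {
    "exp1": {"P1", "P2", "P3"},
    "exp2": {"P1", "P2"},
    "exp3": {"P2", "P3"},
    "exp4": {"P1", "P2", "P4"},
}
scores = cooccurrence_scores(experiments)
print(scores)  # {('P1', 'P2'): 0.75, ('P2', 'P3'): 0.5}
```

    A measure of this kind needs many experiments to be meaningful, because proteins that are simply abundant will co-occur with almost everything; any real analysis would have to correct for that.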

  16. Evaluation of a genome-scale in silico metabolic model for Geobacter metallireducens by using proteomic data from a field biostimulation experiment.

    PubMed

    Fang, Yilin; Wilkins, Michael J; Yabusaki, Steven B; Lipton, Mary S; Long, Philip E

    2012-12-01

    Accurately predicting the interactions between microbial metabolism and the physical subsurface environment is necessary to enhance subsurface energy development, soil and groundwater cleanup, and carbon management. This study was an initial attempt to confirm the metabolic functional roles within an in silico model using environmental proteomic data collected during field experiments. Shotgun global proteomics data collected during a subsurface biostimulation experiment were used to validate a genome-scale metabolic model of Geobacter metallireducens, specifically the ability of the metabolic model to predict metal reduction, biomass yield, and growth rate under dynamic field conditions. The constraint-based in silico model of G. metallireducens relates an annotated genome sequence to the physiological functions with 697 reactions controlled by 747 enzyme-coding genes. Proteomic analysis showed that 180 of the 637 G. metallireducens proteins detected during the 2008 experiment were associated with specific metabolic reactions in the in silico model. When the field-calibrated Fe(III) terminal electron acceptor process reaction in a reactive transport model for the field experiments was replaced with the genome-scale model, the model predicted that the largest metabolic fluxes through the in silico model reactions generally correspond to the highest abundances of proteins that catalyze those reactions. Central metabolism predicted by the model agrees well with protein abundance profiles inferred from proteomic analysis. Model discrepancies with the proteomic data, such as the relatively low abundances of proteins associated with amino acid transport and metabolism, revealed pathways or flux constraints in the in silico model that could be updated to more accurately predict metabolic processes that occur in the subsurface environment.

  17. Developmental and Subcellular Organization of Single-Cell C₄ Photosynthesis in Bienertia sinuspersici Determined by Large-Scale Proteomics and cDNA Assembly from 454 DNA Sequencing.

    PubMed

    Offermann, Sascha; Friso, Giulia; Doroshenk, Kelly A; Sun, Qi; Sharpe, Richard M; Okita, Thomas W; Wimmer, Diana; Edwards, Gerald E; van Wijk, Klaas J

    2015-05-01

    Kranz C4 species strictly depend on separation of primary and secondary carbon fixation reactions in different cell types. In contrast, the single-cell C4 (SCC4) species Bienertia sinuspersici utilizes intracellular compartmentation including two physiologically and biochemically different chloroplast types; however, information on identity, localization, and induction of proteins required for this SCC4 system is currently very limited. In this study, we determined the distribution of photosynthesis-related proteins and the induction of the C4 system during development by label-free proteomics of subcellular fractions and leaves of different developmental stages. This was enabled by inferring a protein sequence database from 454 sequencing of Bienertia cDNAs. Large-scale proteome rearrangements were observed as C4 photosynthesis developed during leaf maturation. The proteomes of the two chloroplasts are different with differential accumulation of linear and cyclic electron transport components, primary and secondary carbon fixation reactions, and a triose-phosphate shuttle that is shared between the two chloroplast types. This differential protein distribution pattern suggests the presence of an mRNA or protein-sorting mechanism for nuclear-encoded, chloroplast-targeted proteins in SCC4 species. The combined information was used to provide a comprehensive model for NAD-ME type carbon fixation in SCC4 species.

  18. Combinatorial depletion analysis to assemble the network architecture of the SAGA and ADA chromatin remodeling complexes.

    PubMed

    Lee, Kenneth K; Sardiu, Mihaela E; Swanson, Selene K; Gilmore, Joshua M; Torok, Michael; Grant, Patrick A; Florens, Laurence; Workman, Jerry L; Washburn, Michael P

    2011-07-05

    Despite the availability of several large-scale proteomics studies aiming to identify protein interactions on a global scale, little is known about how proteins interact and are organized within macromolecular complexes. Here, we describe a technique that consists of a combination of biochemistry approaches, quantitative proteomics and computational methods using wild-type and deletion strains to investigate the organization of proteins within macromolecular protein complexes. We applied this technique to determine the organization of two well-studied complexes, Spt-Ada-Gcn5 histone acetyltransferase (SAGA) and ADA, for which no comprehensive high-resolution structures exist. This approach revealed that SAGA/ADA is composed of five distinct functional modules, which can persist separately. Furthermore, we identified a novel subunit of the ADA complex, termed Ahc2, and characterized Sgf29 as an ADA family protein present in all Gcn5 histone acetyltransferase complexes. Finally, we propose a model for the architecture of the SAGA and ADA complexes, which predicts novel functional associations within the SAGA complex and provides mechanistic insights into phenotypical observations in SAGA mutants.

  19. Combinatorial depletion analysis to assemble the network architecture of the SAGA and ADA chromatin remodeling complexes

    PubMed Central

    Lee, Kenneth K; Sardiu, Mihaela E; Swanson, Selene K; Gilmore, Joshua M; Torok, Michael; Grant, Patrick A; Florens, Laurence; Workman, Jerry L; Washburn, Michael P

    2011-01-01

    Despite the availability of several large-scale proteomics studies aiming to identify protein interactions on a global scale, little is known about how proteins interact and are organized within macromolecular complexes. Here, we describe a technique that consists of a combination of biochemistry approaches, quantitative proteomics and computational methods using wild-type and deletion strains to investigate the organization of proteins within macromolecular protein complexes. We applied this technique to determine the organization of two well-studied complexes, Spt–Ada–Gcn5 histone acetyltransferase (SAGA) and ADA, for which no comprehensive high-resolution structures exist. This approach revealed that SAGA/ADA is composed of five distinct functional modules, which can persist separately. Furthermore, we identified a novel subunit of the ADA complex, termed Ahc2, and characterized Sgf29 as an ADA family protein present in all Gcn5 histone acetyltransferase complexes. Finally, we propose a model for the architecture of the SAGA and ADA complexes, which predicts novel functional associations within the SAGA complex and provides mechanistic insights into phenotypical observations in SAGA mutants. PMID:21734642

  20. Proteomic profile of the Bradysia odoriphaga in response to the microbial secondary metabolite benzothiazole.

    PubMed

    Zhao, Yunhe; Cui, Kaidi; Xu, Chunmei; Wang, Qiuhong; Wang, Yao; Zhang, Zhengqun; Liu, Feng; Mu, Wei

    2016-11-24

    Benzothiazole, a microbial secondary metabolite, has been demonstrated to possess fumigant activity against Sclerotinia sclerotiorum, Ditylenchus destructor and Bradysia odoriphaga. However, to facilitate the development of novel microbial pesticides, the mode of action of benzothiazole needs to be elucidated. Here, we employed iTRAQ-based quantitative proteomics analysis to investigate the effects of benzothiazole on the proteomic expression of B. odoriphaga. In response to benzothiazole, 92 of 863 identified proteins in B. odoriphaga exhibited altered levels of expression, among which 14 proteins were related to the action mechanism of benzothiazole, 11 proteins were involved in stress responses, and 67 proteins were associated with the adaptation of B. odoriphaga to benzothiazole. Further bioinformatics analysis indicated that the reduction in energy metabolism, inhibition of the detoxification process and interference with DNA and RNA synthesis were potentially associated with the mode of action of benzothiazole. The myosin heavy chain, succinyl-CoA synthetase and Ca2+-transporting ATPase proteins may be related to the stress response. Increased expression of proteins involved in carbohydrate metabolism, energy production and conversion pathways was responsible for the adaptive response of B. odoriphaga. The results of this study provide novel insight into the molecular mechanisms of benzothiazole at a large-scale translation level and will facilitate the elucidation of the mechanism of action of benzothiazole.

  1. Ursgal, Universal Python Module Combining Common Bottom-Up Proteomics Tools for Large-Scale Analysis.

    PubMed

    Kremer, Lukas P M; Leufken, Johannes; Oyunchimeg, Purevdulam; Schulze, Stefan; Fufezan, Christian

    2016-03-04

    Proteomics data integration has become a broad field with a variety of programs offering innovative algorithms to analyze increasing amounts of data. Unfortunately, this software diversity leads to many problems as soon as the data is analyzed using more than one algorithm for the same task. Although it was shown that the combination of multiple peptide identification algorithms yields more robust results, it is only recently that unified approaches are emerging; however, workflows that, for example, aim to optimize search parameters or that employ cascaded style searches can only be made accessible if data analysis becomes not only unified but also and most importantly scriptable. Here we introduce Ursgal, a Python interface to many commonly used bottom-up proteomics tools and to additional auxiliary programs. Complex workflows can thus be composed using the Python scripting language using a few lines of code. Ursgal is easily extensible, and we have made several database search engines (X!Tandem, OMSSA, MS-GF+, Myrimatch, MS Amanda), statistical postprocessing algorithms (qvality, Percolator), and one algorithm that combines statistically postprocessed outputs from multiple search engines ("combined FDR") accessible as an interface in Python. Furthermore, we have implemented a new algorithm ("combined PEP") that combines multiple search engines employing elements of "combined FDR", PeptideShaker, and Bayes' theorem.

  2. Transcriptomic and proteomic responses of Serratia marcescens to spaceflight conditions involve large-scale changes in metabolic pathways

    NASA Astrophysics Data System (ADS)

    Wang, Yajuan; Yuan, Yanting; Liu, Jinwen; Su, Longxiang; Chang, De; Guo, Yinghua; Chen, Zhenhong; Fang, Xiangqun; Wang, Junfeng; Li, Tianzhi; Zhou, Lisha; Fang, Chengxiang; Yang, Ruifu; Liu, Changting

    2014-04-01

    The microgravity environment of spaceflight expeditions has been associated with altered microbial responses. This study explores the characterization of Serratia marcescens grown in a spaceflight environment at the phenotypic, transcriptomic and proteomic levels. From November 1, 2011 to November 17, 2011, a strain of S. marcescens was sent into space for 398 h on the Shenzhou VIII spacecraft, and ground simulation was performed as a control (LCT-SM213). After the flight, two mutant strains (LCT-SM166 and LCT-SM262) were selected for further analysis. Although no changes in the morphology, post-culture growth kinetics, hemolysis or antibiotic sensitivity were observed, the two mutant strains exhibited significant changes in their metabolic profiles after exposure to spaceflight. Enrichment analysis of the transcriptome showed that the differentially expressed genes of the two spaceflight strains and the ground control strain mainly included those involved in metabolism and degradation. The proteome revealed that changes at the protein level were also associated with metabolic functions, such as glycolysis/gluconeogenesis, pyruvate metabolism, arginine and proline metabolism and the degradation of valine, leucine and isoleucine. In summary, S. marcescens showed alterations primarily in genes and proteins that were associated with metabolism under spaceflight conditions, which gave us valuable clues for future research.

  3. SWATH2stats: An R/Bioconductor Package to Process and Convert Quantitative SWATH-MS Proteomics Data for Downstream Analysis Tools.

    PubMed

    Blattmann, Peter; Heusel, Moritz; Aebersold, Ruedi

    2016-01-01

    SWATH-MS is an acquisition and analysis technique of targeted proteomics that enables measuring several thousand proteins with high reproducibility and accuracy across many samples. OpenSWATH is popular open-source software for peptide identification and quantification from SWATH-MS data. For downstream statistical and quantitative analysis there exist different tools such as MSstats, mapDIA and aLFQ. However, the transfer of data from OpenSWATH to the downstream statistical tools is currently technically challenging. Here we introduce the R/Bioconductor package SWATH2stats, which allows convenient processing of the data into a format directly readable by the downstream analysis tools. In addition, SWATH2stats allows annotation, analyzing the variation and the reproducibility of the measurements, FDR estimation, and advanced filtering before submitting the processed data to downstream tools. These functionalities are important to quickly analyze the quality of the SWATH-MS data. Hence, SWATH2stats is a new open-source tool that summarizes several practical functionalities for analyzing, processing, and converting SWATH-MS data and thus facilitates the efficient analysis of large-scale SWATH/DIA datasets.

  4. Frequently Asked Questions about Genetic and Genomic Science

    MedlinePlus

    ... of the new genetic and genomic techniques and technologies? Proteomics The suffix "-ome" comes from the Greek ... pharmacogenomics is one of the large-scale "omic" technologies, it can examine the entirety of the genome, ...

  5. Global proteomic profiling in multistep hepatocarcinogenesis and identification of PARP1 as a novel molecular marker in hepatocellular carcinoma

    PubMed Central

    Wang, Jianguo; Xie, Haiyang; Li, Jie; Cao, Jili; Zhou, Lin; Zheng, Shusen

    2016-01-01

    More accurate biomarkers have long been desired for hepatocellular carcinoma (HCC). Here, we characterized global large-scale proteomics of multistep hepatocarcinogenesis in an attempt to identify novel biomarkers for HCC. Quantitative data of 37874 sequences and 3017 proteins during hepatocarcinogenesis were obtained in cohort 1 of 75 samples (5 pooled groups: normal livers, hepatitis livers, cirrhotic livers, peritumoral livers, and HCC tissues) by iTRAQ 2D LC-MS/MS. The diagnostic performance of the top six most upregulated proteins in the HCC group, with HSP70 as a reference, was subsequently validated in cohort 2 of 114 samples (hepatocarcinogenesis from normal livers to HCC) using immunohistochemistry. Of seven candidate protein markers, PARP1, GS and NDRG1 showed the optimal diagnostic performance for HCC. PARP1, as a novel marker, showed comparable diagnostic performance to that of classic markers GS and NDRG1 in HCC (AUCs = 0.872, 0.856 and 0.792, respectively). A significantly higher AUC of 0.945 was achieved when the three markers were combined. For diagnosis of HCC, the sensitivity and specificity were 88.2% and 81.0% when at least two of the markers were positive. Similar diagnostic values of PARP1, GS and NDRG1 were confirmed by immunohistochemistry in cohort 3 of 180 HCC patients. Further analysis indicated that PARP1 and NDRG1 were associated with some clinicopathological features and were independent prognostic factors for HCC patients. Overall, global large-scale proteomic data across the spectrum of multistep hepatocarcinogenesis were obtained. PARP1 is a novel promising diagnostic/prognostic marker for HCC, and the three-marker panel (PARP1, GS and NDRG1) with excellent diagnostic performance for HCC was established. PMID:26883192
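
    The decision rule quoted in this abstract (call a sample HCC when at least two of PARP1, GS and NDRG1 are positive) can be sketched as follows. This is a minimal Python illustration under stated assumptions: the data layout, the function names and the toy cases are hypothetical and are not taken from the study, so the numbers it produces have no relation to the reported 88.2% and 81.0%.

```python
from typing import Dict, List, Tuple

MARKERS = ("PARP1", "GS", "NDRG1")


def panel_positive(stains: Dict[str, bool], min_positive: int = 2) -> bool:
    """Return True when at least `min_positive` of the three markers are positive."""
    return sum(bool(stains[m]) for m in MARKERS) >= min_positive


def sensitivity_specificity(
    cases: List[Tuple[Dict[str, bool], bool]]
) -> Tuple[float, float]:
    """Compute (sensitivity, specificity) of the panel rule.

    Each case is (marker staining results, true HCC status).
    """
    tp = fn = tn = fp = 0
    for stains, is_hcc in cases:
        called = panel_positive(stains)
        if is_hcc and called:
            tp += 1
        elif is_hcc:
            fn += 1
        elif called:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one HCC and one non-HCC case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy cases, for illustration only
toy_cases = [
    ({"PARP1": True, "GS": True, "NDRG1": False}, True),
    ({"PARP1": True, "GS": True, "NDRG1": True}, True),
    ({"PARP1": True, "GS": False, "NDRG1": False}, True),
    ({"PARP1": False, "GS": False, "NDRG1": False}, False),
    ({"PARP1": False, "GS": True, "NDRG1": True}, False),
]

sens, spec = sensitivity_specificity(toy_cases)
```

    Raising `min_positive` to 3 would trade sensitivity for specificity, while lowering it to 1 would do the reverse.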

  6. Structural and metabolic transitions of C4 leaf development and differentiation defined by microscopy and quantitative proteomics in maize.

    PubMed

    Majeran, Wojciech; Friso, Giulia; Ponnala, Lalit; Connolly, Brian; Huang, Mingshu; Reidel, Edwin; Zhang, Cankui; Asakura, Yukari; Bhuiyan, Nazmul H; Sun, Qi; Turgeon, Robert; van Wijk, Klaas J

    2010-11-01

    C4 grasses, such as maize (Zea mays), have high photosynthetic efficiency through combined biochemical and structural adaptations. C4 photosynthesis is established along the developmental axis of the leaf blade, leading from an undifferentiated leaf base just above the ligule into highly specialized mesophyll cells (MCs) and bundle sheath cells (BSCs) at the tip. To resolve the kinetics of maize leaf development and C4 differentiation and to obtain a systems-level understanding of maize leaf formation, the accumulation profiles of proteomes of the leaf and the isolated BSCs with their vascular bundle along the developmental gradient were determined using large-scale mass spectrometry. This was complemented by extensive qualitative and quantitative microscopy analysis of structural features (e.g., Kranz anatomy, plasmodesmata, cell wall, and organelles). More than 4300 proteins were identified and functionally annotated. Developmental protein accumulation profiles and hierarchical cluster analysis then determined the kinetics of organelle biogenesis, formation of cellular structures, metabolism, and coexpression patterns. Two main expression clusters were observed, each divided in subclusters, suggesting that a limited number of developmental regulatory networks organize concerted protein accumulation along the leaf gradient. The coexpression with BSC and MC markers provided strong candidates for further analysis of C4 specialization, in particular transporters and biogenesis factors. Based on the integrated information, we describe five developmental transitions that provide a conceptual and practical template for further analysis. An online protein expression viewer is provided through the Plant Proteome Database.

  7. EvoluCode: Evolutionary Barcodes as a Unifying Framework for Multilevel Evolutionary Data.

    PubMed

    Linard, Benjamin; Nguyen, Ngoc Hoan; Prosdocimi, Francisco; Poch, Olivier; Thompson, Julie D

    2012-01-01

    Evolutionary systems biology aims to uncover the general trends and principles governing the evolution of biological networks. An essential part of this process is the reconstruction and analysis of the evolutionary histories of these complex, dynamic networks. Unfortunately, the methodologies for representing and exploiting such complex evolutionary histories in large scale studies are currently limited. Here, we propose a new formalism, called EvoluCode (Evolutionary barCode), which allows the integration of different evolutionary parameters (eg, sequence conservation, orthology, synteny …) in a unifying format and facilitates the multilevel analysis and visualization of complex evolutionary histories at the genome scale. The advantages of the approach are demonstrated by constructing barcodes representing the evolution of the complete human proteome. Two large-scale studies are then described: (i) the mapping and visualization of the barcodes on the human chromosomes and (ii) automatic clustering of the barcodes to highlight protein subsets sharing similar evolutionary histories and their functional analysis. The methodologies developed here open the way to the efficient application of other data mining and knowledge extraction techniques in evolutionary systems biology studies. A database containing all EvoluCode data is available at: http://lbgi.igbmc.fr/barcodes.

  8. Analysis of the pumpkin phloem proteome provides insights into angiosperm sieve tube function.

    PubMed

    Lin, Ming-Kuem; Lee, Young-Jin; Lough, Tony J; Phinney, Brett S; Lucas, William J

    2009-02-01

    Increasing evidence suggests that proteins present in the angiosperm sieve tube system play an important role in the long distance signaling system of plants. To identify the nature of these putatively non-cell-autonomous proteins, we adopted a large scale proteomics approach to analyze pumpkin phloem exudates. Phloem proteins were fractionated by fast protein liquid chromatography using both anion and cation exchange columns and then either in-solution or in-gel digested following further separation by SDS-PAGE. A total of 345 LC-MS/MS data sets were analyzed using a combination of Mascot and X!Tandem against the NCBI non-redundant green plant database and an extensive Cucurbita maxima expressed sequence tag database. In this analysis, 1,209 different consensi were obtained of which 1,121 could be annotated from GenBank and BLAST search analyses against three plant species, Arabidopsis thaliana, rice (Oryza sativa), and poplar (Populus trichocarpa). Gene ontology (GO) enrichment analyses identified sets of phloem proteins that function in RNA binding, mRNA translation, ubiquitin-mediated proteolysis, and macromolecular and vesicle trafficking. Our findings indicate that protein synthesis and turnover, processes that were thought to be absent in enucleate sieve elements, likely occur within the angiosperm phloem translocation stream. In addition, our GO analysis identified a set of phloem proteins that are associated with the GO term "embryonic development ending in seed dormancy"; this finding raises the intriguing question as to whether the phloem may exert some level of control over seed development. The universal significance of the phloem proteome was highlighted by conservation of the phloem proteome in species as diverse as monocots (rice), eudicots (Arabidopsis and pumpkin), and trees (poplar). These results are discussed from the perspective of the role played by the phloem proteome as an integral component of the whole plant communication system.

  9. Peroxisome Biogenesis and Function

    PubMed Central

    Kaur, Navneet; Reumann, Sigrun; Hu, Jianping

    2009-01-01

    Peroxisomes are small and single membrane-delimited organelles that execute numerous metabolic reactions and have pivotal roles in plant growth and development. In recent years, forward and reverse genetic studies along with biochemical and cell biological analyses in Arabidopsis have enabled researchers to identify many peroxisome proteins and elucidate their functions. This review focuses on the advances in our understanding of peroxisome biogenesis and metabolism, and further explores the contribution of large-scale analysis, such as in silico predictions and proteomics, in augmenting our knowledge of peroxisome function in Arabidopsis. PMID:22303249

  10. Proteomic analysis of ligamentum flavum from patients with lumbar spinal stenosis.

    PubMed

    Kamita, Masahiro; Mori, Taiki; Sakai, Yoshihito; Ito, Sadayuki; Gomi, Masahiro; Miyamoto, Yuko; Harada, Atsushi; Niida, Shumpei; Yamada, Tesshi; Watanabe, Ken; Ono, Masaya

    2015-05-01

    Lumbar spinal stenosis (LSS) is a syndromic degenerative spinal disease and is characterized by spinal canal narrowing with subsequent neural compression causing gait disturbances. Although LSS is a major age-related musculoskeletal disease that causes large decreases in the daily living activities of the elderly, its molecular pathology has not been investigated using proteomics. Thus, we used several proteomic technologies to analyze the ligamentum flavum (LF) of individuals with LSS. Using comprehensive proteomics with strong cation exchange fractionation, we detected 1288 proteins in these LF samples. A GO analysis of the comprehensive proteome revealed that more than 30% of the identified proteins were extracellular. Next, we used 2D image converted analysis of LC/MS to compare LF obtained from individuals with LSS to that obtained from individuals with disc herniation (nondegenerative control). We detected 64 781 MS peaks and identified 1675 differentially expressed peptides derived from 286 proteins. We verified four differentially expressed proteins (fibronectin, serine protease HTRA1, tenascin, and asporin) by quantitative proteomics using SRM/MRM. The present proteomic study is the first to identify proteins from degenerated and hypertrophied LF in LSS, which will help in studying LSS. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  11. Infrared Multiphoton Dissociation for Quantitative Shotgun Proteomics

    PubMed Central

    Ledvina, Aaron R.; Lee, M. Violet; McAlister, Graeme C.; Westphall, Michael S.; Coon, Joshua J.

    2012-01-01

    We modified a dual-cell linear ion trap mass spectrometer to perform infrared multiphoton dissociation (IRMPD) in the low pressure trap of a dual-cell quadrupole linear ion trap (dual cell QLT) and perform large-scale IRMPD analyses of complex peptide mixtures. Upon optimization of activation parameters (precursor q-value, irradiation time, and photon flux), IRMPD subtly, but significantly outperforms resonant excitation CAD for peptides identified at a 1% false-discovery rate (FDR) from a yeast tryptic digest (95% confidence, p = 0.019). We further demonstrate that IRMPD is compatible with the analysis of isobaric-tagged peptides. Using fixed QLT RF amplitude allows for the consistent retention of reporter ions, but necessitates the use of variable IRMPD irradiation times, dependent upon precursor mass-to-charge (m/z). We show that IRMPD activation parameters can be tuned to allow for effective peptide identification and quantitation simultaneously. We thus conclude that IRMPD performed in a dual-cell ion trap is an effective option for the large-scale analysis of both unmodified and isobaric-tagged peptides. PMID:22480380

  12. Clinical proteomic analysis of scrub typhus infection.

    PubMed

    Park, Edmond Changkyun; Lee, Sang-Yeop; Yun, Sung Ho; Choi, Chi-Won; Lee, Hayoung; Song, Hyun Seok; Jun, Sangmi; Kim, Gun-Hwa; Lee, Chang-Seop; Kim, Seung Il

    2018-01-01

    Scrub typhus is an acute and febrile infectious disease caused by the Gram-negative α-proteobacterium Orientia tsutsugamushi from the family Rickettsiaceae that is widely distributed in Northern, Southern and Eastern Asia. In the present study, we analysed the serum proteome of scrub typhus patients to investigate specific clinical protein patterns in an attempt to explain pathophysiology and discover potential biomarkers of infection. Serum samples were collected from three patients (before and after treatment with antibiotics) and three healthy subjects. One-dimensional sodium dodecyl sulphate-polyacrylamide gel electrophoresis followed by liquid chromatography-tandem mass spectrometry was performed to identify differentially abundant proteins using quantitative proteomic approaches. Bioinformatic analysis was then performed using Ingenuity Pathway Analysis. Proteomic analysis identified 236 serum proteins, of which 32 were differentially expressed in normal subjects, naive scrub typhus patients and patients treated with antibiotics. Comparative bioinformatic analysis of the identified proteins revealed up-regulation of proteins involved in immune responses, especially complement system, following infection with O. tsutsugamushi, and normal expression was largely rescued by antibiotic treatment. This is the first proteomic study of clinical serum samples from scrub typhus patients. Proteomic analysis identified changes in protein expression upon infection with O. tsutsugamushi and following antibiotic treatment. Our results provide valuable information for further investigation of scrub typhus therapy and diagnosis.

  13. A reference guide for tree analysis and visualization

    PubMed Central

    2010-01-01

    The quantities of data obtained by the new high-throughput technologies, such as microarrays or ChIP-Chip arrays, and the large-scale OMICS-approaches, such as genomics, proteomics and transcriptomics, are becoming vast. Sequencing technologies become cheaper and easier to use and, thus, large-scale evolutionary studies towards the origins of life for all species and their evolution become more and more challenging. Databases holding information about how data are related and how they are hierarchically organized expand rapidly. Clustering analysis is becoming more and more difficult to apply to very large amounts of data since the results of these algorithms cannot be efficiently visualized. Most of the available visualization tools that are able to represent such hierarchies project data in 2D and often lack the necessary user friendliness and interactivity. For example, the current phylogenetic tree visualization tools are not able to display, in an easy to understand way, large scale trees with more than a few thousand nodes. In this study, we review tools that are currently available for the visualization of biological trees and analysis, mainly developed during the last decade. We describe the uniform and standard computer readable formats to represent tree hierarchies and we comment on the functionality and the limitations of these tools. We also discuss how these tools can be developed further and should become integrated with various data sources. Here we focus on freely available software that offers to the users various tree-representation methodologies for biological data analysis. PMID:20175922

  14. Transcriptional analysis of product-concentration driven changes in cellular programs of recombinant Clostridium acetobutylicum strains.

    PubMed

    Tummala, Seshu B; Junne, Stefan G; Paredes, Carlos J; Papoutsakis, Eleftherios T

    2003-12-30

    Antisense RNA (asRNA) downregulation alters protein expression without changing the regulation of gene expression. Downregulation of primary metabolic enzymes possibly combined with overexpression of other metabolic enzymes may result in profound changes in product formation, and this may alter the large-scale transcriptional program of the cells. DNA-array based large-scale transcriptional analysis has the potential to elucidate factors that control cellular fluxes even in the absence of proteome data. These themes are explored in the study of large-scale transcriptional analysis programs and the in vivo primary-metabolism fluxes of several related recombinant C. acetobutylicum strains: C. acetobutylicum ATCC 824(pSOS95del) (plasmid control; produces high levels of butanol and acetone), 824(pCTFB1AS) (expresses antisense RNA against CoA transferase (ctfb1-asRNA); produces very low levels of butanol and acetone), and 824(pAADB1) (expresses ctfb1-asRNA and the alcohol-aldehyde dehydrogenase gene (aad); produces high alcohol and low acetone levels). DNA-array based transcriptional analysis revealed that the large changes in product concentrations (and notably butanol concentration) due to ctfb1-asRNA expression alone and in combination with aad overexpression resulted in dramatic changes of the cellular transcriptome. Cluster analysis and gene expression patterns of established and putative operons involved in stress response, motility, sporulation, and fatty-acid biosynthesis indicate that these simple genetic changes dramatically alter the cellular programs of C. acetobutylicum. Comparison of gene expression and flux analysis data may point to possible flux-controlling steps and suggest unknown regulatory mechanisms. Copyright 2003; Wiley Periodicals, Inc.

  15. Automated selected reaction monitoring data analysis workflow for large-scale targeted proteomic studies.

    PubMed

    Surinova, Silvia; Hüttenhain, Ruth; Chang, Ching-Yun; Espona, Lucia; Vitek, Olga; Aebersold, Ruedi

    2013-08-01

    Targeted proteomics based on selected reaction monitoring (SRM) mass spectrometry is commonly used for accurate and reproducible quantification of protein analytes in complex biological mixtures. Strictly hypothesis-driven, SRM assays quantify each targeted protein by collecting measurements on its peptide fragment ions, called transitions. To achieve sensitive and accurate quantitative results, experimental design and data analysis must consistently account for the variability of the quantified transitions. This consistency is especially important in large experiments, which increasingly require profiling up to hundreds of proteins over hundreds of samples. Here we describe a robust and automated workflow for the analysis of large quantitative SRM data sets that integrates data processing, statistical protein identification and quantification, and dissemination of the results. The integrated workflow combines three software tools: mProphet for peptide identification via probabilistic scoring; SRMstats for protein significance analysis with linear mixed-effect models; and PASSEL, a public repository for storage, retrieval and query of SRM data. The input requirements for the protocol are files with SRM traces in mzXML format, and a file with a list of transitions in a text tab-separated format. The protocol is especially suited for data with heavy isotope-labeled peptide internal standards. We demonstrate the protocol on a clinical data set in which the abundances of 35 biomarker candidates were profiled in 83 blood plasma samples of subjects with ovarian cancer or benign ovarian tumors. The time frame to realize the protocol is 1-2 weeks, depending on the number of replicates used in the experiment.

  16. Large-scale proteome analysis of abscisic acid and ABSCISIC ACID INSENSITIVE3-dependent proteins related to desiccation tolerance in Physcomitrella patens

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yotsui, Izumi, E-mail: izumi.yotsui@riken.jp; Serada, Satoshi, E-mail: serada@nibiohn.go.jp; Naka, Tetsuji, E-mail: tnaka@nibiohn.go.jp

    2016-03-18

    Desiccation tolerance is an ancestral feature of land plants and is still retained in non-vascular plants such as bryophytes and some vascular plants. However, except for seeds and spores, this trait is absent in vegetative tissues of vascular plants. Although many studies have focused on understanding the molecular basis underlying desiccation tolerance using transcriptome and proteome approaches, the critical molecular differences between desiccation tolerant plants and non-desiccation plants are still not clear. The moss Physcomitrella patens cannot survive rapid desiccation under laboratory conditions, but if cells of the protonemata are treated by the phytohormone abscisic acid (ABA) prior to desiccation, it can survive 24 h exposure to desiccation and regrow after rehydration. The desiccation tolerance induced by ABA (AiDT) is specific to this hormone, but also depends on a plant transcription factor ABSCISIC ACID INSENSITIVE3 (ABI3). Here we report the comparative proteomic analysis of AiDT between wild type and ABI3 deleted mutant (Δabi3) of P. patens using iTRAQ (Isobaric Tags for Relative and Absolute Quantification). From a total of 1980 unique proteins that we identified, only 16 proteins are significantly altered in Δabi3 compared to wild type after desiccation following ABA treatment. Among this group, three of the four proteins that were severely affected in Δabi3 tissue were Arabidopsis orthologous genes, which were expressed in maturing seeds under the regulation of ABI3. These included a Group 1 late embryogenesis abundant (LEA) protein, a short-chain dehydrogenase, and a desiccation-related protein. Our results suggest that at least three of these proteins expressed in desiccation tolerant cells of both Arabidopsis and the moss are very likely to play important roles in acquisition of desiccation tolerance in land plants. Furthermore, our results suggest that the regulatory machinery of ABA- and ABI3-mediated gene expression for desiccation tolerance might have evolved in ancestral land plants before the separation of bryophytes and vascular plants. - Highlights: • Large-scale proteomics highlighted proteins related to plant desiccation tolerance. • The proteins were regulated by both the phytohormone ABA and ABI3. • The proteins accumulated in desiccation tolerant cells of both Arabidopsis and moss. • Evolutionary origin of regulatory machinery for desiccation tolerance is proposed.

  17. Bioinformatics strategies in life sciences: from data processing and data warehousing to biological knowledge extraction.

    PubMed

    Thiele, Herbert; Glandorf, Jörg; Hufnagel, Peter

    2010-05-27

    With the large variety of Proteomics workflows, as well as the large variety of instruments and data-analysis software available, researchers today face major challenges validating and comparing their Proteomics data. Here we present a new generation of the ProteinScape bioinformatics platform, now enabling researchers to manage Proteomics data from the generation and data warehousing to a central data repository with a strong focus on the improved accuracy, reproducibility and comparability demanded by many researchers in the field. It addresses scientists' current needs in proteomics identification, quantification and validation. But producing large protein lists is not the end point in Proteomics, where one ultimately aims to answer specific questions about the biological condition or disease model of the analyzed sample. In this context, a new tool has been developed at the Spanish Centro Nacional de Biotecnologia Proteomics Facility termed PIKE (Protein information and Knowledge Extractor) that allows researchers to control, filter and access specific information from genomics and proteomic databases, to understand the role and relationships of the proteins identified in the experiments. Additionally, an EU funded project, ProDac, has coordinated systematic data collection in public standards-compliant repositories like PRIDE. This will cover all aspects from generating MS data in the laboratory, assembling the whole annotation information and storing it together with identifications in a standardised format.

  18. 2D-Difference Gel Electrophoretic Proteomic Analysis of a Cell Culture Model of Alveolar Rhabdomyosarcoma

    PubMed Central

    Pressey, Joseph G.; Pressey, Christine S.; Robinson, Gloria; Herring, Richie; Wilson, Landon; Kelly, David R.; Kim, Helen

    2011-01-01

    To evaluate the consequences of expression of the protein encoded by PAX3-FOXO1 (P3F) in the pediatric malignancy alveolar rhabdomyosarcoma (A-RMS), we developed and evaluated a genetically defined in vitro model of A-RMS tumorigenesis. The expression of P3F in cooperation with simian virus 40 (SV40) Large-T (LT) antigen in murine C3H10T1/2 fibroblasts led to robust malignant transformation. Using 2 dimensional difference gel electrophoresis (2D-DIGE) we compared proteomes from lysates from cells that express P3F + LT versus from cells that express LT alone. Analysis of 2D gel spot patterns by DeCyder™ image analysis software indicated 93 spots that were different in abundance. Peptide mass fingerprint analysis of the 93 spots by matrix assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) analysis identified 37 non-redundant proteins. 2D DIGE analysis of cell culture media conditioned by cells transduced by P3F + LT versus by LT alone found 29 spots in the P3F + LT cells leading to the identification of 11 non-redundant proteins. A substantial number of proteins with potential roles in tumorigenesis and myogenesis were detected, most of which have not been identified in previous wide-scale expression studies of RMS experimental models or tumors. We validated the 2D gel image analysis findings by western blot analysis and immunohistochemistry (IHC). Thus, the 2D DIGE proteomics methodology described here provided an important discovery approach to the study of RMS biology and complements the findings of previous mRNA expression studies. PMID:21110518

  19. 2D-difference gel electrophoretic proteomic analysis of a cell culture model of alveolar rhabdomyosarcoma.

    PubMed

    Pressey, Joseph G; Pressey, Christine S; Robinson, Gloria; Herring, Richie; Wilson, Landon; Kelly, David R; Kim, Helen

    2011-02-04

    To evaluate the consequences of expression of the protein encoded by PAX3-FOXO1 (P3F) in the pediatric malignancy alveolar rhabdomyosarcoma (A-RMS), we developed and evaluated a genetically defined in vitro model of A-RMS tumorigenesis. The expression of P3F in cooperation with simian virus 40 (SV40) Large-T (LT) antigen in murine C3H10T1/2 fibroblasts led to robust malignant transformation. Using 2-dimensional-difference gel electrophoresis (2D-DIGE), we compared proteomes from lysates from cells that express P3F + LT versus from cells that express LT alone. Analysis of 2D gel spot patterns by DeCyder image analysis software indicated 93 spots that were different in abundance. Peptide mass fingerprint analysis of the 93 spots by matrix assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) analysis identified 37 nonredundant proteins. 2D-DIGE analysis of cell culture media conditioned by cells transduced by P3F + LT versus by LT alone found 29 spots in the P3F + LT cells leading to the identification of 11 nonredundant proteins. A substantial number of proteins with potential roles in tumorigenesis and myogenesis were detected, most of which have not been identified in previous wide-scale expression studies of RMS experimental models or tumors. We validated the 2D gel image analysis findings by Western blot analysis and immunohistochemistry (IHC). Thus, the 2D-DIGE proteomics methodology described here provided an important discovery approach to the study of RMS biology and complements the findings of previous mRNA expression studies.

  20. Simple preparation of plant epidermal tissue for laser microdissection and downstream quantitative proteome and carbohydrate analysis

    PubMed Central

    Falter, Christian; Ellinger, Dorothea; von Hülsen, Behrend; Heim, René; Voigt, Christian A.

    2015-01-01

    The outwardly directed cell wall and associated plasma membrane of epidermal cells represent the first layers of plant defense against intruding pathogens. Cell wall modifications and the formation of defense structures at sites of attempted pathogen penetration are decisive for plant defense. A precise isolation of these stress-induced structures would allow a specific analysis of regulatory mechanism and cell wall adaption. However, methods for large-scale epidermal tissue preparation from the model plant Arabidopsis thaliana, which would allow proteome and cell wall analysis of complete, laser-microdissected epidermal defense structures, have not been provided. We developed the adhesive tape – liquid cover glass technique (ACT) for simple leaf epidermis preparation from A. thaliana, which is also applicable on grass leaves. This method is compatible with subsequent staining techniques to visualize stress-related cell wall structures, which were precisely isolated from the epidermal tissue layer by laser microdissection (LM) coupled to laser pressure catapulting. We successfully demonstrated that these specific epidermal tissue samples could be used for quantitative downstream proteome and cell wall analysis. The development of the ACT for simple leaf epidermis preparation and the compatibility to LM and downstream quantitative analysis opens new possibilities in the precise examination of stress- and pathogen-related cell wall structures in epidermal cells. Because the developed tissue processing is also applicable on A. thaliana, well-established, model pathosystems that include the interaction with powdery mildews can be studied to determine principal regulatory mechanisms in plant–microbe interaction with their potential outreach into crop breeding. PMID:25870605

  1. Simple preparation of plant epidermal tissue for laser microdissection and downstream quantitative proteome and carbohydrate analysis.

    PubMed

    Falter, Christian; Ellinger, Dorothea; von Hülsen, Behrend; Heim, René; Voigt, Christian A

    2015-01-01

    The outwardly directed cell wall and associated plasma membrane of epidermal cells represent the first layers of plant defense against intruding pathogens. Cell wall modifications and the formation of defense structures at sites of attempted pathogen penetration are decisive for plant defense. A precise isolation of these stress-induced structures would allow a specific analysis of regulatory mechanism and cell wall adaption. However, methods for large-scale epidermal tissue preparation from the model plant Arabidopsis thaliana, which would allow proteome and cell wall analysis of complete, laser-microdissected epidermal defense structures, have not been provided. We developed the adhesive tape - liquid cover glass technique (ACT) for simple leaf epidermis preparation from A. thaliana, which is also applicable on grass leaves. This method is compatible with subsequent staining techniques to visualize stress-related cell wall structures, which were precisely isolated from the epidermal tissue layer by laser microdissection (LM) coupled to laser pressure catapulting. We successfully demonstrated that these specific epidermal tissue samples could be used for quantitative downstream proteome and cell wall analysis. The development of the ACT for simple leaf epidermis preparation and the compatibility to LM and downstream quantitative analysis opens new possibilities in the precise examination of stress- and pathogen-related cell wall structures in epidermal cells. Because the developed tissue processing is also applicable on A. thaliana, well-established, model pathosystems that include the interaction with powdery mildews can be studied to determine principal regulatory mechanisms in plant-microbe interaction with their potential outreach into crop breeding.

  2. Comparing Simplification Strategies for the Skeletal Muscle Proteome

    PubMed Central

    Geary, Bethany; Young, Iain S.; Cash, Phillip; Whitfield, Phillip D.; Doherty, Mary K.

    2016-01-01

    Skeletal muscle is a complex tissue that is dominated by the presence of a few abundant proteins. This wide dynamic range can mask the presence of lower abundance proteins, which can be a confounding factor in large-scale proteomic experiments. In this study, we have investigated a number of pre-fractionation methods, at both the protein and peptide level, for the characterization of the skeletal muscle proteome. The analyses revealed that the use of OFFGEL isoelectric focusing yielded the largest number of protein identifications (>750) compared to alternative gel-based and protein equalization strategies. Further, OFFGEL led to a substantial enrichment of a different sub-population of the proteome. Filter-aided sample preparation (FASP), coupled to peptide-level OFFGEL provided more confidence in the results due to a substantial increase in the number of peptides assigned to each protein. The findings presented here support the use of a multiplexed approach to proteome characterization of skeletal muscle, which has a recognized imbalance in the dynamic range of its protein complement. PMID:28248220

  3. Characterization of Macrophage Endogenous S-Nitrosoproteome Using a Cysteine-Specific Phosphonate Adaptable Tag in Combination with TiO2 Chromatography.

    PubMed

    Ibáñez-Vea, María; Huang, Honggang; Martínez de Morentin, Xabier; Pérez, Estela; Gato, Maria; Zuazo, Miren; Arasanz, Hugo; Fernández-Irigoyen, Joaquin; Santamaría, Enrique; Fernandez-Hinojal, Gonzalo; Larsen, Martin R; Escors, David; Kochan, Grazyna

    2018-03-02

    Protein S-nitrosylation is a cysteine post-translational modification mediated by nitric oxide. An increasing number of studies highlight S-nitrosylation as an important regulator of signaling involved in numerous cellular processes. Despite the significant progress in the development of redox proteomic methods, identification and quantification of endogenous S-nitrosylation using high-throughput mass-spectrometry-based methods is a technical challenge because this modification is highly labile. To overcome this drawback, most methods induce S-nitrosylation chemically in proteins using nitrosylating compounds before analysis, with the risk of introducing nonphysiological S-nitrosylation. Here we present a novel method to efficiently identify endogenous S-nitrosopeptides in the macrophage total proteome. Our approach is based on the labeling of S-nitrosopeptides reduced by ascorbate with a cysteine specific phosphonate adaptable tag (CysPAT), followed by titanium dioxide (TiO2) chromatography enrichment prior to nLC-MS/MS analysis. To test our procedure, we performed a large-scale analysis of this low-abundant modification in a murine macrophage cell line. We identified 569 endogenous S-nitrosylated proteins compared with 795 following exogenous chemically induced S-nitrosylation. Importantly, we discovered 579 novel S-nitrosylation sites. The large number of identified endogenous S-nitrosylated peptides allowed the definition of two S-nitrosylation consensus sites, highlighting protein translation and redox processes as key S-nitrosylation targets in macrophages.

  4. A large scale Plasmodium vivax-Saimiri boliviensis trophozoite-schizont transition proteome

    PubMed Central

    Lapp, Stacey A.; Barnwell, John W.; Galinski, Mary R.

    2017-01-01

    Plasmodium vivax is a complex protozoan parasite with over 6,500 genes and stage-specific differential expression. Much of the unique biology of this pathogen remains unknown, including how it modifies and restructures the host reticulocyte. Using a recently published P. vivax reference genome, we report the proteome from two biological replicates of infected Saimiri boliviensis host reticulocytes undergoing transition from the late trophozoite to early schizont stages. Using five database search engines, we identified a total of 2000 P. vivax and 3487 S. boliviensis proteins, making this the most comprehensive P. vivax proteome to date. PlasmoDB GO-term enrichment analysis of proteins identified at least twice by a search engine highlighted core metabolic processes and molecular functions such as glycolysis, translation and protein folding, cell components such as ribosomes, proteasomes and the Golgi apparatus, and a number of vesicle and trafficking related clusters. Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.8 enriched functional annotation clusters of S. boliviensis proteins highlighted vesicle and trafficking-related clusters, elements of the cytoskeleton, oxidative processes and response to oxidative stress, macromolecular complexes such as the proteasome and ribosome, metabolism, translation, and cell death. Host and parasite proteins potentially involved in cell adhesion were also identified. Over 25% of the P. vivax proteins have no functional annotation; this group includes 45 VIR members of the large PIR family. A number of host and pathogen proteins contained highly oxidized or nitrated residues, extending prior trophozoite-enriched stage observations from S. boliviensis infections, and supporting the possibility of oxidative stress in relation to the disease. This proteome significantly expands the size and complexity of the known P. vivax and Saimiri host iRBC proteomes, and provides in-depth data that will be valuable for ongoing research on this parasite’s biology and pathogenesis. PMID:28829774

  5. Protein complexes, big data, machine learning and integrative proteomics: lessons learned over a decade of systematic analysis of protein interaction networks.

    PubMed

    Havugimana, Pierre C; Hu, Pingzhao; Emili, Andrew

    2017-10-01

    Elucidation of the networks of physical (functional) interactions present in cells and tissues is fundamental for understanding the molecular organization of biological systems, the mechanistic basis of essential and disease-related processes, and for functional annotation of previously uncharacterized proteins (via guilt-by-association or -correlation). After a decade in the field, we felt it timely to document our own experiences in the systematic analysis of protein interaction networks. Areas covered: Researchers worldwide have contributed innovative experimental and computational approaches that have driven the rapidly evolving field of 'functional proteomics'. These include mass spectrometry-based methods to characterize macromolecular complexes on a global-scale and sophisticated data analysis tools - most notably machine learning - that allow for the generation of high-quality protein association maps. Expert commentary: Here, we recount some key lessons learned, with an emphasis on successful workflows, and challenges, arising from our own and other groups' ongoing efforts to generate, interpret and report proteome-scale interaction networks in increasingly diverse biological contexts.

  6. Proteomic Profiling of Mitochondrial Enzymes during Skeletal Muscle Aging.

    PubMed

    Staunton, Lisa; O'Connell, Kathleen; Ohlendieck, Kay

    2011-03-07

    Mitochondria are of central importance for energy generation in skeletal muscles. Expression changes or functional alterations in mitochondrial enzymes play a key role during myogenesis, fibre maturation, and various neuromuscular pathologies, as well as natural fibre aging. Mass spectrometry-based proteomics suggests itself as a convenient large-scale and high-throughput approach to catalogue the mitochondrial protein complement and determine global changes during health and disease. This paper gives a brief overview of the relatively new field of mitochondrial proteomics and discusses the findings from recent proteomic surveys of mitochondrial elements in aged skeletal muscles. Changes in the abundance, biochemical activity, subcellular localization, and/or posttranslational modifications in key mitochondrial enzymes might be useful as novel biomarkers of aging. In the long term, this may advance diagnostic procedures, improve the monitoring of disease progression, help in the testing of side effects due to new drug regimes, and enhance our molecular understanding of age-related muscle degeneration.

  7. Mass spectrometry-based biomarker discovery: toward a global proteome index of individuality.

    PubMed

    Hawkridge, Adam M; Muddiman, David C

    2009-01-01

    Biomarker discovery and proteomics have become synonymous with mass spectrometry in recent years. Although this conflation is an injustice to the many essential biomolecular techniques widely used in biomarker-discovery platforms, it underscores the power and potential of contemporary mass spectrometry. Numerous novel and powerful technologies have been developed around mass spectrometry, proteomics, and biomarker discovery over the past 20 years to globally study complex proteomes (e.g., plasma). However, very few large-scale longitudinal studies have been carried out using these platforms to establish the analytical variability relative to true biological variability. The purpose of this review is not to cover exhaustively the applications of mass spectrometry to biomarker discovery, but rather to discuss the analytical methods and strategies that have been developed for mass spectrometry-based biomarker-discovery platforms and to place them in the context of the many challenges and opportunities yet to be addressed.

  8. Proteomic Analysis of Pigeonpea (Cajanus cajan) Seeds Reveals the Accumulation of Numerous Stress-Related Proteins.

    PubMed

    Krishnan, Hari B; Natarajan, Savithiry S; Oehrle, Nathan W; Garrett, Wesley M; Darwish, Omar

    2017-06-14

    Pigeonpea is one of the major sources of dietary protein for more than a billion people living in South Asia. This hardy legume is often grown in low-input and risk-prone marginal environments. Considerable research effort has been devoted by a global research consortium to develop genomic resources for the improvement of this legume crop. These efforts have resulted in the elucidation of the complete genome sequence of pigeonpea. Despite these developments, little is known about the seed proteome of this important crop. Here, we report the proteome of pigeonpea seed. To enable the isolation of maximum number of seed proteins, including those that are present in very low amounts, three different protein fractions were obtained by employing different extraction media. High-resolution two-dimensional (2-D) electrophoresis followed by MALDI-TOF-TOF-MS/MS analysis of these protein fractions resulted in the identification of 373 pigeonpea seed proteins. Consistent with the reported high degree of synteny between the pigeonpea and soybean genomes, a large number of pigeonpea seed proteins exhibited significant amino acid homology with soybean seed proteins. Our proteomic analysis identified a large number of stress-related proteins, presumably due to its adaptation to drought-prone environments. The availability of a pigeonpea seed proteome reference map should shed light on the roles of these identified proteins in various biological processes and facilitate the improvement of seed composition.

  9. Investigating the Role of Large-Scale Domain Dynamics in Protein-Protein Interactions.

    PubMed

    Delaforge, Elise; Milles, Sigrid; Huang, Jie-Rong; Bouvier, Denis; Jensen, Malene Ringkjøbing; Sattler, Michael; Hart, Darren J; Blackledge, Martin

    2016-01-01

    Intrinsically disordered linkers provide multi-domain proteins with degrees of conformational freedom that are often essential for function. These highly dynamic assemblies represent a significant fraction of all proteomes, and deciphering the physical basis of their interactions represents a considerable challenge. Here we describe the difficulties associated with mapping the large-scale domain dynamics and describe two recent examples where solution state methods, in particular NMR spectroscopy, are used to investigate conformational exchange on very different timescales.

  10. Investigating the Role of Large-Scale Domain Dynamics in Protein-Protein Interactions

    PubMed Central

    Delaforge, Elise; Milles, Sigrid; Huang, Jie-rong; Bouvier, Denis; Jensen, Malene Ringkjøbing; Sattler, Michael; Hart, Darren J.; Blackledge, Martin

    2016-01-01

    Intrinsically disordered linkers provide multi-domain proteins with degrees of conformational freedom that are often essential for function. These highly dynamic assemblies represent a significant fraction of all proteomes, and deciphering the physical basis of their interactions represents a considerable challenge. Here we describe the difficulties associated with mapping the large-scale domain dynamics and describe two recent examples where solution state methods, in particular NMR spectroscopy, are used to investigate conformational exchange on very different timescales. PMID:27679800

  11. Development of a Highly Automated and Multiplexed Targeted Proteome Pipeline and Assay for 112 Rat Brain Synaptic Proteins

    PubMed Central

    Colangelo, Christopher M.; Ivosev, Gordana; Chung, Lisa; Abbott, Thomas; Shifman, Mark; Sakaue, Fumika; Cox, David; Kitchen, Rob R.; Burton, Lyle; Tate, Stephen A; Gulcicek, Erol; Bonner, Ron; Rinehart, Jesse; Nairn, Angus C.; Williams, Kenneth R.

    2015-01-01

    We present a comprehensive workflow for large scale (>1000 transitions/run) label-free LC-MRM proteome assays. Innovations include automated MRM transition selection, intelligent retention time scheduling (xMRM) that improves Signal/Noise by >2-fold, and automatic peak modeling. Improvements to data analysis include a novel Q/C metric, Normalized Group Area Ratio (NGAR), MLR normalization, weighted regression analysis, and data dissemination through the Yale Protein Expression Database. As a proof of principle we developed a robust 90 minute LC-MRM assay for Mouse/Rat Post-Synaptic Density (PSD) fractions which resulted in the routine quantification of 337 peptides from 112 proteins based on 15 observations per protein. Parallel analyses with stable isotope dilution peptide standards (SIS), demonstrate very high correlation in retention time (1.0) and protein fold change (0.94) between the label-free and SIS analyses. Overall, our first method achieved a technical CV of 11.4% with >97.5% of the 1697 transitions being quantified without user intervention, resulting in a highly efficient, robust, and single injection LC-MRM assay. PMID:25476245
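
    The technical CV quoted in this record is a coefficient of variation. As a minimal sketch, assuming the usual definition (sample standard deviation divided by the mean, as a percentage) and assuming replicate peak areas for one transition are available as plain numbers, it could be computed as below. The function name and the example values are illustrative and are not taken from the paper.

```python
from statistics import mean, stdev


def percent_cv(values):
    """Coefficient of variation: sample standard deviation / mean, in percent."""
    if len(values) < 2:
        raise ValueError("need at least two replicate measurements")
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; CV is undefined")
    return 100.0 * stdev(values) / m


# Hypothetical replicate peak areas for a single MRM transition.
replicate_areas = [90.0, 100.0, 110.0]
cv = percent_cv(replicate_areas)
print(f"technical CV = {cv:.1f}%")
```

    A per-assay figure would then be some summary (for example the mean or median) of the per-transition values; which summary the authors used is not stated in the abstract.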

  12. Laser assisted microdissection, an efficient technique to understand tissue specific gene expression patterns and functional genomics in plants.

    PubMed

    Gautam, Vibhav; Sarkar, Ananda K

    2015-04-01

    Laser assisted microdissection (LAM) is an advanced technology used to perform tissue or cell-specific expression profiling of genes and proteins, owing to its ability to isolate the desired tissue or cell type from a heterogeneous population. Due to the specificity and high efficiency acquired during its pioneering use in medical science, the LAM technique has quickly been adopted for use in many areas of biological research. Today, it has become a potent tool to address a wide range of questions in diverse fields of plant biology. Beginning with comparative transcriptome analysis of different tissues such as reproductive parts, meristems, lateral organs, roots etc., LAM has also been extensively used in plant-pathogen interaction studies, proteomics, and metabolomics. In combination with next generation sequencing and proteomics analysis, LAM has opened up promising opportunities in the area of large scale functional studies in plants. Ever since the advent of this technique, significant improvements have been achieved in terms of its instrumentation and method, which have made LAM a more efficient tool applicable in wider research areas. Here, we discuss the advancement of the LAM technique with special emphasis on its methodology and highlight its scope in modern research areas of plant biology. Although we put emphasis on the use of LAM in transcriptome studies, its most common application, we also discuss its recent application and scope in proteome and metabolome studies.

  13. Global Proteomics Analysis of Protein Lysine Methylation.

    PubMed

    Cao, Xing-Jun; Garcia, Benjamin A

    2016-11-01

    Lysine methylation is a common protein post-translational modification dynamically mediated by protein lysine methyltransferases (PKMTs) and protein lysine demethylases (PKDMs). Beyond histone proteins, lysine methylation on non-histone proteins plays a substantial role in a variety of functions in cells and is closely associated with diseases such as cancer. A large body of evidence indicates that the dysregulation of some PKMTs leads to tumorigenesis via their non-histone substrates. Studies on other PKMTs, however, have progressed slowly owing to the lack of approaches for extensive screening of lysine methylation sites. Recently, a series of publications has reported large-scale analysis of protein lysine methylation. In this unit, we introduce a protocol for the global analysis of protein lysine methylation in cells by means of immunoaffinity enrichment and mass spectrometry. © 2016 by John Wiley & Sons, Inc.

  14. Resources for Functional Genomics Studies in Drosophila melanogaster

    PubMed Central

    Mohr, Stephanie E.; Hu, Yanhui; Kim, Kevin; Housden, Benjamin E.; Perrimon, Norbert

    2014-01-01

    Drosophila melanogaster has become a system of choice for functional genomic studies. Many resources, including online databases and software tools, are now available to support design or identification of relevant fly stocks and reagents or analysis and mining of existing functional genomic, transcriptomic, proteomic, etc. datasets. These include large community collections of fly stocks and plasmid clones, “meta” information sites like FlyBase and FlyMine, and an increasing number of more specialized reagents, databases, and online tools. Here, we introduce key resources useful to plan large-scale functional genomics studies in Drosophila and to analyze, integrate, and mine the results of those studies in ways that facilitate identification of highest-confidence results and generation of new hypotheses. We also discuss ways in which existing resources can be used and might be improved and suggest a few areas of future development that would further support large- and small-scale studies in Drosophila and facilitate use of Drosophila information by the research community more generally. PMID:24653003

  15. Benchmarking quantitative label-free LC-MS data processing workflows using a complex spiked proteomic standard dataset.

    PubMed

    Ramus, Claire; Hovasse, Agnès; Marcellin, Marlène; Hesse, Anne-Marie; Mouton-Barbosa, Emmanuelle; Bouyssié, David; Vaca, Sebastian; Carapito, Christine; Chaoui, Karima; Bruley, Christophe; Garin, Jérôme; Cianférani, Sarah; Ferro, Myriam; Van Dorssaeler, Alain; Burlet-Schiltz, Odile; Schaeffer, Christine; Couté, Yohann; Gonzalez de Peredo, Anne

    2016-01-30

    Proteomic workflows based on nanoLC-MS/MS data-dependent-acquisition analysis have progressed tremendously in recent years. High-resolution and fast sequencing instruments have enabled the use of label-free quantitative methods, based either on spectral counting or on MS signal analysis, which appear as an attractive way to analyze differential protein expression in complex biological samples. However, the computational processing of the data for label-free quantification still remains a challenge. Here, we used a proteomic standard composed of an equimolar mixture of 48 human proteins (Sigma UPS1) spiked at different concentrations into a background of yeast cell lysate to benchmark several label-free quantitative workflows, involving different software packages developed in recent years. This experimental design allowed us to finely assess their performance in terms of sensitivity and false discovery rate, by measuring the number of true and false positives (UPS1 or yeast background proteins, respectively, found as differential). The spiked standard dataset has been deposited to the ProteomeXchange repository with the identifier PXD001819 and can be used to benchmark other label-free workflows, adjust software parameter settings, improve algorithms for extraction of the quantitative metrics from raw MS data, or evaluate downstream statistical methods. Bioinformatic pipelines for label-free quantitative analysis must be objectively evaluated in their ability to detect variant proteins with good sensitivity and low false discovery rate in large-scale proteomic studies. This can be done through the use of complex spiked samples, for which the "ground truth" of variant proteins is known, allowing a statistical evaluation of the performance of the data processing workflow. We provide here such a controlled standard dataset and used it to evaluate the performance of several label-free bioinformatics tools (including MaxQuant, Skyline, MFPaQ, IRMa-hEIDI and Scaffold) in different workflows, for detection of variant proteins with different absolute expression levels and fold change values. The dataset presented here can be useful for tuning software tool parameters, and also testing new algorithms for label-free quantitative analysis, or for evaluation of downstream statistical methods. Copyright © 2015 Elsevier B.V. All rights reserved.
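
    The evaluation described in this record rests on a known ground truth: spiked UPS1 proteins called differential count as true positives, and yeast background proteins called differential count as false positives. A minimal sketch of that bookkeeping follows, assuming each workflow's output has already been reduced to a list of protein identifiers called differential. The function name and the identifiers are hypothetical, and the paper's own statistical treatment may differ.

```python
def benchmark_calls(called_differential, spiked_proteins):
    """Score a workflow's differential calls against a spiked-in ground truth.

    called_differential: identifiers the workflow reported as differential.
    spiked_proteins: identifiers of the spiked standard (the true variants).
    """
    called = set(called_differential)
    spiked = set(spiked_proteins)
    true_pos = called & spiked    # spiked proteins correctly called
    false_pos = called - spiked   # background proteins wrongly called
    sensitivity = len(true_pos) / len(spiked) if spiked else 0.0
    # Observed false discovery proportion among all calls.
    fdp = len(false_pos) / len(called) if called else 0.0
    return {
        "true_positives": len(true_pos),
        "false_positives": len(false_pos),
        "sensitivity": sensitivity,
        "false_discovery_proportion": fdp,
    }


# Hypothetical identifiers: four spiked proteins, one background false call.
spiked = ["UPS_A", "UPS_B", "UPS_C", "UPS_D"]
calls = ["UPS_A", "UPS_B", "UPS_C", "YEAST_1"]
result = benchmark_calls(calls, spiked)
print(result)
```

    Running the same function over the call lists from several workflows gives directly comparable sensitivity and false discovery figures, which is the kind of comparison the spiked design makes possible.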

  16. A complete mass spectrometric map for the analysis of the yeast proteome and its application to quantitative trait analysis

    PubMed Central

    Picotti, Paola; Clement-Ziza, Mathieu; Lam, Henry; Campbell, David S.; Schmidt, Alexander; Deutsch, Eric W.; Röst, Hannes; Sun, Zhi; Rinner, Oliver; Reiter, Lukas; Shen, Qin; Michaelson, Jacob J.; Frei, Andreas; Alberti, Simon; Kusebauch, Ulrike; Wollscheid, Bernd; Moritz, Robert; Beyer, Andreas; Aebersold, Ruedi

    2013-01-01

    Complete reference maps or datasets, like the genomic map of an organism, are highly beneficial tools for biological and biomedical research. Attempts to generate such reference datasets for a proteome so far failed to reach complete proteome coverage, with saturation apparent at approximately two thirds of the proteomes tested, even for the most thoroughly characterized proteomes. Here, we used a strategy based on high-throughput peptide synthesis and mass spectrometry to generate a close to complete reference map (97% of the genome-predicted proteins) of the S. cerevisiae proteome. We generated two versions of this mass spectrometric map one supporting discovery- (shotgun) and the other hypothesis-driven (targeted) proteomic measurements. The two versions of the map, therefore, constitute a complete set of proteomic assays to support most studies performed with contemporary proteomic technologies. The reference libraries can be browsed via a web-based repository and associated navigation tools. To demonstrate the utility of the reference libraries we applied them to a protein quantitative trait locus (pQTL) analysis, which requires measurement of the same peptides over a large number of samples with high precision. Protein measurements over a set of 78 S. cerevisiae strains revealed a complex relationship between independent genetic loci, impacting on the levels of related proteins. Our results suggest that selective pressure favors the acquisition of sets of polymorphisms that maintain the stoichiometry of protein complexes and pathways. PMID:23334424

  17. A pursuit of lineage-specific and niche-specific proteome features in the world of archaea

    PubMed Central

    2012-01-01

    Background: Archaea evoke interest among researchers for two enigmatic characteristics: a combination of bacterial and eukaryotic components in their molecular architectures and an enormous diversity in their lifestyle and metabolic capabilities. Despite considerable research efforts, lineage-specific/niche-specific molecular features of the whole archaeal world are yet to be fully unveiled. The study offers the first large-scale in silico proteome analysis of all archaeal species of known genome sequences with a special emphasis on methanogenic and sulphur-metabolising archaea. Results: Overall amino acid usage in archaea is dominated by GC-bias. However, environmental factors such as oxygen requirement or thermal adaptation seem to play important roles in selection of residues with no GC-bias at the codon level. All methanogens, irrespective of their thermal/salt adaptation, show higher usage of Cys and have relatively acidic proteomes, while the proteomes of sulphur-metabolisers have higher aromaticity and more positive charges. Despite exhibiting a thermophilic lifestyle, korarchaeota possesses an acidic proteome. Among the distinct trends prevailing in COGs (Clusters of Orthologous Groups of proteins) distribution profiles, crenarchaeal organisms display higher intra-order variations in COGs repertoire, especially in the metabolic ones, as compared to euryarchaea. All methanogens are characterised by the presence of 22 exclusive COGs. Conclusions: Divergences in amino acid usage, aromaticity/charge profiles and COG repertoire among methanogens and sulphur-metabolisers, aerobic and anaerobic archaea or korarchaeota and nanoarchaeota, as elucidated in the present study, point towards the presence of distinct molecular strategies for niche specialization in the archaeal world. PMID:22691113
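
    The sequence-derived properties compared in this record (Cys usage, aromaticity, charge) can be illustrated with a short sketch. The abstract does not give the exact definitions used in the study, so the ones below are assumptions: aromaticity as the fraction of Phe, Trp and Tyr residues, and charge as a simple per-residue count of Lys and Arg minus Asp and Glu, ignoring His and pH. The function names and the toy sequence are hypothetical.

```python
AROMATIC = frozenset("FWY")   # Phe, Trp, Tyr
POSITIVE = frozenset("KR")    # Lys, Arg
NEGATIVE = frozenset("DE")    # Asp, Glu


def residue_fraction(sequence, residues):
    """Fraction of positions in `sequence` occupied by any of `residues`."""
    if not sequence:
        raise ValueError("empty sequence")
    return sum(1 for aa in sequence if aa in residues) / len(sequence)


def proteome_features(sequences):
    """Pool all protein sequences and return simple composition features."""
    pooled = "".join(s.upper() for s in sequences)
    positive = residue_fraction(pooled, POSITIVE)
    negative = residue_fraction(pooled, NEGATIVE)
    return {
        "aromaticity": residue_fraction(pooled, AROMATIC),
        "cys_fraction": residue_fraction(pooled, frozenset("C")),
        # Negative values indicate a relatively acidic proteome.
        "net_charge_per_residue": positive - negative,
    }


# Toy example: a single made-up 8-residue sequence.
features = proteome_features(["MKDEFWYC"])
print(features)
```

    Applied per genome, such features could be tabulated against lineage or niche labels; any real analysis would also need the isoelectric point and codon-level GC calculations that the abstract alludes to but that are not sketched here.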

  18. A pursuit of lineage-specific and niche-specific proteome features in the world of archaea.

    PubMed

    Roy Chowdhury, Anindya; Dutta, Chitra

    2012-06-12

    Archaea evoke interest among researchers for two enigmatic characteristics: a combination of bacterial and eukaryotic components in their molecular architectures and an enormous diversity in their lifestyle and metabolic capabilities. Despite considerable research efforts, lineage-specific/niche-specific molecular features of the whole archaeal world are yet to be fully unveiled. The study offers the first large-scale in silico proteome analysis of all archaeal species of known genome sequences with a special emphasis on methanogenic and sulphur-metabolising archaea. Overall amino acid usage in archaea is dominated by GC-bias. However, environmental factors such as oxygen requirement or thermal adaptation seem to play important roles in selection of residues with no GC-bias at the codon level. All methanogens, irrespective of their thermal/salt adaptation, show higher usage of Cys and have relatively acidic proteomes, while the proteomes of sulphur-metabolisers have higher aromaticity and more positive charges. Despite exhibiting a thermophilic lifestyle, korarchaeota possesses an acidic proteome. Among the distinct trends prevailing in COGs (Clusters of Orthologous Groups of proteins) distribution profiles, crenarchaeal organisms display higher intra-order variations in COGs repertoire, especially in the metabolic ones, as compared to euryarchaea. All methanogens are characterised by the presence of 22 exclusive COGs. Divergences in amino acid usage, aromaticity/charge profiles and COG repertoire among methanogens and sulphur-metabolisers, aerobic and anaerobic archaea or korarchaeota and nanoarchaeota, as elucidated in the present study, point towards the presence of distinct molecular strategies for niche specialization in the archaeal world.

  19. Applications of mass spectrometry for quantitative protein analysis in formalin-fixed paraffin-embedded tissues

    PubMed Central

    Steiner, Carine; Ducret, Axel; Tille, Jean-Christophe; Thomas, Marlene; McKee, Thomas A; Rubbia-Brandt, Laura A; Scherl, Alexander; Lescuyer, Pierre; Cutler, Paul

    2014-01-01

    Proteomic analysis of tissues has advanced in recent years as instruments and methodologies have evolved. The ability to retrieve peptides from formalin-fixed paraffin-embedded tissues followed by shotgun or targeted proteomic analysis is offering new opportunities in biomedical research. In particular, access to large collections of clinically annotated samples should enable the detailed analysis of pathologically relevant tissues in a manner previously considered unfeasible. In this paper, we review the current status of proteomic analysis of formalin-fixed paraffin-embedded tissues with a particular focus on targeted approaches and the potential for this technique to be used in clinical research and clinical diagnosis. We also discuss the limitations and perspectives of the technique, particularly with regard to application in clinical diagnosis and drug discovery. PMID:24339433

  20. Large-scale gene function analysis with the PANTHER classification system.

    PubMed

    Mi, Huaiyu; Muruganujan, Anushya; Casagrande, John T; Thomas, Paul D

    2013-08-01

    The PANTHER (protein annotation through evolutionary relationship) classification system (http://www.pantherdb.org/) is a comprehensive system that combines gene function, ontology, pathways and statistical analysis tools that enable biologists to analyze large-scale, genome-wide data from sequencing, proteomics or gene expression experiments. The system is built with 82 complete genomes organized into gene families and subfamilies, and their evolutionary relationships are captured in phylogenetic trees, multiple sequence alignments and statistical models (hidden Markov models or HMMs). Genes are classified according to their function in several different ways: families and subfamilies are annotated with ontology terms (Gene Ontology (GO) and PANTHER protein class), and sequences are assigned to PANTHER pathways. The PANTHER website includes a suite of tools that enable users to browse and query gene functions, and to analyze large-scale experimental data with a number of statistical tests. It is widely used by bench scientists, bioinformaticians, computer scientists and systems biologists. In the 2013 release of PANTHER (v.8.0), in addition to an update of the data content, we redesigned the website interface to improve both user experience and the system's analytical capability. This protocol provides a detailed description of how to analyze genome-wide experimental data with the PANTHER classification system.
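
    The record above mentions statistical tests for analyzing large-scale gene lists without naming them. One common form of such a test is a one-sided binomial over-representation test, sketched below using only the Python standard library. This is an illustration of the general technique under that assumption, not the PANTHER website's implementation; the function name and the numbers are hypothetical.

```python
from math import comb


def binomial_overrep_pvalue(k, n, p):
    """Upper-tail binomial probability P(X >= k) for X ~ Binomial(n, p).

    k: genes in the uploaded list annotated to the category.
    n: total genes in the uploaded list.
    p: fraction of reference genes annotated to the category.
    """
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical example: 8 of 50 list genes fall in a category that covers
# 5% of the reference genome.
p_value = binomial_overrep_pvalue(8, 50, 0.05)
print(f"over-representation p-value = {p_value:.3g}")
```

    When many categories are tested at once, the resulting p-values would normally need a multiple-testing correction before interpretation.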

  1. Linking the proteins--elucidation of proteome-scale networks using mass spectrometry.

    PubMed

    Pflieger, Delphine; Gonnet, Florence; de la Fuente van Bentem, Sergio; Hirt, Heribert; de la Fuente, Alberto

    2011-01-01

    Proteomes are intricate. Typically, thousands of proteins interact through physical association and post-translational modifications (PTMs) to give rise to the emergent functions of cells. Understanding these functions requires one to study proteomes as "systems" rather than collections of individual protein molecules. The abstraction of the interacting proteome to "protein networks" has recently gained much attention, as networks are effective representations, that lose specific molecular details, but provide the ability to see the proteome as a whole. Mostly two aspects of the proteome have been represented by network models: proteome-wide physical protein-protein-binding interactions organized into Protein Interaction Networks (PINs), and proteome-wide PTM relations organized into Protein Signaling Networks (PSNs). Mass spectrometry (MS) techniques have been shown to be essential to reveal both of these aspects on a proteome-wide scale. Techniques such as affinity purification followed by MS have been used to elucidate protein-protein interactions, and MS-based quantitative phosphoproteomics is critical to understand the structure and dynamics of signaling through the proteome. We here review the current state-of-the-art MS-based analytical pipelines for the purpose to characterize proteome-scale networks. Copyright © 2010 Wiley Periodicals, Inc.

  2. Placental Proteomics: A Shortcut to Biological Insight

    PubMed Central

    Robinson, John M.; Vandré, Dale D.; Ackerman, William E.

    2012-01-01

    Proteomics analysis of biological samples has the potential to identify novel protein expression patterns and/or changes in protein expression patterns in different developmental or disease states. An important component of successful proteomics research, at least in its present form, is to reduce the complexity of the sample if it is derived from cells or tissues. One method to simplify complex tissues is to focus on a specific, highly purified sub-proteome. Using this approach we have developed methods to prepare highly enriched fractions of the apical plasma membrane of the syncytiotrophoblast. Through proteomics analysis of this fraction we have identified over five hundred proteins, several of which were previously not known to reside in the syncytiotrophoblast. Herein, we focus on two of these, dysferlin and myoferlin. These proteins, largely known from studies of skeletal muscle, may not have been found in the human placenta were it not for discovery-based proteomics analysis. This new knowledge, acquired through a discovery-driven approach, can now be applied for the generation of hypothesis-based experimentation. Thus discovery-based and hypothesis-based research are complementary approaches that when coupled together can hasten scientific discoveries. PMID:19070895

  3. Systematic Evaluation of the Use of Human Plasma and Serum for Mass-Spectrometry-Based Shotgun Proteomics.

    PubMed

    Lan, Jiayi; Núñez Galindo, Antonio; Doecke, James; Fowler, Christopher; Martins, Ralph N; Rainey-Smith, Stephanie R; Cominetti, Ornella; Dayon, Loïc

    2018-04-06

    Over the last two decades, EDTA-plasma has been used as the preferred sample matrix for human blood proteomic profiling. Serum has also been employed widely. Only a few studies have assessed the difference and relevance of the proteome profiles obtained from plasma samples, such as EDTA-plasma or lithium-heparin-plasma, and serum. A more complete evaluation of the use of EDTA-plasma, heparin-plasma, and serum would greatly expand the comprehensiveness of shotgun proteomics of blood samples. In this study, we evaluated the use of heparin-plasma with respect to EDTA-plasma and serum to profile blood proteomes using a scalable automated proteomic pipeline (ASAP²). The use of plasma and serum for mass-spectrometry-based shotgun proteomics was first tested with commercial pooled samples. The proteome coverage consistency and the quantitative performance were compared. Furthermore, protein measurements in EDTA-plasma and heparin-plasma samples were comparatively studied using matched sample pairs from 20 individuals from the Australian Imaging, Biomarkers and Lifestyle (AIBL) Study. We identified 442 proteins in common between EDTA-plasma and heparin-plasma samples. Overall agreement of the relative protein quantification between the sample pairs demonstrated that shotgun proteomics using workflows such as the ASAP² is suitable in analyzing heparin-plasma and that such sample type may be considered in large-scale clinical research studies. Moreover, the partial proteome coverage overlaps (e.g., ∼70%) showed that measures from heparin-plasma could be complementary to those obtained from EDTA-plasma.

  4. Inconsistencies in the red blood cell membrane proteome analysis: generation of a database for research and diagnostic applications

    PubMed Central

    Hegedűs, Tamás; Chaubey, Pururawa Mayank; Várady, György; Szabó, Edit; Sarankó, Hajnalka; Hofstetter, Lia; Roschitzki, Bernd; Sarkadi, Balázs

    2015-01-01

    Based on recent results, the determination of the easily accessible red blood cell (RBC) membrane proteins may provide new diagnostic possibilities for assessing mutations, polymorphisms or regulatory alterations in diseases. However, the analysis of the current mass spectrometry-based proteomics datasets and other major databases indicates inconsistencies: the results show large scattering and only a limited overlap for the identified RBC membrane proteins. Here, we applied membrane-specific proteomics studies in human RBC, compared these results with the data in the literature, and generated a comprehensive and expandable database using all available data sources. The integrated web database now refers to proteomic, genetic and medical databases as well, and contains an unexpectedly large number of validated membrane proteins previously thought to be specific for other tissues and/or related to major human diseases. Since the determination of protein expression in RBC provides a method to indicate pathological alterations, our database should facilitate the development of RBC membrane biomarker platforms and provide a unique resource to aid related further research and diagnostics. Database URL: http://rbcc.hegelab.org PMID:26078478

  5. The amino acid's backup bone - storage solutions for proteomics facilities.

    PubMed

    Meckel, Hagen; Stephan, Christian; Bunse, Christian; Krafzik, Michael; Reher, Christopher; Kohl, Michael; Meyer, Helmut Erich; Eisenacher, Martin

    2014-01-01

    Proteomics methods, especially high-throughput mass spectrometry analysis have been continually developed and improved over the years. The analysis of complex biological samples produces large volumes of raw data. Data storage and recovery management pose substantial challenges to biomedical or proteomic facilities regarding backup and archiving concepts as well as hardware requirements. In this article we describe differences between the terms backup and archive with regard to manual and automatic approaches. We also introduce different storage concepts and technologies from transportable media to professional solutions such as redundant array of independent disks (RAID) systems, network attached storages (NAS) and storage area network (SAN). Moreover, we present a software solution, which we developed for the purpose of long-term preservation of large mass spectrometry raw data files on an object storage device (OSD) archiving system. Finally, advantages, disadvantages, and experiences from routine operations of the presented concepts and technologies are evaluated and discussed. This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan. Copyright © 2013. Published by Elsevier B.V.

  6. Proteome-wide search for functional motifs altered in tumors: Prediction of nuclear export signals inactivated by cancer-related mutations

    PubMed Central

    Prieto, Gorka; Fullaondo, Asier; Rodríguez, Jose A.

    2016-01-01

    Large-scale sequencing projects are uncovering a growing number of missense mutations in human tumors. Understanding the phenotypic consequences of these alterations represents a formidable challenge. In silico prediction of functionally relevant amino acid motifs disrupted by cancer mutations could provide insight into the potential impact of a mutation, and guide functional tests. We have previously described Wregex, a tool for the identification of potential functional motifs, such as nuclear export signals (NESs), in proteins. Here, we present an improved version that allows motif prediction to be combined with data from large repositories, such as the Catalogue of Somatic Mutations in Cancer (COSMIC), and to be applied to a whole proteome scale. As an example, we have searched the human proteome for candidate NES motifs that could be altered by cancer-related mutations included in the COSMIC database. A subset of the candidate NESs identified was experimentally tested using an in vivo nuclear export assay. A significant proportion of the selected motifs exhibited nuclear export activity, which was abrogated by the COSMIC mutations. In addition, our search identified a cancer mutation that inactivates the NES of the human deubiquitinase USP21, and leads to the aberrant accumulation of this protein in the nucleus. PMID:27174732
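    The idea of scanning sequences for a motif and then asking whether a missense mutation abolishes the match can be sketched as follows. This is only an illustration: the pattern is a simplified textbook-style hydrophobic consensus (Φ-x(2,3)-Φ-x(2,3)-Φ-x-Φ with Φ = L, I, V, F or M), not the patterns or scoring actually used by Wregex, and the sequence, function names and mutation positions are made up for the example.

```python
import re

# Simplified hydrophobic NES-like consensus (an assumption for illustration,
# not the Wregex motif definition).
NES_PATTERN = re.compile(r"[LIVFM].{2,3}[LIVFM].{2,3}[LIVFM].[LIVFM]")
# Lookahead wrapper so that overlapping candidates are also reported.
_OVERLAPPING = re.compile(r"(?=(" + NES_PATTERN.pattern + r"))")


def find_motifs(seq):
    """Return (start, end, text) for every candidate motif, 0-based, end exclusive."""
    return [(m.start(1), m.end(1), m.group(1)) for m in _OVERLAPPING.finditer(seq)]


def apply_missense(seq, position, new_residue):
    """Substitute one residue; `position` is 1-based, as in mutation catalogues."""
    return seq[:position - 1] + new_residue + seq[position:]


def disrupted_motifs(seq, position, new_residue):
    """List wild-type motifs that cover the mutated residue and no longer match."""
    mutant = apply_missense(seq, position, new_residue)
    index = position - 1
    lost = []
    for start, end, text in find_motifs(seq):
        covers_site = start <= index < end
        if covers_site and not NES_PATTERN.fullmatch(mutant[start:end]):
            lost.append((start, end, text))
    return lost


if __name__ == "__main__":
    toy = "MSNELALKLAGLDINKT"  # hypothetical toy sequence
    print(find_motifs(toy))
    print(disrupted_motifs(toy, 12, "A"))  # hydrophobic position replaced by Ala
    print(disrupted_motifs(toy, 12, "I"))  # conservative change keeps the match
```

    A real screen would add a motif score and experimental validation, as the abstract describes; the sketch only shows the match/no-match comparison between wild-type and mutant windows.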

  7. The HUPO PSI's molecular interaction format--a community standard for the representation of protein interaction data.

    PubMed

    Hermjakob, Henning; Montecchi-Palazzi, Luisa; Bader, Gary; Wojcik, Jérôme; Salwinski, Lukasz; Ceol, Arnaud; Moore, Susan; Orchard, Sandra; Sarkans, Ugis; von Mering, Christian; Roechert, Bernd; Poux, Sylvain; Jung, Eva; Mersch, Henning; Kersey, Paul; Lappe, Michael; Li, Yixue; Zeng, Rong; Rana, Debashis; Nikolski, Macha; Husi, Holger; Brun, Christine; Shanker, K; Grant, Seth G N; Sander, Chris; Bork, Peer; Zhu, Weimin; Pandey, Akhilesh; Brazma, Alvis; Jacq, Bernard; Vidal, Marc; Sherman, David; Legrain, Pierre; Cesareni, Gianni; Xenarios, Ioannis; Eisenberg, David; Steipe, Boris; Hogue, Chris; Apweiler, Rolf

    2004-02-01

    A major goal of proteomics is the complete description of the protein interaction network underlying cell physiology. A large number of small scale and, more recently, large-scale experiments have contributed to expanding our understanding of the nature of the interaction network. However, the necessary data integration across experiments is currently hampered by the fragmentation of publicly available protein interaction data, which exists in different formats in databases, on authors' websites or sometimes only in print publications. Here, we propose a community standard data model for the representation and exchange of protein interaction data. This data model has been jointly developed by members of the Proteomics Standards Initiative (PSI), a work group of the Human Proteome Organization (HUPO), and is supported by major protein interaction data providers, in particular the Biomolecular Interaction Network Database (BIND), Cellzome (Heidelberg, Germany), the Database of Interacting Proteins (DIP), Dana Farber Cancer Institute (Boston, MA, USA), the Human Protein Reference Database (HPRD), Hybrigenics (Paris, France), the European Bioinformatics Institute's (EMBL-EBI, Hinxton, UK) IntAct, the Molecular Interactions (MINT, Rome, Italy) database, the Protein-Protein Interaction Database (PPID, Edinburgh, UK) and the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING, EMBL, Heidelberg, Germany).

  8. Substrate-Mediated Laser Ablation under Ambient Conditions for Spatially-Resolved Tissue Proteomics

    PubMed Central

    Fatou, Benoit; Wisztorski, Maxence; Focsa, Cristian; Salzet, Michel; Ziskind, Michael; Fournier, Isabelle

    2015-01-01

    Numerous applications of ambient Mass Spectrometry (MS) have been demonstrated over the past decade. They promoted the emergence of various micro-sampling techniques such as Laser Ablation/Droplet Capture (LADC). LADC consists in the ablation of analytes from a surface and their subsequent capture in a solvent droplet which can then be analyzed by MS. LADC is thus generally performed in the UV or IR range, using a wavelength at which analytes or the matrix absorb. In this work, we explore the potential of visible range LADC (532 nm) as a micro-sampling technology for large-scale proteomics analyses. We demonstrate that biomolecule analyses using 532 nm LADC are possible, despite the low absorbance of biomolecules at this wavelength. This is due to the preponderance of an indirect substrate-mediated ablation mechanism at low laser energy which contrasts with the conventional direct ablation driven by sample absorption. Using our custom LADC system and taking advantage of this substrate-mediated ablation mechanism, we were able to perform large-scale proteomic analyses of micro-sampled tissue sections and demonstrated the possible identification of proteins with relevant biological functions. Consequently, the 532 nm LADC technique offers a new tool for biological and clinical applications. PMID:26674367

  9. Large Scale Proteomic Data and Network-Based Systems Biology Approaches to Explore the Plant World.

    PubMed

    Di Silvestre, Dario; Bergamaschi, Andrea; Bellini, Edoardo; Mauri, PierLuigi

    2018-06-03

    The investigation of plant organisms by means of data-derived systems biology approaches based on network modeling is mainly characterized by genomic data, while the potential of proteomics is largely unexplored. This delay is mainly caused by the paucity of plant genomic/proteomic sequences and annotations which are fundamental to perform mass-spectrometry (MS) data interpretation. However, Next Generation Sequencing (NGS) techniques are contributing to filling this gap and an increasing number of studies are focusing on plant proteome profiling and protein-protein interactions (PPIs) identification. Interesting results were obtained by evaluating the topology of PPI networks in the context of organ-associated biological processes as well as plant-pathogen relationships. These examples foreshadow well the benefits that these approaches may provide to plant research. Thus, in addition to providing an overview of the main -omic technologies recently used on plant organisms, we will focus on studies that rely on concepts of module, hub and shortest path, and how they can contribute to the plant discovery processes. In this scenario, we will also consider gene co-expression networks, and some examples of integration with metabolomic data and genome-wide association studies (GWAS) to select candidate genes will be mentioned.

  10. Analysis of high accuracy, quantitative proteomics data in the MaxQB database.

    PubMed

    Schaab, Christoph; Geiger, Tamar; Stoehr, Gabriele; Cox, Juergen; Mann, Matthias

    2012-03-01

    MS-based proteomics generates rapidly increasing amounts of precise and quantitative information. Analysis of individual proteomic experiments has made great strides, but the crucial ability to compare and store information across different proteome measurements still presents many challenges. For example, it has been difficult to avoid contamination of databases with low quality peptide identifications, to control for the inflation in false positive identifications when combining data sets, and to integrate quantitative data. Although, for example, the contamination with low quality identifications has been addressed by joint analysis of deposited raw data in some public repositories, we reasoned that there should be a role for a database specifically designed for high resolution and quantitative data. Here we describe a novel database termed MaxQB that stores and displays collections of large proteomics projects and allows joint analysis and comparison. We demonstrate the analysis tools of MaxQB using proteome data of 11 different human cell lines and 28 mouse tissues. The database-wide false discovery rate is controlled by adjusting the project specific cutoff scores for the combined data sets. The 11 cell line proteomes together identify proteins expressed from more than half of all human genes. For each protein of interest, expression levels estimated by label-free quantification can be visualized across the cell lines. Similarly, the expression rank order and estimated amount of each protein within each proteome are plotted. We used MaxQB to calculate the signal reproducibility of the detected peptides for the same proteins across different proteomes. Spearman rank correlation between peptide intensity and detection probability of identified proteins was greater than 0.8 for 64% of the proteome, whereas a minority of proteins have negative correlation. 
    This information can be used to pinpoint false protein identifications, independently of peptide database scores. The information contained in MaxQB, including high resolution fragment spectra, is accessible to the community via a user-friendly web interface at http://www.biochem.mpg.de/maxqb.
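    The Spearman rank correlation mentioned in this abstract can be illustrated with a small standard-library sketch. It assumes two equal-length lists per protein (peptide intensities and detection probabilities); the function names and the sample numbers are hypothetical and are not taken from MaxQB.

```python
from statistics import mean


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    intensity = [1.0e6, 5.0e5, 2.0e5, 8.0e4]   # made-up peptide intensities
    detection = [0.9, 0.7, 0.6, 0.2]           # made-up detection probabilities
    print(spearman(intensity, detection))
```

    Under this reading, a protein whose peptides show a low or negative coefficient would be a candidate for closer inspection, in line with the abstract's use of the correlation to flag questionable identifications.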

  11. Proteomics data repositories

    PubMed Central

    Riffle, Michael; Eng, Jimmy K.

    2010-01-01

    The field of proteomics, particularly the application of mass spectrometry analysis to protein samples, is well-established and growing rapidly. Proteomics studies generate large volumes of raw experimental data and inferred biological results. To facilitate the dissemination of these data, centralized data repositories have been developed that make the data and results accessible to proteomics researchers and biologists alike. This review of proteomics data repositories focuses exclusively on freely-available, centralized data resources that disseminate or store experimental mass spectrometry data and results. The resources chosen reflect a current “snapshot” of the state of resources available with an emphasis placed on resources that may be of particular interest to yeast researchers. Resources are described in terms of their intended purpose and the features and functionality provided to users. PMID:19795424

  12. Direct digestion of proteins in living cells into peptides for proteomic analysis.

    PubMed

    Chen, Qi; Yan, Guoquan; Gao, Mingxia; Zhang, Xiangmin

    2015-01-01

    To analyze the proteome of an extremely low number of cells or even a single cell, we established a new method of digesting whole cells into mass-spectrometry-identifiable peptides in a single step within 2 h. Our sampling method greatly simplified the processes of cell lysis, protein extraction, protein purification, and overnight digestion, without compromising efficiency. We used our method to digest hundred-scale cells. As far as we know, there is no report of proteome analysis starting directly with as few as 100 cells. We identified an average of 109 proteins from 100 cells, and with three replicates, the number of proteins rose to 204. Good reproducibility was achieved, showing stability and reliability of the method. Gene Ontology analysis revealed that proteins in different cellular compartments were well represented.

  13. MaRaCluster: A Fragment Rarity Metric for Clustering Fragment Spectra in Shotgun Proteomics.

    PubMed

    The, Matthew; Käll, Lukas

    2016-03-04

    Shotgun proteomics experiments generate large amounts of fragment spectra as primary data, normally with high redundancy between and within experiments. Here, we have devised a clustering technique to identify fragment spectra stemming from the same species of peptide. This is a powerful alternative method to traditional search engines for analyzing spectra, specifically useful for larger scale mass spectrometry studies. As an aid in this process, we propose a distance calculation relying on the rarity of experimental fragment peaks, following the intuition that peaks shared by only a few spectra offer more evidence than peaks shared by a large number of spectra. We used this distance calculation and a complete-linkage scheme to cluster data from a recent large-scale mass spectrometry-based study. The clusterings produced by our method have up to 40% more identified peptides for their consensus spectra compared to those produced by the previous state-of-the-art method. We see that our method would advance the construction of spectral libraries as well as serve as a tool for mining large sets of fragment spectra. The source code and Ubuntu binary packages are available at https://github.com/statisticalbiotechnology/maracluster (under an Apache 2.0 license).
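    The intuition behind a rarity-based distance combined with complete-linkage clustering can be sketched in a few lines. This is a toy stand-in under stated assumptions, not the MaRaCluster metric: spectra are reduced to sets of binned peak positions, each peak is weighted by the negative log of the fraction of spectra containing it, and the distance is one minus the weighted overlap. All names, the threshold and the example peaks are hypothetical.

```python
import math
from itertools import combinations


def peak_weights(spectra):
    """Weight each peak by -log(fraction of spectra containing it); rare = heavy."""
    n = len(spectra)
    counts = {}
    for peaks in spectra:
        for p in peaks:
            counts[p] = counts.get(p, 0) + 1
    return {p: -math.log(c / n) for p, c in counts.items()}


def rarity_distance(a, b, weights):
    """1 - (weight of shared peaks / weight of all peaks in either spectrum)."""
    total = sum(weights[p] for p in a | b)
    if total == 0:
        return 1.0  # only ubiquitous peaks: no evidence that the spectra match
    shared = sum(weights[p] for p in a & b)
    return 1.0 - shared / total


def complete_linkage(spectra, threshold):
    """Merge clusters while the largest pairwise distance stays within threshold."""
    weights = peak_weights(spectra)
    dist = {
        frozenset((i, j)): rarity_distance(spectra[i], spectra[j], weights)
        for i, j in combinations(range(len(spectra)), 2)
    }
    clusters = [[i] for i in range(len(spectra))]
    while len(clusters) > 1:
        best = None
        for x, y in combinations(range(len(clusters)), 2):
            # Complete linkage: cluster distance is the maximum member distance.
            d = max(dist[frozenset((i, j))] for i in clusters[x] for j in clusters[y])
            if best is None or d < best[0]:
                best = (d, x, y)
        if best[0] > threshold:
            break
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    toy_spectra = [
        {100, 200, 300, 400},
        {100, 200, 300, 401},
        {100, 500, 600, 700},
        {100, 500, 600, 701},
    ]
    print(complete_linkage(toy_spectra, threshold=0.8))
```

    In the toy data the peak at 100 occurs in every spectrum and therefore carries zero weight, so it cannot pull unrelated spectra together; only the rarer shared peaks drive the grouping.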

  14. CoreFlow: a computational platform for integration, analysis and modeling of complex biological data.

    PubMed

    Pasculescu, Adrian; Schoof, Erwin M; Creixell, Pau; Zheng, Yong; Olhovsky, Marina; Tian, Ruijun; So, Jonathan; Vanderlaan, Rachel D; Pawson, Tony; Linding, Rune; Colwill, Karen

    2014-04-04

    A major challenge in mass spectrometry and other large-scale applications is how to handle, integrate, and model the data that is produced. Given the speed at which technology advances and the need to keep pace with biological experiments, we designed a computational platform, CoreFlow, which provides programmers with a framework to manage data in real-time. It allows users to upload data into a relational database (MySQL), and to create custom scripts in high-level languages such as R, Python, or Perl for processing, correcting and modeling this data. CoreFlow organizes these scripts into project-specific pipelines, tracks interdependencies between related tasks, and enables the generation of summary reports as well as publication-quality images. As a result, the gap between experimental and computational components of a typical large-scale biology project is reduced, decreasing the time between data generation, analysis and manuscript writing. CoreFlow is being released to the scientific community as an open-sourced software package complete with proteomics-specific examples, which include corrections for incomplete isotopic labeling of peptides (SILAC) or arginine-to-proline conversion, and modeling of multiple/selected reaction monitoring (MRM/SRM) results. CoreFlow was purposely designed as an environment for programmers to rapidly perform data analysis. These analyses are assembled into project-specific workflows that are readily shared with biologists to guide the next stages of experimentation. Its simple yet powerful interface provides a structure where scripts can be written and tested virtually simultaneously to shorten the life cycle of code development for a particular task. The scripts are exposed at every step so that a user can quickly see the relationships between the data, the assumptions that have been made, and the manipulations that have been performed. 
    Since the scripts use commonly available programming languages, they can easily be transferred to and from other computational environments for debugging or faster processing. This focus on 'on the fly' analysis sets CoreFlow apart from other workflow applications that require wrapping of scripts into particular formats and development of specific user interfaces. Importantly, current and future releases of data analysis scripts in CoreFlow format will be of widespread benefit to the proteomics community, not only for uptake and use in individual labs, but to enable full scrutiny of all analysis steps, thus increasing experimental reproducibility and decreasing errors. This article is part of a Special Issue entitled: Can Proteomics Fill the Gap Between Genomics and Phenotypes? Copyright © 2014 Elsevier B.V. All rights reserved.

  15. The Skeleton Forming Proteome of an Early Branching Metazoan: A Molecular Survey of the Biomineralization Components Employed by the Coralline Sponge Vaceletia Sp.

    PubMed Central

    Wörheide, Gert; Jackson, Daniel John

    2015-01-01

    The ability to construct a mineralized skeleton was a major innovation for the Metazoa during their evolution in the late Precambrian/early Cambrian. Porifera (sponges) hold an informative position for efforts aimed at unraveling the origins of this ability because they are widely regarded to be the earliest branching metazoans, and are among the first multi-cellular animals to display the ability to biomineralize in the fossil record. Very few biomineralization associated proteins have been identified in sponges so far, with no transcriptome or proteome scale surveys yet available. In order to understand what genetic repertoire may have been present in the last common ancestor of the Metazoa (LCAM), and that may have contributed to the evolution of the ability to biocalcify, we have studied the skeletal proteome of the coralline demosponge Vaceletia sp. and compare this to other metazoan biomineralizing proteomes. We bring some spatial resolution to this analysis by dividing Vaceletia’s aragonitic calcium carbonate skeleton into “head” and “stalk” regions. With our approach we were able to identify 40 proteins from both the head and stalk regions, with many of these sharing some similarity to previously identified gene products from other organisms. Among these proteins are known biomineralization compounds, such as carbonic anhydrase, spherulin, extracellular matrix proteins and very acidic proteins. This report provides the first proteome scale analysis of a calcified poriferan skeletal proteome, and its composition clearly demonstrates that the LCAM contributed several key enzymes and matrix proteins to its descendants that supported the metazoan ability to biocalcify. However, lineage specific evolution is also likely to have contributed significantly to the ability of disparate metazoan lineages to biocalcify. PMID:26536128

  16. Comparative proteomic analysis of differentially expressed proteins in β-aminobutyric acid enhanced Arabidopsis thaliana tolerance to simulated acid rain.

    PubMed

    Liu, Tingwu; Jiang, Xinwu; Shi, Wuliang; Chen, Juan; Pei, Zhenming; Zheng, Hailei

    2011-05-01

    Acid rain is a worldwide environmental issue that has seriously destroyed forest ecosystems. As a highly effective and broad-spectrum plant resistance-inducing agent, β-aminobutyric acid could elevate the tolerance of Arabidopsis when subjected to simulated acid rain. Using comparative proteomic strategies, we analyzed 203 significantly varied proteins of which 175 proteins were identified responding to β-aminobutyric acid in the absence and presence of simulated acid rain. They could be divided into ten groups according to their biological functions. Among them, the majority was cell rescue, development and defense-related proteins, followed by transcription, protein synthesis, folding, modification and destination-associated proteins. Our conclusion is β-aminobutyric acid can lead to a large-scale primary metabolism change and simultaneously activate antioxidant system and salicylic acid, jasmonic acid, abscisic acid signaling pathways. In addition, β-aminobutyric acid can reinforce physical barriers to defend simulated acid rain stress. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  17. Tissue proteomics of the low-molecular weight proteome using an integrated cLC-ESI-QTOFMS approach.

    PubMed

    Alvarez, MeiHwa Tanielle Bench; Shah, Dipti Jigar; Thulin, Craig D; Graves, Steven W

    2013-05-01

    Analysis of the protein/peptide composition of tissue has provided meaningful insights into tissue biology and even disease mechanisms. However, little has been published regarding top down methods to investigate lower molecular weight (MW) (500-5000 Da) species in tissue. Here, we evaluate a tissue proteomics approach involving tissue homogenization followed by depletion of large proteins and then cLC-MS (where c stands for capillary) analysis to interrogate the low MW/low abundance tissue proteome. In the development of this method, sheep heart, lung, liver, kidney, and spleen were surveyed to test our ability to observe tissue differences. After categorical tissue differences were demonstrated, a detailed study of this method's reproducibility was undertaken to determine whether or not it is suitable for analyzing more subtle differences in the abundance of small proteins and peptides. Our results suggest that this method should be useful in exploring the low MW proteome of tissues. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  18. Solid-Phase Extraction Strategies to Surmount Body Fluid Sample Complexity in High-Throughput Mass Spectrometry-Based Proteomics

    PubMed Central

    Bladergroen, Marco R.; van der Burgt, Yuri E. M.

    2015-01-01

    For large-scale and standardized applications in mass spectrometry- (MS-) based proteomics automation of each step is essential. Here we present high-throughput sample preparation solutions for balancing the speed of current MS-acquisitions and the time needed for analytical workup of body fluids. The discussed workflows reduce body fluid sample complexity and apply for both bottom-up proteomics experiments and top-down protein characterization approaches. Various sample preparation methods that involve solid-phase extraction (SPE) including affinity enrichment strategies have been automated. Obtained peptide and protein fractions can be mass analyzed by direct infusion into an electrospray ionization (ESI) source or by means of matrix-assisted laser desorption ionization (MALDI) without further need of time-consuming liquid chromatography (LC) separations. PMID:25692071

  19. Recent Achievements in Characterizing the Histone Code and Approaches to Integrating Epigenomics and Systems Biology.

    PubMed

    Janssen, K A; Sidoli, S; Garcia, B A

    2017-01-01

    Functional epigenetic regulation occurs by dynamic modification of chromatin, including genetic material (i.e., DNA methylation), histone proteins, and other nuclear proteins. Due to the highly complex nature of the histone code, mass spectrometry (MS) has become the leading technique in identification of single and combinatorial histone modifications. MS has now overcome antibody-based strategies due to its automation, high resolution, and accurate quantitation. Moreover, multiple approaches to analysis have been developed for global quantitation of posttranslational modifications (PTMs), including large-scale characterization of modification coexistence (middle-down and top-down proteomics), which is not currently possible with any other biochemical strategy. Recently, our group and others have simplified and increased the effectiveness of analyzing histone PTMs by improving multiple MS methods and data analysis tools. This review provides an overview of the major achievements in the analysis of histone PTMs using MS with a focus on the most recent improvements. We speculate that the workflow for histone analysis at its state of the art is highly reliable in terms of identification and quantitation accuracy, and it has the potential to become a routine method for systems biology thanks to the possibility of integrating histone MS results with genomics and proteomics datasets. © 2017 Elsevier Inc. All rights reserved.

  20. ProteomeVis: a web app for exploration of protein properties from structure to sequence evolution across organisms' proteomes.

    PubMed

    Razban, Rostam M; Gilson, Amy I; Durfee, Niamh; Strobelt, Hendrik; Dinkla, Kasper; Choi, Jeong-Mo; Pfister, Hanspeter; Shakhnovich, Eugene I

    2018-05-08

    Protein evolution spans time scales and its effects span the length of an organism. A web app named ProteomeVis is developed to provide a comprehensive view of protein evolution in the S. cerevisiae and E. coli proteomes. ProteomeVis interactively creates protein chain graphs, where edges between nodes represent structure and sequence similarities within user-defined ranges, to study the long time scale effects of protein structure evolution. The short time scale effects of protein sequence evolution are studied by sequence evolutionary rate (ER) correlation analyses with protein properties that span from the molecular to the organismal level. We demonstrate the utility and versatility of ProteomeVis by investigating the distribution of edges per node in organismal protein chain universe graphs (oPCUGs) and putative ER determinants. S. cerevisiae and E. coli oPCUGs are scale-free with scaling constants of 1.79 and 1.56, respectively. Both scaling constants can be explained by a previously reported theoretical model describing protein structure evolution (Dokholyan et al., 2002). Protein abundance most strongly correlates with ER among properties in ProteomeVis, with Spearman correlations of -0.49 (p-value < 10⁻¹⁰) and -0.46 (p-value < 10⁻¹⁰) for S. cerevisiae and E. coli, respectively. This result is consistent with previous reports that found protein expression to be the most important ER determinant (Zhang and Yang, 2015). ProteomeVis is freely accessible at http://proteomevis.chem.harvard.edu. Supplementary data are available at Bioinformatics. Contact: shakhnovich@chemistry.harvard.edu.

  1. Mechanistic evaluation of primary human hepatocyte culture using global proteomic analysis reveals a selective dedifferentiation profile.

    PubMed

    Heslop, James A; Rowe, Cliff; Walsh, Joanne; Sison-Young, Rowena; Jenkins, Roz; Kamalian, Laleh; Kia, Richard; Hay, David; Jones, Robert P; Malik, Hassan Z; Fenwick, Stephen; Chadwick, Amy E; Mills, John; Kitteringham, Neil R; Goldring, Chris E P; Kevin Park, B

    2017-01-01

    The application of primary human hepatocytes following isolation from human tissue is well accepted to be compromised by the process of dedifferentiation. This phenomenon reduces many unique hepatocyte functions, limiting their use in drug disposition and toxicity assessment. The aetiology of dedifferentiation has not been well defined, and further understanding of the process would allow the development of novel strategies for sustaining the hepatocyte phenotype in culture or for improving protocols for maturation of hepatocytes generated from stem cells. We have therefore carried out the first proteomic comparison of primary human hepatocyte differentiation. Cells were cultured for 0, 24, 72 and 168 h as a monolayer in order to permit unrestricted hepatocyte dedifferentiation, so as to reveal the causative signalling pathways and factors in this process, by pathway analysis. A total of 3430 proteins were identified with a false detection rate of <1 %, of which 1117 were quantified at every time point. Increasing numbers of significantly differentially expressed proteins compared with the freshly isolated cells were observed at 24 h (40 proteins), 72 h (118 proteins) and 168 h (272 proteins) (p < 0.05). In particular, cytochromes P450 and mitochondrial proteins underwent major changes, confirmed by functional studies and investigated by pathway analysis. We report the key factors and pathways which underlie the loss of hepatic phenotype in vitro, particularly those driving the large-scale and selective remodelling of the mitochondrial and metabolic proteomes. In summary, these findings expand the current understanding of dedifferentiation and should facilitate further development of simple and complex hepatic culture systems.

  2. The membrane proteome of Medicago truncatula roots displays qualitative and quantitative changes in response to arbuscular mycorrhizal symbiosis.

    PubMed

    Abdallah, Cosette; Valot, Benoit; Guillier, Christelle; Mounier, Arnaud; Balliau, Thierry; Zivy, Michel; van Tuinen, Diederik; Renaut, Jenny; Wipf, Daniel; Dumas-Gaudot, Eliane; Recorbet, Ghislaine

    2014-08-28

    Arbuscular mycorrhizal (AM) symbiosis that associates roots of most land plants with soil-borne fungi (Glomeromycota), is characterized by reciprocal nutritional benefits. Fungal colonization of plant roots induces massive changes in cortical cells where the fungus differentiates an arbuscule, which drives proliferation of the plasma membrane. Despite the recognized importance of membrane proteins in sustaining AM symbiosis, the root microsomal proteome elicited upon mycorrhiza still remains to be explored. In this study, we first examined the qualitative composition of the root membrane proteome of Medicago truncatula after microsome enrichment and subsequent in depth analysis by GeLC-MS/MS. The results obtained highlighted the identification of 1226 root membrane protein candidates whose cellular and functional classifications predispose plastids and protein synthesis as prevalent organelle and function, respectively. Changes at the protein abundance level between the membrane proteomes of mycorrhizal and nonmycorrhizal roots were further monitored by spectral counting, which retrieved a total of 96 proteins that displayed a differential accumulation upon AM symbiosis. Besides the canonical markers of the periarbuscular membrane, new candidates supporting the importance of membrane trafficking events during mycorrhiza establishment/functioning were identified, including flotillin-like proteins. The data have been deposited to the ProteomeXchange with identifier PXD000875. During arbuscular mycorrhizal symbiosis, one of the most widespread mutualistic associations in nature, the endomembrane system of plant roots is believed to undergo qualitative and quantitative changes in order to sustain both the accommodation process of the AM fungus within cortical cells and the exchange of nutrients between symbionts. Large-scale GeLC-MS/MS proteomic analysis of the membrane fractions from mycorrhizal and nonmycorrhizal roots of M. truncatula coupled to spectral counting retrieved around one hundred proteins that displayed changes in abundance upon mycorrhizal establishment. The symbiosis-related membrane proteins that were identified mostly function in signaling/membrane trafficking and nutrient uptake regulation. Besides extending the coverage of the root membrane proteome of M. truncatula, new candidates involved in the symbiotic program emerged from the current study, which pointed out a dynamic reorganization of microsomal proteins during the accommodation of AM fungi within cortical cells. Copyright © 2014 Elsevier B.V. All rights reserved.

  3. pyGeno: A Python package for precision medicine and proteogenomics.

    PubMed

    Daouda, Tariq; Perreault, Claude; Lemieux, Sébastien

    2016-01-01

    pyGeno is a Python package mainly intended for precision medicine applications that revolve around genomics and proteomics. It integrates reference sequences and annotations from Ensembl, genomic polymorphisms from the dbSNP database and data from next-gen sequencing into an easy to use, memory-efficient and fast framework, therefore allowing the user to easily explore subject-specific genomes and proteomes. Compared to a standalone program, pyGeno gives the user access to the complete expressivity of Python, a general programming language. Its range of application therefore encompasses both short scripts and large scale genome-wide studies.

  4. pyGeno: A Python package for precision medicine and proteogenomics

    PubMed Central

    Daouda, Tariq; Perreault, Claude; Lemieux, Sébastien

    2016-01-01

    pyGeno is a Python package mainly intended for precision medicine applications that revolve around genomics and proteomics. It integrates reference sequences and annotations from Ensembl, genomic polymorphisms from the dbSNP database and data from next-gen sequencing into an easy to use, memory-efficient and fast framework, therefore allowing the user to easily explore subject-specific genomes and proteomes. Compared to a standalone program, pyGeno gives the user access to the complete expressivity of Python, a general programming language. Its range of application therefore encompasses both short scripts and large scale genome-wide studies. PMID:27785359

  5. A Comprehensive Proteomics Analysis of the Human Iris Tissue: Ready to Embrace Postgenomics Precision Medicine in Ophthalmology?

    PubMed

    Murthy, Krishna R; Dammalli, Manjunath; Pinto, Sneha M; Murthy, Kalpana Babu; Nirujogi, Raja Sekhar; Madugundu, Anil K; Dey, Gourav; Subbannayya, Yashwanth; Mishra, Uttam Kumar; Nair, Bipin; Gowda, Harsha; Prasad, T S Keshava

    2016-09-01

    The annual economic burden of visual disorders in the United States was estimated at $139 billion. Ophthalmology is therefore one of the salient application fields of postgenomics biotechnologies such as proteomics in the pursuit of global precision medicine. Interestingly, the protein composition of the human iris tissue still remains largely unexplored. In this context, the uveal tract constitutes the vascular middle coat of the eye and is formed by the choroid, ciliary body, and iris. The iris forms the anterior most part of the uvea. It is a thin muscular diaphragm with a central perforation called the pupil. Inflammation of the uvea is termed uveitis and causes reduced vision or blindness. However, the pathogenesis of the spectrum of diseases causing uveitis is still not very well understood. We investigated the proteome of the iris tissue harvested from healthy donor eyes that were enucleated within 6 h of death using high-resolution Fourier transform mass spectrometry. A total of 4959 nonredundant proteins were identified in the human iris, which included proteins involved in signaling, cell communication, metabolism, immune response, and transport. This study is the first attempt to comprehensively profile the global proteome of the human iris tissue and, thus, offers the potential to facilitate biomedical research into pathological diseases of the uvea such as Behcet's disease, Vogt-Koyanagi-Harada disease, and juvenile rheumatoid arthritis. Finally, we make a call to the broader visual health and ophthalmology community that proteomics offers a veritable prospect to obtain a systems scale, functional, and dynamic picture of the eye tissue in health and disease. This knowledge is ultimately pertinent for precision medicine diagnostics and therapeutics innovation to address the pressing needs of 21st century visual health.

  6. Genome-scale prediction of proteins with long intrinsically disordered regions.

    PubMed

    Peng, Zhenling; Mizianty, Marcin J; Kurgan, Lukasz

    2014-01-01

    Proteins with long disordered regions (LDRs), defined as having 30 or more consecutive disordered residues, are abundant in eukaryotes, and these regions are recognized as a distinct class of biologically functional domains. LDRs facilitate various cellular functions and are important for target selection in structural genomics. Motivated by the lack of methods that directly predict proteins with LDRs, we designed Super-fast predictor of proteins with Long Intrinsically DisordERed regions (SLIDER). SLIDER utilizes logistic regression that takes an empirically chosen set of numerical features, which consider selected physicochemical properties of amino acids, sequence complexity, and amino acid composition, as its inputs. Empirical tests show that SLIDER offers competitive predictive performance combined with low computational cost. It outperforms, by at least a modest margin, a comprehensive set of modern disorder predictors (that can indirectly predict LDRs) and is 16 times faster compared to the best currently available disorder predictor. Utilizing our time-efficient predictor, we characterized abundance and functional roles of proteins with LDRs over 110 eukaryotic proteomes. Similar to related studies, we found that eukaryotes have many (on average 30.3%) proteins with LDRs with majority of proteomes having between 25 and 40%, where higher abundance is characteristic to proteomes that have larger proteins. Our first-of-its-kind large-scale functional analysis shows that these proteins are enriched in a number of cellular functions and processes including certain binding events, regulation of catalytic activities, cellular component organization, biogenesis, biological regulation, and some metabolic and developmental processes. A webserver that implements SLIDER is available at http://biomine.ece.ualberta.ca/SLIDER/. Copyright © 2013 Wiley Periodicals, Inc.
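
    The LDR definition used in this record (30 or more consecutive disordered residues) can be applied directly once per-residue disorder calls are available. The Python sketch below does only that; it is not SLIDER, which predicts LDR-containing proteins from sequence by logistic regression. The per-residue string encoding ("D" for disordered, "O" for ordered), the function names, and the toy proteome are assumptions made for illustration.

```python
LDR_MIN_LENGTH = 30  # threshold taken from the definition in the abstract


def longest_disordered_run(disorder_calls):
    """Length of the longest run of consecutive 'D' (disordered) residues.

    disorder_calls is a hypothetical per-residue string such as 'OODDDO'.
    """
    longest = current = 0
    for call in disorder_calls:
        current = current + 1 if call == "D" else 0
        longest = max(longest, current)
    return longest


def has_ldr(disorder_calls, min_length=LDR_MIN_LENGTH):
    """True if the protein contains at least one long disordered region."""
    return longest_disordered_run(disorder_calls) >= min_length


def ldr_fraction(proteome):
    """Fraction of proteins in a {protein_id: disorder_calls} mapping with an LDR."""
    if not proteome:
        return 0.0
    return sum(has_ldr(calls) for calls in proteome.values()) / len(proteome)


# Hypothetical toy proteome; real input would come from a disorder predictor.
toy_proteome = {
    "protA": "O" * 50 + "D" * 35 + "O" * 20,  # one 35-residue run: has an LDR
    "protB": "D" * 29 + "O" + "D" * 29,       # two 29-residue runs: no LDR
    "protC": "O" * 120,                       # fully ordered
    "protD": "D" * 10 + "O" * 90,             # short disordered tail only
}
print(f"Proteins with LDRs: {ldr_fraction(toy_proteome):.0%}")
```

    Applied to a whole proteome, ldr_fraction yields the kind of per-proteome percentage that the abstract reports.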

  7. Screening of missing proteins in the human liver proteome by improved MRM-approach-based targeted proteomics.

    PubMed

    Chen, Chen; Liu, Xiaohui; Zheng, Weimin; Zhang, Lei; Yao, Jun; Yang, Pengyuan

    2014-04-04

    To completely annotate the human genome, the task of identifying and characterizing proteins that currently lack mass spectrometry (MS) evidence is inevitable and urgent. In this study, as the first effort to screen missing proteins on a large scale, we developed an approach based on SDS-PAGE followed by liquid chromatography-multiple reaction monitoring (LC-MRM) for screening those missing proteins with only a single peptide hit in the previous liver proteome data set. Proteins extracted from normal human liver were separated by SDS-PAGE and digested in split gel slices, and the resulting digests were then subjected to LC-scheduled MRM analysis. The MRM assays were developed using crude synthesized peptides for the target peptides. In total, the expression of 57 target proteins was confirmed from 185 MRM assays in normal human liver tissues. Among the 57 confirmed one-hit wonders, 50 proteins belong to the minimally redundant set in the PeptideAtlas database, and 7 proteins, from various biological processes, previously had no MS-based information at all. We conclude that our SDS-PAGE-MRM workflow can be a powerful approach to screen missing or poorly characterized proteins in different samples and to provide their quantity if detected. The MRM raw data have been uploaded to ISB/SRM Atlas/PASSEL (PXD000648).

  8. Gas-Phase Enrichment of Multiply Charged Peptide Ions by Differential Ion Mobility Extend the Comprehensiveness of SUMO Proteome Analyses

    NASA Astrophysics Data System (ADS)

    Pfammatter, Sibylle; Bonneil, Eric; McManus, Francis P.; Thibault, Pierre

    2018-04-01

    The small ubiquitin-like modifier (SUMO) is a member of the family of ubiquitin-like modifiers (UBLs) and is involved in important cellular processes, including DNA damage response, meiosis and cellular trafficking. The large-scale identification of SUMO peptides in a site-specific manner is challenging not only because of the low abundance and dynamic nature of this modification, but also due to the branched structure of the corresponding peptides that further complicates their identification using conventional search engines. Here, we exploited the unusual structure of SUMO peptides to facilitate their separation by high-field asymmetric waveform ion mobility spectrometry (FAIMS) and increase the coverage of SUMO proteome analysis. Upon trypsin digestion, branched peptides contain a SUMO remnant side chain and predominantly form triply protonated ions that facilitate their gas-phase separation using FAIMS. We evaluated the mobility characteristics of synthetic SUMO peptides and further demonstrated the application of FAIMS to profile the changes in protein SUMOylation of HEK293 cells following heat shock, a condition known to affect this modification. FAIMS typically provided a 10-fold improvement of detection limit of SUMO peptides, and enabled a 36% increase in SUMO proteome coverage compared to the same LC-MS/MS analyses performed without FAIMS.

  9. Covering complete proteomes with X-ray structures: A current snapshot

    DOE PAGES

    Mizianty, Marcin J.; Fan, Xiao; Yan, Jing; ...

    2014-10-23

    Structural genomics programs have developed and applied structure-determination pipelines to a wide range of protein targets, facilitating the visualization of macromolecular interactions and the understanding of their molecular and biochemical functions. The fundamental question of whether three-dimensional structures of all proteins and all functional annotations can be determined using X-ray crystallography is investigated. A first-of-its-kind large-scale analysis of crystallization propensity for all proteins encoded in 1953 fully sequenced genomes was performed. It is shown that current X-ray crystallographic knowhow combined with homology modeling can provide structures for 25% of modeling families (protein clusters for which structural models can be obtained through homology modeling), with at least one structural model produced for each Gene Ontology functional annotation. The coverage varies between superkingdoms, with 19% for eukaryotes, 35% for bacteria and 49% for archaea, and with those of viruses following the coverage values of their hosts. It is shown that the crystallization propensities of proteomes from the taxonomic superkingdoms are distinct. The use of knowledge-based target selection is shown to substantially increase the ability to produce X-ray structures. It is demonstrated that the human proteome has one of the highest attainable coverage values among eukaryotes, and GPCR membrane proteins suitable for X-ray structure determination were determined.

  10. The role of internal duplication in the evolution of multi-domain proteins.

    PubMed

    Nacher, J C; Hayashida, M; Akutsu, T

    2010-08-01

    Many proteins consist of several structural domains. These multi-domain proteins have likely been generated by selective genome growth dynamics during evolution to perform new functions as well as to create structures that fold on a biologically feasible time scale. Domain units frequently evolved through a variety of genetic shuffling mechanisms. Here we examine the protein domain statistics of more than 1000 organisms including eukaryotic, archaeal and bacterial species. The analysis extends earlier findings on asymmetric statistical laws for proteome to a wider variety of species. While proteins are composed of a wide range of domains, displaying a power-law decay, the computation of domain families for each protein reveals an exponential distribution, characterizing a protein universe composed of a thin number of unique families. Structural studies in proteomics have shown that domain repeats, or internal duplicated domains, represent a small but significant fraction of genome. In spite of its importance, this observation has been largely overlooked until recently. We model the evolutionary dynamics of proteome and demonstrate that these distinct distributions are in fact rooted in an internal duplication mechanism. This process generates the contemporary protein structural domain universe, determines its reduced thickness, and tames its growth. These findings have important implications, ranging from protein interaction network modeling to evolutionary studies based on fundamental mechanisms governing genome expansion.

  11. Comparison of analytical methods for profiling N- and O-linked glycans from cultured cell lines

    PubMed Central

    Togayachi, Akira; Azadi, Parastoo; Ishihara, Mayumi; Geyer, Rudolf; Galuska, Christina; Geyer, Hildegard; Kakehi, Kazuaki; Kinoshita, Mitsuhiro; Karlsson, Niclas G.; Jin, Chunsheng; Kato, Koichi; Yagi, Hirokazu; Kondo, Sachiko; Kawasaki, Nana; Hashii, Noritaka; Kolarich, Daniel; Stavenhagen, Kathrin; Packer, Nicolle H.; Thaysen-Andersen, Morten; Nakano, Miyako; Taniguchi, Naoyuki; Kurimoto, Ayako; Wada, Yoshinao; Tajiri, Michiko; Yang, Pengyuan; Cao, Weiqian; Li, Hong; Rudd, Pauline M.; Narimatsu, Hisashi

    2016-01-01

    The Human Disease Glycomics/Proteome Initiative (HGPI) is an activity in the Human Proteome Organization (HUPO) supported by leading researchers from international institutes and aims at development of disease-related glycomics/glycoproteomics analysis techniques. Since 2004, the initiative has conducted three pilot studies. The first two were N- and O-glycan analyses of purified transferrin and immunoglobulin-G and assessed the most appropriate analytical approach employed at the time. This paper describes the third study, which was conducted to compare different approaches for quantitation of N- and O-linked glycans attached to proteins in crude biological samples. The preliminary analysis on cell pellets resulted in wildly varied glycan profiles, which was probably the consequence of variations in the pre-processing sample preparation methodologies. However, the reproducibility of the data was not improved dramatically in the subsequent analysis on cell lysate fractions prepared in a specified method by one lab. The study demonstrated the difficulty of carrying out a complete analysis of the glycome in crude samples by any single technology and the importance of rigorous optimization of the course of analysis from preprocessing to data interpretation. It suggests that another collaborative study employing the latest technologies in this rapidly evolving field will help to realize the requirements of carrying out the large-scale analysis of glycoproteins in complex cell samples. PMID:26511985

  12. Recent advances in micro-scale and nano-scale high-performance liquid-phase chromatography for proteome research.

    PubMed

    Tao, Dingyin; Zhang, Lihua; Shan, Yichu; Liang, Zhen; Zhang, Yukui

    2011-01-01

    High-performance liquid chromatography-electrospray ionization tandem mass spectrometry (HPLC-ESI-MS-MS) is regarded as one of the most powerful techniques for separation and identification of proteins. Recently, much effort has been made to improve the separation capacity, detection sensitivity, and analysis throughput of micro- and nano-HPLC, by increasing column length, reducing column internal diameter, and using integrated techniques. Development of HPLC columns has also been rapid, as a result of the use of submicrometer packing materials and monolithic columns. All these innovations result in clearly improved performance of micro- and nano-HPLC for proteome research.

  13. Proteomic Analysis of the Cell Cycle of Procyclic Form Trypanosoma brucei.

    PubMed

    Crozier, Thomas W M; Tinti, Michele; Wheeler, Richard J; Ly, Tony; Ferguson, Michael A J; Lamond, Angus I

    2018-06-01

    We describe a single-step centrifugal elutriation method to produce synchronous Gap1 (G1)-phase procyclic trypanosomes at a scale amenable for proteomic analysis of the cell cycle. Using ten-plex tandem mass tag (TMT) labeling and mass spectrometry (MS)-based proteomics technology, the expression levels of 5325 proteins were quantified across the cell cycle in this parasite. Of these, 384 proteins were classified as cell-cycle regulated and subdivided into nine clusters with distinct temporal regulation. These groups included many known cell cycle regulators in trypanosomes, which validates the approach. In addition, we identify 40 novel cell cycle regulated proteins that are essential for trypanosome survival and thus represent potential future drug targets for the prevention of trypanosomiasis. Through cross-comparison to the TrypTag endogenous tagging microscopy database, we were able to validate the cell-cycle regulated patterns of expression for many of the proteins of unknown function detected in our proteomic analysis. A convenient interface to access and interrogate these data is also presented, providing a useful resource for the scientific community. Data are available via ProteomeXchange with identifier PXD008741 (https://www.ebi.ac.uk/pride/archive/). © 2018 by The American Society for Biochemistry and Molecular Biology, Inc.

  14. Quantitative proteomics reveals ecological fitness cost of multi-herbicide resistant barnyardgrass (Echinochloa crus-galli L.).

    PubMed

    Yang, Xia; Zhang, Zichang; Gu, Tao; Dong, Mingchao; Peng, Qiong; Bai, Lianyang; Li, Yongfeng

    2017-01-06

    Barnyardgrass (Echinochloa crus-galli) is one of the top 15 herbicide-resistant weeds around the world that interferes with rice growth, resulting in major losses of rice yield. Thus, multi-herbicide resistance in barnyardgrass presents a major threat, with the underlying mechanisms that contribute to resistance requiring elucidation. In an attempt to characterize this multi-herbicide resistance at the proteomic level, comparative analysis of resistant and susceptible barnyardgrasses was performed using iTRAQ, both with and without quinclorac, bispyribac-sodium and penoxsulam herbicidal treatment. A total of 1342 protein species were identified from 2248 unique peptides by searching the UniProt database and conducting data analysis. Approximately 904 protein species with 4774 Gene Ontology (GO) terms were grouped into the categories of biological process, cellular component and molecular function. Among these, 688 protein species were annotated into 1583 KEGG pathways, with 980 protein species relating to metabolism and 93 relating to environmental information processing. A total of 292 protein species showed more than a 1.2-fold change in abundance in the resistant biotype relative to the susceptible biotype. Furthermore, herbicide treatment resulted in 157 protein species that showed more than a 1.2-fold change in the resistant biotype. Moreover, physiological analyses demonstrated an ecological fitness cost in the resistant biotype. While some studies have shown a fitness cost to be associated with an altered ecological interaction, our understanding of the fitness costs associated with herbicide resistance is limited. Herein, physiological and proteomic analysis demonstrates herbicide resistance associated ecological fitness cost and potential mechanisms of herbicide-resistance in resistant biotypes of E. crus-galli. The results presented herein have revealed differences in ecological adaptation between resistant and susceptible biotypes in E. crus-galli and provide a fundamental basis enabling the development of new strategies for weed control. Lastly, this is the first large-scale proteomics study to examine herbicide stress responses in different barnyardgrass biotypes. Copyright © 2016 Elsevier B.V. All rights reserved.
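
    The 1.2-fold change criterion reported in this record can be illustrated with a short Python sketch. Two things are assumptions here: that the threshold is applied symmetrically (ratios above 1.2 or below 1/1.2), and that abundances are available as plain per-protein numbers. The abstract states neither, and the protein identifiers and values below are hypothetical.

```python
def fold_change_hits(resistant, susceptible, threshold=1.2):
    """Return {protein: ratio} for proteins changed by more than `threshold`-fold.

    Ratios are resistant/susceptible. Proteins missing from either sample, or
    with non-positive abundance, are skipped rather than guessed at.
    """
    hits = {}
    for protein, r_abundance in resistant.items():
        s_abundance = susceptible.get(protein)
        if s_abundance is None or s_abundance <= 0 or r_abundance <= 0:
            continue
        ratio = r_abundance / s_abundance
        # Symmetric criterion (an assumption): up- or down-regulated.
        if ratio > threshold or ratio < 1 / threshold:
            hits[protein] = ratio
    return hits


# Hypothetical reporter-ion abundances for four proteins.
resistant = {"P1": 130.0, "P2": 100.0, "P3": 50.0, "P4": 110.0}
susceptible = {"P1": 100.0, "P2": 100.0, "P3": 100.0, "P4": 100.0}

hits = fold_change_hits(resistant, susceptible)
for protein, ratio in sorted(hits.items()):
    print(f"{protein}: {ratio:.2f}-fold")
```

    A fold-change cutoff alone says nothing about statistical significance; studies of this kind typically pair it with a significance test across replicates.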

  15. A Novel Proteomics Approach to Identify SUMOylated Proteins and Their Modification Sites in Human Cells

    PubMed Central

    Galisson, Frederic; Mahrouche, Louiza; Courcelles, Mathieu; Bonneil, Eric; Meloche, Sylvain; Chelbi-Alix, Mounira K.; Thibault, Pierre

    2011-01-01

    The small ubiquitin-related modifier (SUMO) is a small group of proteins that are reversibly attached to protein substrates to modify their functions. The large scale identification of protein SUMOylation and their modification sites in mammalian cells represents a significant challenge because of the relatively small number of in vivo substrates and the dynamic nature of this modification. We report here a novel proteomics approach to selectively enrich and identify SUMO conjugates from human cells. We stably expressed different SUMO paralogs in HEK293 cells, each containing a His6 tag and a strategically located tryptic cleavage site at the C terminus to facilitate the recovery and identification of SUMOylated peptides by affinity enrichment and mass spectrometry. Tryptic peptides with short SUMO remnants offer significant advantages in large scale SUMOylome experiments including the generation of paralog-specific fragment ions following CID and ETD activation, and the identification of modified peptides using conventional database search engines such as Mascot. We identified 205 unique protein substrates together with 17 precise SUMOylation sites present in 12 SUMO protein conjugates including three new sites (Lys-380, Lys-400, and Lys-497) on the protein promyelocytic leukemia. Label-free quantitative proteomics analyses on purified nuclear extracts from untreated and arsenic trioxide-treated cells revealed that all identified SUMOylated sites of promyelocytic leukemia were differentially SUMOylated upon stimulation. PMID:21098080

  16. Global proteomics profiling improves drug sensitivity prediction: results from a multi-omics, pan-cancer modeling approach.

    PubMed

    Ali, Mehreen; Khan, Suleiman A; Wennerberg, Krister; Aittokallio, Tero

    2018-04-15

    Proteomics profiling is increasingly being used for molecular stratification of cancer patients and cell-line panels. However, systematic assessment of the predictive power of large-scale proteomic technologies across various drug classes and cancer types is currently lacking. To that end, we carried out the first pan-cancer, multi-omics comparative analysis of the relative performance of two proteomic technologies, targeted reverse phase protein array (RPPA) and global mass spectrometry (MS), in terms of their accuracy for predicting the sensitivity of cancer cells to both cytotoxic chemotherapeutics and molecularly targeted anticancer compounds. Our results in two cell-line panels demonstrate how MS profiling improves drug response predictions beyond that of the RPPA or the other omics profiles when used alone. However, frequent missing MS data values complicate its use in predictive modeling and required additional filtering, such as focusing on completely measured or known oncoproteins, to obtain maximal predictive performance. Rather strikingly, the two proteomics profiles provided complementary predictive signal both for the cytotoxic and targeted compounds. Further, information about the cellular-abundance of primary target proteins was found critical for predicting the response of targeted compounds, although the non-target features also contributed significantly to the predictive power. The clinical relevance of the selected protein markers was confirmed in cancer patient data. These results provide novel insights into the relative performance and optimal use of the widely applied proteomic technologies, MS and RPPA, which should prove useful in translational applications, such as defining the best combination of omics technologies and marker panels for understanding and predicting drug sensitivities in cancer patients. Processed datasets, R as well as Matlab implementations of the methods are available at https://github.com/mehr-een/bemkl-rbps. Contact: mehreen.ali@helsinki.fi or tero.aittokallio@fimm.fi. Supplementary data are available at Bioinformatics online.

  17. Andromeda: a peptide search engine integrated into the MaxQuant environment.

    PubMed

    Cox, Jürgen; Neuhauser, Nadin; Michalski, Annette; Scheltema, Richard A; Olsen, Jesper V; Mann, Matthias

    2011-04-01

    A key step in mass spectrometry (MS)-based proteomics is the identification of peptides in sequence databases by their fragmentation spectra. Here we describe Andromeda, a novel peptide search engine using a probabilistic scoring model. On proteome data, Andromeda performs as well as Mascot, a widely used commercial search engine, as judged by sensitivity and specificity analysis based on target decoy searches. Furthermore, it can handle data with arbitrarily high fragment mass accuracy, is able to assign and score complex patterns of post-translational modifications, such as highly phosphorylated peptides, and accommodates extremely large databases. The algorithms of Andromeda are provided. Andromeda can function independently or as an integrated search engine of the widely used MaxQuant computational proteomics platform and both are freely available at www.maxquant.org. The combination enables analysis of large data sets in a simple analysis workflow on a desktop computer. For searching individual spectra Andromeda is also accessible via a web server. We demonstrate the flexibility of the system by implementing the capability to identify cofragmented peptides, significantly improving the total number of identified peptides.
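
    The target-decoy searches mentioned in this record are commonly used to estimate a false discovery rate (FDR) for peptide-spectrum matches (PSMs). The Python sketch below shows the generic decoys-over-targets estimate only. It does not reproduce Andromeda's probabilistic scoring or MaxQuant's actual filtering, and the scores, cutoff, and function names are hypothetical.

```python
def fdr_at_threshold(psms, score_threshold):
    """Estimate FDR as (#decoy hits) / (#target hits) at or above a score cutoff.

    psms is a list of (score, is_decoy) pairs.
    """
    targets = sum(1 for score, is_decoy in psms
                  if score >= score_threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms
                 if score >= score_threshold and is_decoy)
    if targets == 0:
        return float("inf") if decoys else 0.0
    return decoys / targets


def lowest_threshold_for_fdr(psms, max_fdr=0.01):
    """Lowest observed score whose estimated FDR is within max_fdr, else None."""
    best = None
    for score in sorted({s for s, _ in psms}, reverse=True):
        if fdr_at_threshold(psms, score) <= max_fdr:
            best = score  # keep lowering the cutoff while the FDR allows it
    return best


# Hypothetical PSM scores; True marks a hit against the decoy database.
psms = [(95, False), (90, False), (85, False), (80, True), (75, False), (70, True)]

cutoff = lowest_threshold_for_fdr(psms, max_fdr=0.25)
print(f"cutoff = {cutoff}, estimated FDR = {fdr_at_threshold(psms, cutoff):.2f}")
```

    The toy example uses a deliberately loose 25% limit so that a handful of PSMs suffices; practical analyses usually work at about 1%.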

  18. Review of Software Tools for Design and Analysis of Large scale MRM Proteomic Datasets

    PubMed Central

    Colangelo, Christopher M.; Chung, Lisa; Bruce, Can; Cheung, Kei-Hoi

    2013-01-01

    Selective or Multiple Reaction monitoring (SRM/MRM) is a liquid-chromatography (LC)/tandem-mass spectrometry (MS/MS) method that enables the quantitation of specific proteins in a sample by analyzing precursor ions and the fragment ions of their selected tryptic peptides. Instrumentation software has advanced to the point that thousands of transitions (pairs of primary and secondary m/z values) can be measured in a triple quadrupole instrument coupled to an LC, by a well-designed scheduling and selection of m/z windows. The design of a good MRM assay relies on the availability of peptide spectra from previous discovery-phase LC-MS/MS studies. The tedious aspect of manually developing and processing MRM assays involving thousands of transitions has spurred the development of software tools to automate this process. Software packages have been developed for project management, assay development, assay validation, data export, peak integration, quality assessment, and biostatistical analysis. No single tool provides a complete end-to-end solution, thus this article reviews the current state and discusses future directions of these software tools in order to enable researchers to combine these tools for a comprehensive targeted proteomics workflow. PMID:23702368
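
    The idea of a transition as a pair of precursor and fragment m/z values, monitored only around the expected elution time of its peptide, can be sketched in Python as follows. This is a simplified illustration of scheduling under stated assumptions, not the behaviour of any tool covered by the review; the Transition fields, the fixed 2-minute window, and all peptide names and numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    peptide: str
    precursor_mz: float     # primary (Q1) m/z
    fragment_mz: float      # secondary (Q3) m/z
    expected_rt_min: float  # expected retention time, in minutes


def active_transitions(transitions, time_min, window_min=2.0):
    """Transitions whose retention-time window contains time_min."""
    half = window_min / 2
    return [t for t in transitions
            if t.expected_rt_min - half <= time_min <= t.expected_rt_min + half]


def max_concurrent(transitions, window_min=2.0):
    """Largest number of transitions monitored at once.

    With closed intervals the maximum overlap is always reached at the start
    of some window, so only those time points need to be checked.
    """
    half = window_min / 2
    starts = [t.expected_rt_min - half for t in transitions]
    return max((len(active_transitions(transitions, s, window_min))
                for s in starts), default=0)


# Hypothetical assay: the m/z values are placeholders, not computed masses.
assay = [
    Transition("PEPTIDEA", 500.25, 650.30, 10.0),
    Transition("PEPTIDEA", 500.25, 779.40, 10.5),
    Transition("PEPTIDEB", 612.80, 845.45, 11.5),
    Transition("PEPTIDEC", 433.70, 562.30, 20.0),
]
print("Peak concurrency:", max_concurrent(assay))
```

    Keeping peak concurrency low is what allows a scheduled method to carry many more transitions overall than an unscheduled one, because the instrument divides its cycle time only among the transitions active at that moment.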

  19. Development of proteome-wide binding reagents for research and diagnostics.

    PubMed

    Taussig, Michael J; Schmidt, Ronny; Cook, Elizabeth A; Stoevesandt, Oda

    2013-12-01

    Alongside MS, antibodies and other specific protein-binding molecules have a special place in proteomics as affinity reagents in a toolbox of applications for determining protein location, quantitative distribution and function (affinity proteomics). The realisation that the range of research antibodies available, while apparently vast, is nevertheless still very incomplete and frequently of uncertain quality, has stimulated projects with an objective of raising comprehensive, proteome-wide sets of protein binders. With progress in automation and throughput, a remarkable number of recent publications refer to the practical possibility of selecting binders to every protein encoded in the genome. Here we review the requirements of a pipeline of production of protein binders for the human proteome, including target prioritisation, antigen design, 'next generation' methods, databases and the approaches taken by ongoing projects in Europe and the USA. While the task of generating affinity reagents for all human proteins is complex and demanding, the benefits of well-characterised and quality-controlled pan-proteome binder resources for biomedical research, industry and life sciences in general would be enormous and justify the effort. Given the technical, personnel and financial resources needed to fulfil this aim, expansion of current efforts may best be addressed through large-scale international collaboration. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  20. Microbial Interactions in Plants: Perspectives and Applications of Proteomics.

    PubMed

    Imam, Jahangir; Shukla, Pratyoosh; Mandal, Nimai Prasad; Variar, Mukund

    2017-01-01

    The structure and function of proteins involved in plant-microbe interactions are investigated in complex biological samples using large-scale proteomics technology. Since whole-genome sequences are now available for several plant species and microbes, proteomic studies have become easier and more accurate, and large amounts of data can be generated and analyzed during plant-microbe interactions. Proteomic approaches are highly relevant in many studies and have shown that genomic approaches alone are not sufficient: much significant information is lost, because the proteins, not the genes encoding them, are the final products responsible for the observed phenotype. Novel proteomic approaches are developing continuously, enabling the study of various aspects of the arrangement and configuration of proteins and their functions. With the advancement of new technologies, their application to plant-microbe interactions is becoming more common. They are increasingly used to characterize the cellular and extracellular virulence and pathogenicity factors produced by pathogens, and to identify protein-level changes in host plants infected with pathogens or colonized by beneficial partners. This review provides a brief overview of the proteomics technologies currently available, followed by their use in studying plant-microbe interactions. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  1. Proteomic Characterization of Differential Abundant Proteins Accumulated between Lower and Upper Epidermises of Fleshy Scales in Onion (Allium cepa L.) Bulbs

    PubMed Central

    Wu, Xiaolin

    2016-01-01

    The onion (Allium cepa L.) is widely planted worldwide as a valuable vegetable crop. The scales of an onion bulb are a modified type of leaf. The one-layer-cell epidermis of onion scales is commonly used as a model experimental material in botany and molecular biology. The lower epidermis (LE) and upper epidermis (UE) of onion scales display obvious differences in microscopic structure, cell differentiation and pigment synthesis; however, associated proteomic differences are unclear. LE and UE can be easily sampled as single-layer-cell tissues for comparative proteomic analysis. In this study, a proteomic approach based on 2-DE and mass spectrometry (MS) was applied to compare LE and UE of fleshy scales from yellow and red onions. We identified 47 differential abundant protein spots (representing 31 unique proteins) between LE and UE in red and yellow onions. These proteins are mainly involved in pigment synthesis, stress response, and cell division. Particularly, the differentially accumulated chalcone-flavanone isomerase and flavone O-methyltransferase 1-like in LE may result in the differences in the onion scale color between red and yellow onions. Moreover, stress-related proteins abundantly accumulated in both LE and UE. In addition, the differential accumulation of UDP-arabinopyranose mutase 1-like protein and β-1,3-glucanase in the LE may be related to the different cell sizes between LE and UE of the two types of onion. The data derived from this study provides new insight into the differences in differentiation and developmental processes between onion epidermises. This study may also make a contribution to onion breeding, such as improving resistances and changing colors. PMID:28036352

  2. Proteomic Characterization of Differential Abundant Proteins Accumulated between Lower and Upper Epidermises of Fleshy Scales in Onion (Allium cepa L.) Bulbs.

    PubMed

    Wu, Si; Ning, Fen; Wu, Xiaolin; Wang, Wei

    2016-01-01

    The onion (Allium cepa L.) is widely planted worldwide as a valuable vegetable crop. The scales of an onion bulb are a modified type of leaf. The one-layer-cell epidermis of onion scales is commonly used as a model experimental material in botany and molecular biology. The lower epidermis (LE) and upper epidermis (UE) of onion scales display obvious differences in microscopic structure, cell differentiation and pigment synthesis; however, associated proteomic differences are unclear. LE and UE can be easily sampled as single-layer-cell tissues for comparative proteomic analysis. In this study, a proteomic approach based on 2-DE and mass spectrometry (MS) was applied to compare LE and UE of fleshy scales from yellow and red onions. We identified 47 differential abundant protein spots (representing 31 unique proteins) between LE and UE in red and yellow onions. These proteins are mainly involved in pigment synthesis, stress response, and cell division. Particularly, the differentially accumulated chalcone-flavanone isomerase and flavone O-methyltransferase 1-like in LE may result in the differences in the onion scale color between red and yellow onions. Moreover, stress-related proteins abundantly accumulated in both LE and UE. In addition, the differential accumulation of UDP-arabinopyranose mutase 1-like protein and β-1,3-glucanase in the LE may be related to the different cell sizes between LE and UE of the two types of onion. The data derived from this study provides new insight into the differences in differentiation and developmental processes between onion epidermises. This study may also make a contribution to onion breeding, such as improving resistances and changing colors.

  3. An integrated native mass spectrometry and top-down proteomics method that connects sequence to structure and function of macromolecular complexes

    NASA Astrophysics Data System (ADS)

    Li, Huilin; Nguyen, Hong Hanh; Ogorzalek Loo, Rachel R.; Campuzano, Iain D. G.; Loo, Joseph A.

    2018-02-01

    Mass spectrometry (MS) has become a crucial technique for the analysis of protein complexes. Native MS has traditionally examined protein subunit arrangements, while proteomics MS has focused on sequence identification. These two techniques are usually performed separately without taking advantage of the synergies between them. Here we describe the development of an integrated native MS and top-down proteomics method using Fourier-transform ion cyclotron resonance (FTICR) to analyse macromolecular protein complexes in a single experiment. We address previous concerns of employing FTICR MS to measure large macromolecular complexes by demonstrating the detection of complexes up to 1.8 MDa, and we demonstrate the efficacy of this technique for direct acquirement of sequence to higher-order structural information with several large complexes. We then summarize the unique functionalities of different activation/dissociation techniques. The platform expands the ability of MS to integrate proteomics and structural biology to provide insights into protein structure, function and regulation.

  4. Proteomic analysis of Chlorella vulgaris: Potential targets for enhanced lipid accumulation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Guarnieri, Michael T.; Nag, Ambarish; Yang, Shihui

    2013-11-01

    Oleaginous microalgae are capable of producing large quantities of fatty acids and triacylglycerides. As such, they are promising feedstocks for the production of biofuels and bioproducts. Genetic strain-engineering strategies offer a means to accelerate the commercialization of algal biofuels by improving the rate and total accumulation of microalgal lipids. However, the industrial potential of these organisms remains to be met, largely due to the incomplete knowledgebase surrounding the mechanisms governing the induction of algal lipid biosynthesis. Such strategies require further elucidation of genes and gene products controlling algal lipid accumulation. In this study, we have set out to examine these mechanisms and identify novel strain-engineering targets in the oleaginous microalga, Chlorella vulgaris. Comparative shotgun proteomic analyses have identified a number of novel targets, including previously unidentified transcription factors and proteins involved in cell signaling and cell cycle regulation. These results lay the foundation for strain-improvement strategies and demonstrate the power of translational proteomic analysis.

  5. Expressing the human proteome for affinity proteomics: optimising expression of soluble protein domains and in vivo biotinylation.

    PubMed

    Keates, Tracy; Cooper, Christopher D O; Savitsky, Pavel; Allerston, Charles K; Phillips, Claire; Hammarström, Martin; Daga, Neha; Berridge, Georgina; Mahajan, Pravin; Burgess-Brown, Nicola A; Müller, Susanne; Gräslund, Susanne; Gileadi, Opher

    2012-06-15

    The generation of affinity reagents to large numbers of human proteins depends on the ability to express the target proteins as high-quality antigens. The Structural Genomics Consortium (SGC) focuses on the production and structure determination of human proteins. In a 7-year period, the SGC has deposited crystal structures of >800 human protein domains, and has additionally expressed and purified a similar number of protein domains that have not yet been crystallised. The targets include a diversity of protein domains, with an attempt to provide high coverage of protein families. The family approach provides an excellent basis for characterising the selectivity of affinity reagents. We present a summary of the approaches used to generate purified human proteins or protein domains, a test case demonstrating the ability to rapidly generate new proteins, and an optimisation study on the modification of >70 proteins by biotinylation in vivo. These results provide a unique synergy between large-scale structural projects and the recent efforts to produce a wide coverage of affinity reagents to the human proteome. Copyright © 2011 Elsevier B.V. All rights reserved.

  6. Expressing the human proteome for affinity proteomics: optimising expression of soluble protein domains and in vivo biotinylation

    PubMed Central

    Keates, Tracy; Cooper, Christopher D.O.; Savitsky, Pavel; Allerston, Charles K.; Phillips, Claire; Hammarström, Martin; Daga, Neha; Berridge, Georgina; Mahajan, Pravin; Burgess-Brown, Nicola A.; Müller, Susanne; Gräslund, Susanne; Gileadi, Opher

    2012-01-01

    The generation of affinity reagents to large numbers of human proteins depends on the ability to express the target proteins as high-quality antigens. The Structural Genomics Consortium (SGC) focuses on the production and structure determination of human proteins. In a 7-year period, the SGC has deposited crystal structures of >800 human protein domains, and has additionally expressed and purified a similar number of protein domains that have not yet been crystallised. The targets include a diversity of protein domains, with an attempt to provide high coverage of protein families. The family approach provides an excellent basis for characterising the selectivity of affinity reagents. We present a summary of the approaches used to generate purified human proteins or protein domains, a test case demonstrating the ability to rapidly generate new proteins, and an optimisation study on the modification of >70 proteins by biotinylation in vivo. These results provide a unique synergy between large-scale structural projects and the recent efforts to produce a wide coverage of affinity reagents to the human proteome. PMID:22027370

  7. BIG: a large-scale data integration tool for renal physiology.

    PubMed

    Zhao, Yue; Yang, Chin-Rang; Raghuram, Viswanathan; Parulekar, Jaya; Knepper, Mark A

    2016-10-01

    Due to recent advances in high-throughput techniques, we and others have generated multiple proteomic and transcriptomic databases to describe and quantify gene expression, protein abundance, or cellular signaling on the scale of the whole genome/proteome in kidney cells. The existence of so much data from diverse sources raises the following question: "How can researchers find information efficiently for a given gene product over all of these data sets without searching each data set individually?" This is the type of problem that has motivated the "Big-Data" revolution in Data Science, which has driven progress in fields such as marketing. Here we present an online Big-Data tool called BIG (Biological Information Gatherer) that allows users to submit a single online query to obtain all relevant information from all indexed databases. BIG is accessible at http://big.nhlbi.nih.gov/.

  8. Aptamer-based multiplexed proteomic technology for biomarker discovery.

    PubMed

    Gold, Larry; Ayers, Deborah; Bertino, Jennifer; Bock, Christopher; Bock, Ashley; Brody, Edward N; Carter, Jeff; Dalby, Andrew B; Eaton, Bruce E; Fitzwater, Tim; Flather, Dylan; Forbes, Ashley; Foreman, Trudi; Fowler, Cate; Gawande, Bharat; Goss, Meredith; Gunn, Magda; Gupta, Shashi; Halladay, Dennis; Heil, Jim; Heilig, Joe; Hicke, Brian; Husar, Gregory; Janjic, Nebojsa; Jarvis, Thale; Jennings, Susan; Katilius, Evaldas; Keeney, Tracy R; Kim, Nancy; Koch, Tad H; Kraemer, Stephan; Kroiss, Luke; Le, Ngan; Levine, Daniel; Lindsey, Wes; Lollo, Bridget; Mayfield, Wes; Mehan, Mike; Mehler, Robert; Nelson, Sally K; Nelson, Michele; Nieuwlandt, Dan; Nikrad, Malti; Ochsner, Urs; Ostroff, Rachel M; Otis, Matt; Parker, Thomas; Pietrasiewicz, Steve; Resnicow, Daniel I; Rohloff, John; Sanders, Glenn; Sattin, Sarah; Schneider, Daniel; Singer, Britta; Stanton, Martin; Sterkel, Alana; Stewart, Alex; Stratford, Suzanne; Vaught, Jonathan D; Vrkljan, Mike; Walker, Jeffrey J; Watrobka, Mike; Waugh, Sheela; Weiss, Allison; Wilcox, Sheri K; Wolfson, Alexey; Wolk, Steven K; Zhang, Chi; Zichi, Dom

    2010-12-07

    The interrogation of proteomes ("proteomics") in a highly multiplexed and efficient manner remains a coveted and challenging goal in biology and medicine. We present a new aptamer-based proteomic technology for biomarker discovery capable of simultaneously measuring thousands of proteins from small sample volumes (15 µL of serum or plasma). Our current assay measures 813 proteins with low limits of detection (1 pM median), 7 logs of overall dynamic range (~100 fM-1 µM), and 5% median coefficient of variation. This technology is enabled by a new generation of aptamers that contain chemically modified nucleotides, which greatly expand the physicochemical diversity of the large randomized nucleic acid libraries from which the aptamers are selected. Proteins in complex matrices such as plasma are measured with a process that transforms a signature of protein concentrations into a corresponding signature of DNA aptamer concentrations, which is quantified on a DNA microarray. Our assay takes advantage of the dual nature of aptamers as both folded protein-binding entities with defined shapes and unique nucleotide sequences recognizable by specific hybridization probes. To demonstrate the utility of our proteomics biomarker discovery technology, we applied it to a clinical study of chronic kidney disease (CKD). We identified two well known CKD biomarkers as well as an additional 58 potential CKD biomarkers. These results demonstrate the potential utility of our technology to rapidly discover unique protein signatures characteristic of various disease states. We describe a versatile and powerful tool that allows large-scale comparison of proteome profiles among discrete populations. This unbiased and highly multiplexed search engine will enable the discovery of novel biomarkers in a manner that is unencumbered by our incomplete knowledge of biology, thereby helping to advance the next generation of evidence-based medicine.
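
    The quoted dynamic range can be checked with a short calculation. This is only a worked arithmetic sketch; the function and constant names are illustrative and the concentrations are the ones stated in the abstract (~100 fM to 1 µM, 1 pM median limit of detection).

```python
import math

FEMTO = 1e-15
PICO = 1e-12
MICRO = 1e-6


def logs_of_range(low_molar: float, high_molar: float) -> float:
    """Number of orders of magnitude between two concentrations."""
    return math.log10(high_molar / low_molar)


# Overall range quoted in the abstract: ~100 fM to 1 uM
overall = logs_of_range(100 * FEMTO, 1 * MICRO)

# The 1 pM median limit of detection sits one order of magnitude
# above the 100 fM lower bound.
median_lod_offset = logs_of_range(100 * FEMTO, 1 * PICO)
```

    The ratio 1 µM / 100 fM is 10^7, which matches the 7 logs stated above.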

  9. Functional analysis of proteins and protein species using shotgun proteomics and linear mathematics.

    PubMed

    Hoehenwarter, Wolfgang; Chen, Yanmei; Recuenco-Munoz, Luis; Wienkoop, Stefanie; Weckwerth, Wolfram

    2011-07-01

    Covalent post-translational modification of proteins is the primary modulator of protein function in the cell. It greatly expands the functional potential of the proteome compared to the genome. In the past few years, shotgun proteomics-based research, where the proteome is digested into peptides prior to mass spectrometric analysis, has been prolific in this area. It has determined the kinetics of tens of thousands of sites of covalent modification on an equally large number of proteins under various biological conditions and uncovered a transiently active regulatory network that extends into diverse branches of cellular physiology. In this review, we discuss this work in light of the concept of protein speciation, which emphasizes the entire post-translationally modified molecule and its interactions, and not just the modification site, as the functional entity. Sometimes, particularly when considering complex multisite modification, all of the modified molecular species involved in the investigated condition, the protein species, must be completely resolved for full understanding. We present a mathematical technique that delivers a good approximation for shotgun proteomics data.

  10. Beyond the proteome: Mass Spectrometry Special Interest Group (MS-SIG) at ISMB/ECCB 2013

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ryu, Soyoung; Payne, Samuel H.; Schaab, Christoph

    2014-07-02

    Mass spectrometry special interest group (MS-SIG) aims to bring together experts from the global research community to discuss highlights and challenges in the field of mass spectrometry (MS)-based proteomics and computational biology. The rapid technological developments in MS-based proteomics have enabled the generation of a large amount of meaningful information on hundreds to thousands of proteins simultaneously from a biological sample; however, the complexity of the MS data requires sophisticated computational algorithms and software for data analysis and interpretation. This year’s MS-SIG meeting theme was ‘Beyond the Proteome’ with major focuses on improving protein identification/quantification and using proteomics data to solve interesting problems in systems biology and clinical research.

  11. Large-scale database searching using tandem mass spectra: looking up the answer in the back of the book.

    PubMed

    Sadygov, Rovshan G; Cociorva, Daniel; Yates, John R

    2004-12-01

    Database searching is an essential element of large-scale proteomics. Because these methods are widely used, it is important to understand the rationale of the algorithms. Most algorithms are based on concepts first developed in SEQUEST and PeptideSearch. Four basic approaches are used to determine a match between a spectrum and sequence: descriptive, interpretative, stochastic and probability-based matching. We review the basic concepts used by most search algorithms, the computational modeling of peptide identification and current challenges and limitations of this approach for protein identification.
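
    To make the idea of matching a spectrum to a sequence concrete, the following sketch scores candidate peptides by a shared peak count between observed peaks and theoretical singly charged b- and y-ions. It is a deliberately simple stand-in and not the scoring used by SEQUEST, PeptideSearch, or any other engine; the function names and the tolerance are assumptions, while the residue masses are standard monoisotopic values.

```python
from typing import Iterable, List

# Standard monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def fragment_mzs(peptide: str) -> List[float]:
    """Singly charged b- and y-ion m/z values for an unmodified peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions = []
    for i in range(1, len(masses)):
        ions.append(sum(masses[:i]) + PROTON)          # b ion with i residues
        ions.append(sum(masses[i:]) + WATER + PROTON)  # complementary y ion
    return ions


def shared_peak_count(peptide: str, observed_mz: Iterable[float], tol: float = 0.02) -> int:
    """Number of theoretical fragments with an observed peak within tol (Da)."""
    observed = list(observed_mz)
    return sum(
        any(abs(theo - obs) <= tol for obs in observed)
        for theo in fragment_mzs(peptide)
    )


def best_match(candidates: List[str], observed_mz: Iterable[float], tol: float = 0.02) -> str:
    """Candidate peptide sharing the most peaks with the observed spectrum."""
    observed = list(observed_mz)
    return max(candidates, key=lambda p: shared_peak_count(p, observed, tol))
```

    A real search engine would first restrict candidates by precursor mass and would use a more discriminating score; the sketch only shows the lookup-and-compare structure.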

  12. Proteomic analysis of formalin-fixed paraffin embedded tissue by MALDI imaging mass spectrometry

    PubMed Central

    Casadonte, Rita; Caprioli, Richard M

    2012-01-01

    Archived formalin-fixed paraffin-embedded (FFPE) tissue collections represent a valuable informational resource for proteomic studies. Multiple FFPE core biopsies can be assembled in a single block to form tissue microarrays (TMAs). We describe a protocol for analyzing protein in FFPE-TMAs using matrix-assisted laser desorption/ionization (MALDI) imaging mass spectrometry (IMS). The workflow incorporates an antigen retrieval step following deparaffinization, in situ trypsin digestion, matrix application and then mass spectrometry signal acquisition. The direct analysis of FFPE-TMA tissue using IMS allows direct analysis of multiple tissue samples in a single experiment without extraction and purification of proteins. The advantages of high speed and throughput, easy sample handling and excellent reproducibility make this technology a favorable approach for the proteomic analysis of clinical research cohorts with large sample numbers. For example, TMA analysis of 300 FFPE cores would typically require 6 h of total time through data acquisition, not including data analysis. PMID:22011652

  13. Software Tools | Office of Cancer Clinical Proteomics Research

    Cancer.gov

    The CPTAC program develops new approaches to elucidate aspects of the molecular complexity of cancer made from large-scale proteogenomic datasets, and advance them toward precision medicine.  Part of the CPTAC mission is to make data and tools available and accessible to the greater research community to accelerate the discovery process.

  14. Methodologies and Perspectives of Proteomics Applied to Filamentous Fungi: From Sample Preparation to Secretome Analysis

    PubMed Central

    Bianco, Linda; Perrotta, Gaetano

    2015-01-01

    Filamentous fungi possess the extraordinary ability to digest complex biomasses and mineralize numerous xenobiotics, as a consequence of their aptitude for sensing the environment and regulating their intra- and extracellular proteins, producing drastic changes in proteome and secretome composition. Recent advancement in proteomic technologies offers an exciting opportunity to reveal the fluctuations of fungal proteins and enzymes, responsible for their metabolic adaptation to a large variety of environmental conditions. Here, an overview of the most commonly used proteomic strategies will be provided; this paper will range from sample preparation to gel-free and gel-based proteomics, discussing pros and cons of each mentioned state-of-the-art technique. The main focus will be kept on filamentous fungi. Due to the biotechnological relevance of lignocellulose degrading fungi, special attention will be finally given to their extracellular proteome, or secretome. Secreted proteins and enzymes will be discussed in relation to their involvement in bio-based processes, such as biomass deconstruction and mycoremediation. PMID:25775160

  15. Methodologies and perspectives of proteomics applied to filamentous fungi: from sample preparation to secretome analysis.

    PubMed

    Bianco, Linda; Perrotta, Gaetano

    2015-03-12

    Filamentous fungi possess the extraordinary ability to digest complex biomasses and mineralize numerous xenobiotics, as a consequence of their aptitude for sensing the environment and regulating their intra- and extracellular proteins, producing drastic changes in proteome and secretome composition. Recent advancement in proteomic technologies offers an exciting opportunity to reveal the fluctuations of fungal proteins and enzymes, responsible for their metabolic adaptation to a large variety of environmental conditions. Here, an overview of the most commonly used proteomic strategies will be provided; this paper will range from sample preparation to gel-free and gel-based proteomics, discussing pros and cons of each mentioned state-of-the-art technique. The main focus will be kept on filamentous fungi. Due to the biotechnological relevance of lignocellulose degrading fungi, special attention will be finally given to their extracellular proteome, or secretome. Secreted proteins and enzymes will be discussed in relation to their involvement in bio-based processes, such as biomass deconstruction and mycoremediation.

  16. Proteomics of the Human Placenta: Promises and Realities

    PubMed Central

    Robinson, J.M.; Ackerman, W.E.; Kniss, D.A.; Takizawa, T.; Vandré, D.D.

    2015-01-01

    Proteomics is an area of study that sets as its ultimate goal the global analysis of all of the proteins expressed in a biological system of interest. However, technical limitations currently hamper proteome-wide analyses of complex systems. In a more practical sense, a desired outcome of proteomics research is the translation of large protein data sets into formats that provide meaningful information regarding clinical conditions (e.g., biomarkers to serve as diagnostic and/or prognostic indicators of disease). Herein, we discuss placental proteomics by describing existing studies, pointing out their strengths and weaknesses. In so doing, we strive to inform investigators interested in this area of research about the current gap between hyperbolic promises and realities. Additionally, we discuss the utility of proteomics in discovery-based research, particularly as regards the capacity to unearth novel insights into placental biology. Importantly, when considering understudied systems such as the human placenta and diseases associated with abnormalities in placental function, proteomics can serve as a robust ‘shortcut’ to obtaining information unlikely to be garnered using traditional approaches. PMID:18222537

  17. Proteomic identification of processes and pathways characteristic of osmoregulatory tissues in spiny dogfish shark (Squalus acanthias).

    PubMed

    Lee, Jinoo; Valkova, Nelly; White, Mark P; Kültz, Dietmar

    2006-09-01

    We used dogfish shark (Squalus acanthias) as a model for proteome analysis of six different tissues to evaluate tissue-specific protein expression on a global scale and to deduce specific functions and the relatedness of multiple tissues from their proteomes. Proteomes of heart, brain, kidney, intestine, gill, and rectal gland were separated by two-dimensional gel electrophoresis (2DGE), gel images were matched using Delta 2D software and then evaluated for tissue-specific proteins. Sixty-one proteins (4%) were found to be in only a single type of tissue and 535 proteins (36%) were equally abundant in all six tissues. Relatedness between tissues was assessed based on tissue-specific expression patterns of all 1465 consistently resolved protein spots. This analysis revealed that tissues with osmoregulatory function (kidney, intestine, gill, rectal gland) were more similar in their overall proteomes than non-osmoregulatory tissues (heart, brain). Sixty-one proteins were identified by MALDI-TOF/TOF mass spectrometry and biological functions characteristic of osmoregulatory tissues were derived from gene ontology and molecular pathway analysis. Our data demonstrate that the molecular machinery for energy and urea metabolism and the Rho-GTPase/cytoskeleton pathway are enriched in osmoregulatory tissues of sharks. Our work provides a strong rationale for further study of the contribution of these mechanisms to the osmoregulation of marine sharks.

  18. A scalable strategy for high-throughput GFP tagging of endogenous human proteins.

    PubMed

    Leonetti, Manuel D; Sekine, Sayaka; Kamiyama, Daichi; Weissman, Jonathan S; Huang, Bo

    2016-06-21

    A central challenge of the postgenomic era is to comprehensively characterize the cellular role of the ∼20,000 proteins encoded in the human genome. To systematically study protein function in a native cellular background, libraries of human cell lines expressing proteins tagged with a functional sequence at their endogenous loci would be very valuable. Here, using electroporation of Cas9 nuclease/single-guide RNA ribonucleoproteins and taking advantage of a split-GFP system, we describe a scalable method for the robust, scarless, and specific tagging of endogenous human genes with GFP. Our approach requires no molecular cloning and allows a large number of cell lines to be processed in parallel. We demonstrate the scalability of our method by targeting 48 human genes and show that the resulting GFP fluorescence correlates with protein expression levels. We next present how our protocols can be easily adapted for the tagging of a given target with GFP repeats, critically enabling the study of low-abundance proteins. Finally, we show that our GFP tagging approach allows the biochemical isolation of native protein complexes for proteomic studies. Taken together, our results pave the way for the large-scale generation of endogenously tagged human cell lines for the proteome-wide analysis of protein localization and interaction networks in a native cellular context.

  19. New Markers for Predicting Fertility of the Male Gametes in the Post Genomic Age.

    PubMed

    Dipresa, Savina; De Toni, Luca; Foresta, Carlo; Garolla, Andrea

    2018-04-18

    A number of tests have been proposed to assess male fertility potential, ranging from routine light-microscopic evaluation of semen samples to screening tests for DNA integrity that look for sperm chromatin abnormalities. Spermatozoa are extremely differentiated cells; in addition to delivering a haploid paternal genome to the oocyte, they have critical functions in embryo development and heredity. Towards this goal, certain requirements must always be met. The ability of spermatozoa to perform their reproductive function is established during spermatogenesis, a highly specialized process that depends on multiple factors affecting male fertility. In the past 30 years, large-scale analyses of transcriptomic and genome expression in mammals have generated a large amount of information on the numerous biomolecules involved in spermatogenesis and male germ cell reproductive function. The sperm proteome represents the protein content that spermatozoa need to survive and function correctly, and modifications of the sperm proteome play a role in determining functional changes that lead to a decrease in the reproductive competence of affected spermatozoa. The post-genomic approach combines different methodologies: testicular transcriptome studies, protein compositional analysis and metabolomic profiling of human spermatozoa. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  20. Complex and extensive post-transcriptional regulation revealed by integrative proteomic and transcriptomic analysis of metabolite stress response in Clostridium acetobutylicum.

    PubMed

    Venkataramanan, Keerthi P; Min, Lie; Hou, Shuyu; Jones, Shawn W; Ralston, Matthew T; Lee, Kelvin H; Papoutsakis, E Terry

    2015-01-01

    Clostridium acetobutylicum is a model organism for both clostridial biology and solvent production. The organism is exposed to its own toxic metabolites butyrate and butanol, which trigger an adaptive stress response. Integrative analysis of proteomic and RNAseq data may provide novel insights into post-transcriptional regulation. The identified iTRAQ-based quantitative stress proteome is made up of 616 proteins with a 15 % genome coverage. The differentially expressed proteome correlated poorly with the corresponding differential RNAseq transcriptome. Up to 31 % of the differentially expressed proteins under stress displayed patterns opposite to those of the transcriptome, thus suggesting significant post-transcriptional regulation. The differential proteome of the translation machinery suggests that cells employ a different subset of ribosomal proteins under stress. Several highly upregulated proteins but with low mRNA levels possessed mRNAs with long 5'UTRs and strong RBS scores, thus supporting the argument that regulatory elements on the long 5'UTRs control their translation. For example, the oxidative stress response rubrerythrin was upregulated only at the protein level up to 40-fold without significant mRNA changes. We also identified many leaderless transcripts, several displaying different transcriptional start sites, thus suggesting mRNA-trimming mechanisms under stress. Downregulation of Rho and partner proteins pointed to changes in transcriptional elongation and termination under stress. The integrative proteomic-transcriptomic analysis demonstrated complex expression patterns of a large fraction of the proteome. Such patterns could not have been detected with one or the other omic analyses. Our analysis proposes the involvement of specific molecular mechanisms of post-transcriptional regulation to explain the observed complex stress response.

  1. Proteomic analysis of hyperadhesive Candida glabrata clinical isolates reveals a core wall proteome and differential incorporation of adhesins.

    PubMed

    Gómez-Molero, Emilia; de Boer, Albert D; Dekker, Henk L; Moreno-Martínez, Ana; Kraneveld, Eef A; Ichsan; Chauhan, Neeraj; Weig, Michael; de Soet, Johannes J; de Koster, Chris G; Bader, Oliver; de Groot, Piet W J

    2015-12-01

    Attachment to human host tissues or abiotic medical devices is a key step in the development of infections by Candida glabrata. The genome of this pathogenic yeast codes for a large number of adhesins, but proteomic work using reference strains has shown incorporation of only a few adhesins in the cell wall. By making inventories of the wall proteomes of hyperadhesive clinical isolates and reference strain CBS138 using mass spectrometry, we describe the cell wall proteome of C. glabrata and test the hypothesis that hyperadhesive isolates display differential incorporation of adhesins. Two clinical strains (PEU382 and PEU427) were selected, both of which were hyperadhesive to polystyrene and showed high surface hydrophobicity. Cell wall proteome analysis under biofilm-forming conditions identified a core proteome of about 20 proteins present in all C. glabrata strains. In addition, 12 adhesin-like wall proteins were identified in the hyperadherent strains, including six novel adhesins (Awp8-13) of which only Awp12 was also present in CBS138. We conclude that the hyperadhesive capacity of these two clinical C. glabrata isolates is correlated with increased and differential incorporation of cell wall adhesins. Future studies should elucidate the role of the identified proteins in the establishment of C. glabrata infections. © FEMS 2015. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  2. Discovery of Colorectal Cancer Biomarker Candidates by Membrane Proteomic Analysis and Subsequent Verification using Selected Reaction Monitoring (SRM) and Tissue Microarray (TMA) Analysis*

    PubMed Central

    Kume, Hideaki; Muraoka, Satoshi; Kuga, Takahisa; Adachi, Jun; Narumi, Ryohei; Watanabe, Shio; Kuwano, Masayoshi; Kodera, Yoshio; Matsushita, Kazuyuki; Fukuoka, Junya; Masuda, Takeshi; Ishihama, Yasushi; Matsubara, Hisahiro; Nomura, Fumio; Tomonaga, Takeshi

    2014-01-01

    Recent advances in quantitative proteomic technology have enabled the large-scale validation of biomarkers. We here performed a quantitative proteomic analysis of membrane fractions from colorectal cancer tissue to discover biomarker candidates, and then extensively validated the candidate proteins identified. A total of 5566 proteins were identified in six tissue samples, each of which was obtained from polyps and cancer with and without metastasis. GO cellular component analysis predicted that 3087 of these proteins were membrane proteins, whereas TMHMM algorithm predicted that 1567 proteins had a transmembrane domain. Differences were observed in the expression of 159 membrane proteins and 55 extracellular proteins between polyps and cancer without metastasis, while the expression of 32 membrane proteins and 17 extracellular proteins differed between cancer with and without metastasis. A total of 105 of these biomarker candidates were quantitated using selected (or multiple) reaction monitoring (SRM/MRM) with stable synthetic isotope-labeled peptides as an internal control. The results obtained revealed differences in the expression of 69 of these proteins, and this was subsequently verified in an independent set of patient samples (polyps (n = 10), cancer without metastasis (n = 10), cancer with metastasis (n = 10)). Significant differences were observed in the expression of 44 of these proteins, including ITGA5, GPRC5A, PDGFRB, and TFRC, which have already been shown to be overexpressed in colorectal cancer, as well as proteins with unknown function, such as C8orf55. The expression of C8orf55 was also shown to be high not only in colorectal cancer, but also in several cancer tissues using a multicancer tissue microarray, which included 1150 cores from 14 cancer tissues. This is the largest verification study of biomarker candidate membrane proteins to date; our methods for biomarker discovery and subsequent validation using SRM/MRM will contribute to the identification of useful biomarker candidates for various cancers. Data are available via ProteomeXchange with identifier PXD000851. PMID:24687888

  3. Discovery of colorectal cancer biomarker candidates by membrane proteomic analysis and subsequent verification using selected reaction monitoring (SRM) and tissue microarray (TMA) analysis.

    PubMed

    Kume, Hideaki; Muraoka, Satoshi; Kuga, Takahisa; Adachi, Jun; Narumi, Ryohei; Watanabe, Shio; Kuwano, Masayoshi; Kodera, Yoshio; Matsushita, Kazuyuki; Fukuoka, Junya; Masuda, Takeshi; Ishihama, Yasushi; Matsubara, Hisahiro; Nomura, Fumio; Tomonaga, Takeshi

    2014-06-01

    Recent advances in quantitative proteomic technology have enabled the large-scale validation of biomarkers. We here performed a quantitative proteomic analysis of membrane fractions from colorectal cancer tissue to discover biomarker candidates, and then extensively validated the candidate proteins identified. A total of 5566 proteins were identified in six tissue samples, each of which was obtained from polyps and cancer with and without metastasis. GO cellular component analysis predicted that 3087 of these proteins were membrane proteins, whereas TMHMM algorithm predicted that 1567 proteins had a transmembrane domain. Differences were observed in the expression of 159 membrane proteins and 55 extracellular proteins between polyps and cancer without metastasis, while the expression of 32 membrane proteins and 17 extracellular proteins differed between cancer with and without metastasis. A total of 105 of these biomarker candidates were quantitated using selected (or multiple) reaction monitoring (SRM/MRM) with stable synthetic isotope-labeled peptides as an internal control. The results obtained revealed differences in the expression of 69 of these proteins, and this was subsequently verified in an independent set of patient samples (polyps (n = 10), cancer without metastasis (n = 10), cancer with metastasis (n = 10)). Significant differences were observed in the expression of 44 of these proteins, including ITGA5, GPRC5A, PDGFRB, and TFRC, which have already been shown to be overexpressed in colorectal cancer, as well as proteins with unknown function, such as C8orf55. The expression of C8orf55 was also shown to be high not only in colorectal cancer, but also in several cancer tissues using a multicancer tissue microarray, which included 1150 cores from 14 cancer tissues. This is the largest verification study of biomarker candidate membrane proteins to date; our methods for biomarker discovery and subsequent validation using SRM/MRM will contribute to the identification of useful biomarker candidates for various cancers. Data are available via ProteomeXchange with identifier PXD000851. © 2014 by The American Society for Biochemistry and Molecular Biology, Inc.

  4. Top-down Proteomics in Health and Disease: Challenges and Opportunities

    PubMed Central

    Gregorich, Zachery R.; Ge, Ying

    2014-01-01

    Proteomics is essential for deciphering how molecules interact as a system and for understanding the functions of cellular systems in human disease; however, the unique characteristics of the human proteome, which include a high dynamic range of protein expression and extreme complexity due to a plethora of post-translational modifications (PTMs) and sequence variations, make such analyses challenging. An emerging “top-down” mass spectrometry (MS)-based proteomics approach, which provides a “bird’s eye” view of all proteoforms, has unique advantages for the assessment of PTMs and sequence variations. Recently, a number of studies have showcased the potential of top-down proteomics for unraveling of disease mechanisms and discovery of new biomarkers. Nevertheless, the top-down approach still faces significant challenges in terms of protein solubility, separation, and the detection of large intact proteins, as well as the under-developed data analysis tools. Consequently, new technological developments are urgently needed to advance the field of top-down proteomics. Herein, we intend to provide an overview of the recent applications of top-down proteomics in biomedical research. Moreover, we will outline the challenges and opportunities facing top-down proteomics strategies aimed at understanding and diagnosing human diseases. PMID:24723472

  5. MaxReport: An Enhanced Proteomic Result Reporting Tool for MaxQuant.

    PubMed

    Zhou, Tao; Li, Chuyu; Zhao, Wene; Wang, Xinru; Wang, Fuqiang; Sha, Jiahao

    2016-01-01

    MaxQuant is a proteomic software widely used for large-scale tandem mass spectrometry data. We have designed and developed an enhanced result reporting tool for MaxQuant, named as MaxReport. This tool can optimize the results of MaxQuant and provide additional functions for result interpretation. MaxReport can generate report tables for protein N-terminal modifications. It also supports isobaric labelling based relative quantification at the protein, peptide or site level. To obtain an overview of the results, MaxReport performs general descriptive statistical analyses for both identification and quantification results. The output results of MaxReport are well organized and therefore helpful for proteomic users to better understand and share their data. The script of MaxReport, which is freely available at http://websdoor.net/bioinfo/maxreport/, is developed using Python code and is compatible across multiple systems including Windows and Linux.

  6. Systematically Ranking the Tightness of Membrane Association for Peripheral Membrane Proteins (PMPs)*

    PubMed Central

    Gao, Liyan; Ge, Haitao; Huang, Xiahe; Liu, Kehui; Zhang, Yuanya; Xu, Wu; Wang, Yingchun

    2015-01-01

    Large-scale quantitative evaluation of the tightness of membrane association for nontransmembrane proteins is important for identifying true peripheral membrane proteins with functional significance. Herein, we simultaneously ranked more than 1000 proteins of the photosynthetic model organism Synechocystis sp. PCC 6803 for their relative tightness of membrane association using a proteomic approach. Using multiple precisely ranked and experimentally verified peripheral subunits of photosynthetic protein complexes as the landmarks, we found that proteins involved in two-component signal transduction systems and transporters are overall tightly associated with the membranes, whereas the associations of ribosomal proteins are much weaker. Moreover, we found that hypothetical proteins containing the same domains generally have similar tightness. This work provided a global view of the structural organization of the membrane proteome with respect to divergent functions, and built the foundation for future investigation of the dynamic membrane proteome reorganization in response to different environmental or internal stimuli. PMID:25505158

  7. Multiplexed and scalable super-resolution imaging of three-dimensional protein localization in size-adjustable tissues.

    PubMed

    Ku, Taeyun; Swaney, Justin; Park, Jeong-Yoon; Albanese, Alexandre; Murray, Evan; Cho, Jae Hun; Park, Young-Gyun; Mangena, Vamsi; Chen, Jiapei; Chung, Kwanghun

    2016-09-01

    The biology of multicellular organisms is coordinated across multiple size scales, from the subnanoscale of molecules to the macroscale, tissue-wide interconnectivity of cell populations. Here we introduce a method for super-resolution imaging of the multiscale organization of intact tissues. The method, called magnified analysis of the proteome (MAP), linearly expands entire organs fourfold while preserving their overall architecture and three-dimensional proteome organization. MAP is based on the observation that preventing crosslinking within and between endogenous proteins during hydrogel-tissue hybridization allows for natural expansion upon protein denaturation and dissociation. The expanded tissue preserves its protein content, its fine subcellular details, and its organ-scale intercellular connectivity. We use off-the-shelf antibodies for multiple rounds of immunolabeling and imaging of a tissue's magnified proteome, and our experiments demonstrate a success rate of 82% (100/122 antibodies tested). We show that specimen size can be reversibly modulated to image both inter-regional connections and fine synaptic architectures in the mouse brain.

  8. Proteomic and N-glycoproteomic quantification reveal aberrant changes in the human saliva of oral ulcer patients.

    PubMed

    Zhang, Ying; Wang, Xi; Cui, Dan; Zhu, Jun

    2016-12-01

    Human whole saliva is a vital body fluid for studying the physiology and pathology of the oral cavity. As a powerful technique for biomarker discovery, MS-based proteomic strategies have been introduced for saliva analysis and identified hundreds of proteins and N-glycosylation sites. However, there is still a lack of quantitative analysis, which is necessary for biomarker screening and biological research. In this study, we establish an integrated workflow by the combination of stable isotope dimethyl labeling, HILIC enrichment, and high resolution MS for both quantification of the global proteome and N-glycoproteome of human saliva from oral ulcer patients. With the help of advanced bioinformatics, we comprehensively studied oral ulcers at both protein and glycoprotein scales. Bioinformatics analyses revealed that starch digestion and protein degradation activities are inhibited while the immune response is promoted in oral ulcer saliva. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  9. Large-Scale Chemical Similarity Networks for Target Profiling of Compounds Identified in Cell-Based Chemical Screens

    PubMed Central

    Lo, Yu-Chen; Senese, Silvia; Li, Chien-Ming; Hu, Qiyang; Huang, Yong; Damoiseaux, Robert; Torres, Jorge Z.

    2015-01-01

    Target identification is one of the most critical steps following cell-based phenotypic chemical screens aimed at identifying compounds with potential uses in cell biology and for developing novel disease therapies. Current in silico target identification methods, including chemical similarity database searches, are limited to single or sequential ligand analysis that have limited capabilities for accurate deconvolution of a large number of compounds with diverse chemical structures. Here, we present CSNAP (Chemical Similarity Network Analysis Pulldown), a new computational target identification method that utilizes chemical similarity networks for large-scale chemotype (consensus chemical pattern) recognition and drug target profiling. Our benchmark study showed that CSNAP can achieve an overall higher accuracy (>80%) of target prediction with respect to representative chemotypes in large (>200) compound sets, in comparison to the SEA approach (60–70%). Additionally, CSNAP is capable of integrating with biological knowledge-based databases (Uniprot, GO) and high-throughput biology platforms (proteomic, genetic, etc) for system-wise drug target validation. To demonstrate the utility of the CSNAP approach, we combined CSNAP's target prediction with experimental ligand evaluation to identify the major mitotic targets of hit compounds from a cell-based chemical screen and we highlight novel compounds targeting microtubules, an important cancer therapeutic target. The CSNAP method is freely available and can be accessed from the CSNAP web server (http://services.mbi.ucla.edu/CSNAP/). PMID:25826798

  10. Top-down proteomics reveals a unique protein S-thiolation switch in Salmonella Typhimurium in response to infection-like conditions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ansong, Charles; Wu, Si; Meng, Da

    Characterization of the mature protein complement in cells is crucial for a better understanding of cellular processes on a systems-wide scale. Bottom-up proteomic approaches often lead to loss of critical information about an endogenous protein’s actual state due to post translational modifications (PTMs) and other processes. Top-down approaches that involve analysis of the intact protein can address this concern but present significant analytical challenges related to the separation quality needed, measurement sensitivity, and speed that result in low throughput and limited coverage. Here we used single-dimension ultra high pressure liquid chromatography mass spectrometry to investigate the comprehensive ‘intact’ proteome of the Gram negative bacterial pathogen Salmonella Typhimurium. Top-down proteomics analysis revealed 563 unique proteins including 1665 proteoforms generated by PTMs, representing the largest microbial top-down dataset reported to date. Our analysis not only confirmed several previously recognized aspects of Salmonella biology and bacterial PTMs in general, but also revealed several novel biological insights. Of particular interest was differential utilization of the protein S-thiolation forms S-glutathionylation and S-cysteinylation in response to infection-like conditions versus basal conditions, which was corroborated by changes in corresponding biosynthetic pathways. This differential utilization highlights underlying metabolic mechanisms that modulate changes in cellular signaling, and represents to our knowledge the first report of S-cysteinylation in Gram negative bacteria. The demonstrated utility of our simple proteome-wide intact protein level measurement strategy for gaining biological insight should promote broader adoption and applications of top-down proteomics approaches.

  11. Redox Proteomics of Protein-bound Methionine Oxidation*

    PubMed Central

    Ghesquière, Bart; Jonckheere, Veronique; Colaert, Niklaas; Van Durme, Joost; Timmerman, Evy; Goethals, Marc; Schymkowitz, Joost; Rousseau, Frederic; Vandekerckhove, Joël; Gevaert, Kris

    2011-01-01

    We here present a new method to measure the degree of protein-bound methionine sulfoxide formation at a proteome-wide scale. In human Jurkat cells that were stressed with hydrogen peroxide, over 2000 oxidation-sensitive methionines in more than 1600 different proteins were mapped and their extent of oxidation was quantified. Meta-analysis of the sequences surrounding the oxidized methionine residues revealed a high preference for neighboring polar residues. Using synthetic methionine sulfoxide containing peptides designed according to the observed sequence preferences in the oxidized Jurkat proteome, we discovered that the substrate specificity of the cellular methionine sulfoxide reductases is a major determinant for the steady-state of methionine oxidation. This was supported by a structural modeling of the MsrA catalytic center. Finally, we applied our method onto a serum proteome from a mouse sepsis model and identified 35 in vivo methionine oxidation events in 27 different proteins. PMID:21406390

  12. Using Public Data for Comparative Proteome Analysis in Precision Medicine Programs.

    PubMed

    Hughes, Christopher S; Morin, Gregg B

    2018-03-01

    Maximizing the clinical utility of information obtained in longitudinal precision medicine programs would benefit from robust comparative analyses to known information to assess biological features of patient material toward identifying the underlying features driving their disease phenotype. Herein, the potential for utilizing publicly deposited mass-spectrometry-based proteomics data to perform inter-study comparisons of cell-line or tumor-tissue materials is investigated. To investigate the robustness of comparison between MS-based proteomics studies carried out with different methodologies, deposited data representative of label-free (MS1) and isobaric tagging (MS2 and MS3 quantification) are utilized. In-depth quantitative proteomics data acquired from analysis of ovarian cancer cell lines revealed the robust recapitulation of observable gene expression dynamics between individual studies carried out using significantly different methodologies. The observed signatures enable robust inter-study clustering of cell line samples. In addition, the ability to classify and cluster tumor samples based on observed gene expression trends when using a single patient sample is established. With this analysis, relevant gene expression dynamics are obtained from a single patient tumor, in the context of a precision medicine analysis, by leveraging a large cohort of repository data as a comparator. Together, these data establish the potential for state-of-the-art MS-based proteomics data to serve as resources for robust comparative analyses in precision medicine applications. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  13. Introducing the CPL/MUW proteome database: interpretation of human liver and liver cancer proteome profiles by referring to isolated primary cells.

    PubMed

    Wimmer, Helge; Gundacker, Nina C; Griss, Johannes; Haudek, Verena J; Stättner, Stefan; Mohr, Thomas; Zwickl, Hannes; Paulitschke, Verena; Baron, David M; Trittner, Wolfgang; Kubicek, Markus; Bayer, Editha; Slany, Astrid; Gerner, Christopher

    2009-06-01

    Interpretation of proteome data with a focus on biomarker discovery largely relies on comparative proteome analyses. Here, we introduce a database-assisted interpretation strategy based on proteome profiles of primary cells. Both 2-D-PAGE and shotgun proteomics are applied. We obtain high data concordance with these two different techniques. When applying mass analysis of tryptic spot digests from 2-D gels of cytoplasmic fractions, we typically identify several hundred proteins. Using the same protein fractions, we usually identify more than a thousand proteins by shotgun proteomics. The data consistency obtained when comparing these independent data sets exceeds 99% of the proteins identified in the 2-D gels. Many characteristic differences in protein expression of different cells can thus be independently confirmed. Our self-designed SQL database (CPL/MUW - database of the Clinical Proteomics Laboratories at the Medical University of Vienna accessible via www.meduniwien.ac.at/proteomics/database) facilitates (i) quality management of protein identification data, which are based on MS, (ii) the detection of cell type-specific proteins and (iii) of molecular signatures of specific functional cell states. Here, we demonstrate how the interpretation of proteome profiles obtained from human liver tissue and hepatocellular carcinoma tissue is assisted by the Clinical Proteomics Laboratories at the Medical University of Vienna-database. Therefore, we suggest that the use of reference experiments supported by a tailored database may substantially facilitate data interpretation of proteome profiling experiments.

  14. Using Proteomics to Understand How Leishmania Parasites Survive inside the Host and Establish Infection

    PubMed Central

    Veras, Patrícia Sampaio Tavares; Bezerra de Menezes, Juliana Perrone

    2016-01-01

    Leishmania is a protozoan parasite that causes a wide range of different clinical manifestations in mammalian hosts. It is a major public health risk on different continents and represents one of the most important neglected diseases. Due to the high toxicity of the drugs currently used, and in the light of increasing drug resistance, there is a critical need to develop new drugs and vaccines to control Leishmania infection. Over the past few years, proteomics has become an important tool to understand the underlying biology of Leishmania parasites and host interaction. The large-scale study of proteins, both in parasites and within the host in response to infection, can accelerate the discovery of new therapeutic targets. By studying the proteomes of host cells and tissues infected with Leishmania, as well as changes in protein profiles among promastigotes and amastigotes, scientists hope to better understand the biology involved in the parasite survival and the host-parasite interaction. This review demonstrates the feasibility of proteomics as an approach to identify new proteins involved in Leishmania differentiation and intracellular survival. PMID:27548150

  15. Using Proteomics to Understand How Leishmania Parasites Survive inside the Host and Establish Infection.

    PubMed

    Veras, Patrícia Sampaio Tavares; Bezerra de Menezes, Juliana Perrone

    2016-08-19

    Leishmania is a protozoan parasite that causes a wide range of different clinical manifestations in mammalian hosts. It is a major public health risk on different continents and represents one of the most important neglected diseases. Due to the high toxicity of the drugs currently used, and in the light of increasing drug resistance, there is a critical need to develop new drugs and vaccines to control Leishmania infection. Over the past few years, proteomics has become an important tool to understand the underlying biology of Leishmania parasites and host interaction. The large-scale study of proteins, both in parasites and within the host in response to infection, can accelerate the discovery of new therapeutic targets. By studying the proteomes of host cells and tissues infected with Leishmania, as well as changes in protein profiles among promastigotes and amastigotes, scientists hope to better understand the biology involved in the parasite survival and the host-parasite interaction. This review demonstrates the feasibility of proteomics as an approach to identify new proteins involved in Leishmania differentiation and intracellular survival.

  16. Mass spectrometry-based proteomics: from cancer biology to protein biomarkers, drug targets, and clinical applications.

    PubMed

    Jimenez, Connie R; Verheul, Henk M W

    2014-01-01

    Proteomics is optimally suited to bridge the gap between genomic information on the one hand and biologic functions and disease phenotypes on the other, since it studies the expression and/or post-translational modification (especially phosphorylation) of proteins--the major cellular players bringing about cellular functions--at a global level in biologic specimens. Mass spectrometry technology and (bio)informatic tools have matured to the extent that they can provide high-throughput, comprehensive, and quantitative protein inventories of cells, tissues, and biofluids in clinical samples at low level. In this article, we focus on next-generation proteomics employing nanoliquid chromatography coupled to high-resolution tandem mass spectrometry for in-depth (phospho)protein profiling of tumor tissues and (proximal) biofluids, with a focus on studies employing clinical material. In addition, we highlight emerging proteogenomic approaches for the identification of tumor-specific protein variants, and targeted multiplex mass spectrometry strategies for large-scale biomarker validation. Below we provide a discussion of recent progress, some research highlights, and challenges that remain for clinical translation of proteomic discoveries.

  17. A Systematic Analysis of a Deep Mouse Epididymal Sperm Proteome

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chauvin, Theodore; Xie, Fang; Liu, Tao

    Spermatozoa are highly specialized cells that, when mature, are capable of navigating the female reproductive tract and fertilizing an oocyte. The sperm cell is thought to be largely quiescent in terms of transcriptional and translational activity. As a result, once it has left the male reproductive tract, the sperm cell is essentially operating with a static population of proteins. It is therefore theoretically possible to understand the protein networks contained in a sperm cell and to deduce its cellular function capabilities. To this end we have performed a proteomic analysis of mouse sperm isolated from the cauda epididymis and have confidently identified 2,850 proteins, which is the most comprehensive sperm proteome for any species reported to date. These proteins comprise many complete cellular pathways, including those for energy production via glycolysis, β-oxidation and oxidative phosphorylation, protein folding and transport, and cell signaling systems. This proteome should prove a useful tool for assembly and testing of protein networks important for sperm function.

  18. Insights into temperature modulation of the Eucalyptus globulus and Eucalyptus grandis antioxidant and lignification subproteomes.

    PubMed

    de Santana Costa, Marília Gabriela; Mazzafera, Paulo; Balbuena, Tiago Santana

    2017-05-01

    Eucalyptus grandis and Eucalyptus globulus are among the most widely cultivated trees, differing in lignin composition and plantation areas, as E. grandis is mostly cultivated in tropical regions while E. globulus is preferred in temperate areas. As temperature is a key modulator in plant metabolism, a large-scale proteome analysis was carried out to investigate changes in the antioxidant system and the lignification metabolism in plantlets grown at different temperatures. Our strategy allowed the identification of 3111 stem proteins. A total of 103 antioxidant proteins were detected in the stems of both species. Hierarchical clustering revealed that alterations in the antioxidant proteins are more prominent when Eucalyptus seedlings were exposed to high temperature and that the superoxide isoforms coded by the gene Eucgr.B03930 are the most abundant antioxidant enzymes induced by thermal stimulus. Regarding the lignin biosynthesis, our proteomics approach resulted in the identification of 13 of the 17 core proteins involved in this metabolism, corroborating with gene predictions and the proposed lignin toolbox. Quantitative analyses revealed significant differences in 8 protein isoforms, including the ferulate 5-hydroxylase isoform F5H1, a key enzyme in catalyzing the synthesis of sinapyl alcohol, and the cinnamyl alcohol dehydrogenase isoform CAD2, the last enzyme in monolignol biosynthesis. Data are available via ProteomeXchange with identifier PXD005743. Copyright © 2017 Elsevier Ltd. All rights reserved.

  19. BIG: a large-scale data integration tool for renal physiology

    PubMed Central

    Zhao, Yue; Yang, Chin-Rang; Raghuram, Viswanathan; Parulekar, Jaya

    2016-01-01

    Due to recent advances in high-throughput techniques, we and others have generated multiple proteomic and transcriptomic databases to describe and quantify gene expression, protein abundance, or cellular signaling on the scale of the whole genome/proteome in kidney cells. The existence of so much data from diverse sources raises the following question: “How can researchers find information efficiently for a given gene product over all of these data sets without searching each data set individually?” This is the type of problem that has motivated the “Big-Data” revolution in Data Science, which has driven progress in fields such as marketing. Here we present an online Big-Data tool called BIG (Biological Information Gatherer) that allows users to submit a single online query to obtain all relevant information from all indexed databases. BIG is accessible at http://big.nhlbi.nih.gov/. PMID:27279488

  20. Proteinortho: detection of (co-)orthologs in large-scale analysis.

    PubMed

    Lechner, Marcus; Findeiss, Sven; Steiner, Lydia; Marz, Manja; Stadler, Peter F; Prohaska, Sonja J

    2011-04-28

    Orthology analysis is an important part of data analysis in many areas of bioinformatics such as comparative genomics and molecular phylogenetics. The ever-increasing flood of sequence data, and hence the rapidly increasing number of genomes that can be compared simultaneously, calls for efficient software tools as brute-force approaches with quadratic memory requirements become infeasible in practise. The rapid pace at which new data become available, furthermore, makes it desirable to compute genome-wide orthology relations for a given dataset rather than relying on relations listed in databases. The program Proteinortho described here is a stand-alone tool that is geared towards large datasets and makes use of distributed computing techniques when run on multi-core hardware. It implements an extended version of the reciprocal best alignment heuristic. We apply Proteinortho to compute orthologous proteins in the complete set of all 717 eubacterial genomes available at NCBI at the beginning of 2009. We identified thirty proteins present in 99% of all bacterial proteomes. Proteinortho significantly reduces the required amount of memory for orthology analysis compared to existing tools, allowing such computations to be performed on off-the-shelf hardware.

  1. Optimization of quantitative proteomic analysis of clots generated from plasma of patients with venous thromboembolism.

    PubMed

    Stachowicz, Aneta; Siudut, Jakub; Suski, Maciej; Olszanecki, Rafał; Korbut, Ryszard; Undas, Anetta; Wiśniewski, Jacek R

    2017-01-01

    It is well known that the fibrin network binds a large variety of proteins, including inhibitors and activators of fibrinolysis, which may affect clot properties, such as stability and susceptibility to fibrinolysis. Specific plasma clot composition differs between individuals and may change in disease states. However, the plasma clot proteome has not yet been analyzed in depth, mainly due to technical difficulty related to the presence of the highly abundant protein fibrinogen and the fibrin that forms a plasma clot. The aim of our study was to optimize quantitative proteomic analysis of fibrin clots prepared ex vivo from citrated plasma of the peripheral blood drawn from patients with prior venous thromboembolism (VTE). We used a multienzyme digestion filter-aided sample preparation (MED FASP) method combined with LC-MS/MS analysis performed on a Proxeon Easy-nLC System coupled to the Q Exactive HF mass spectrometer. We also evaluated the impact of peptide fractionation with a pipet-tip strong anion exchange (SAX) method on the obtained results. Our proteomic approach revealed 476 proteins repeatedly identified in the plasma fibrin clots from patients with VTE, including extracellular vesicle-derived proteins, lipoproteins, fibrinolysis inhibitors, and proteins involved in immune responses. The MED FASP method using three different enzymes (LysC, trypsin and chymotrypsin) increased the number of identified peptides and proteins and their sequence coverage as compared to a single-step digestion. Peptide fractionation with a pipet-tip SAX protocol increased the depth of proteomic analyses, but also extended the time needed for sample analysis with LC-MS/MS. The MED FASP method combined with label-free quantification is an excellent proteomic approach for the analysis of fibrin clots prepared ex vivo from citrated plasma of patients with prior VTE.

  2. Data Use Agreement | Office of Cancer Clinical Proteomics Research

    Cancer.gov

    CPTAC requests that data users abide by the same principles that were previously established in the Fort Lauderdale and Amsterdam meetings. The recommendations from the Fort Lauderdale meeting (2003) on best practices and principles for sharing large-scale genomic data address the roles and responsibilities of data producers, data users and funders of community resource projects.

  3. Large-scale Proteomics Analysis of the Human Kinome

    PubMed Central

    Oppermann, Felix S.; Gnad, Florian; Olsen, Jesper V.; Hornberger, Renate; Greff, Zoltán; Kéri, György; Mann, Matthias; Daub, Henrik

    2009-01-01

    Members of the human protein kinase superfamily are the major regulatory enzymes involved in the activity control of eukaryotic signal transduction pathways. As protein kinases reside at the nodes of phosphorylation-based signal transmission, comprehensive analysis of their cellular expression and site-specific phosphorylation can provide important insights into the architecture and functionality of signaling networks. However, in global proteome studies, low cellular abundance of protein kinases often results in rather minor peptide species that are occluded by a vast excess of peptides from other cellular proteins. These analytical limitations create a rationale for kinome-wide enrichment of protein kinases prior to mass spectrometry analysis. Here, we employed stable isotope labeling by amino acids in cell culture (SILAC) to compare the binding characteristics of three kinase-selective affinity resins by quantitative mass spectrometry. The evaluated pre-fractionation tools possessed pyrido[2,3-d]pyrimidine-based kinase inhibitors as immobilized capture ligands and retained considerable subsets of the human kinome. Based on these results, an affinity resin displaying the broadly selective kinase ligand VI16832 was employed to quantify the relative expression of more than 170 protein kinases across three different, SILAC-encoded cancer cell lines. These experiments demonstrated the feasibility of comparative kinome profiling in a compact experimental format. Interestingly, we found high levels of cytoplasmic and low levels of receptor tyrosine kinases in MV4–11 leukemia cells compared with the adherent cancer lines HCT116 and MDA-MB-435S. The VI16832 resin was further exploited to pre-fractionate kinases for targeted phosphoproteomics analysis, which revealed about 1200 distinct phosphorylation sites on more than 200 protein kinases. 
    This hitherto largest survey of site-specific phosphorylation across the kinome significantly expands the basis for functional follow-up studies on protein kinase regulation. In conclusion, the straightforward experimental procedures described here enable different implementations of kinase-selective proteomics with considerable potential for future signal transduction and kinase drug target analysis. PMID:19369195

  4. Serum quantitative proteomic analysis reveals potential zinc-associated biomarkers for nonbacterial prostatitis.

    PubMed

    Yang, Xiaoli; Li, Hongtao; Zhang, Chengdong; Lin, Zhidi; Zhang, Xinhua; Zhang, Youjie; Yu, Yanbao; Liu, Kun; Li, Muyan; Zhang, Yuening; Lv, Wenxin; Xie, Yuanliang; Lu, Zheng; Wu, Chunlei; Teng, Ruobing; Lu, Shaoming; He, Min; Mo, Zengnan

    2015-10-01

    Prostatitis is one of the most common urological problems afflicting adult men. The etiology and pathogenesis of nonbacterial prostatitis, which accounts for 90-95% of cases, is largely unknown. As serum proteins often indicate the overall pathologic status of patients, we hypothesized that protein biomarkers of prostatitis might be identified by comparing the serum proteomes of patients with and without nonbacterial prostatitis. All untreated samples were collected from subjects attending the Fangchenggang Area Male Health and Examination Survey (FAMHES). We profiled pooled serum samples from four carefully selected groups of patients (n = 10/group) representing the various categories of nonbacterial prostatitis (IIIa, IIIb, and IV) and matched healthy controls using a mass spectrometry-based 4-plex iTRAQ proteomic approach. More than 160 samples were validated by ELISA. Overall, 69 proteins were identified. Among them, 42, 52, and 37 proteins were identified with differential expression in Category IIIa, IIIb, and IV prostatitis, respectively. The 19 common proteins were related to immunity and defense, ion binding, transport, and proteolysis. Two zinc-binding proteins, superoxide dismutase 3 (SOD3), and carbonic anhydrase I (CA1), were significantly higher in all types of prostatitis than in the control. A receiver operating characteristic curve estimated sensitivities of 50.4 and 68.1% and specificities of 92.1 and 83.8% for CA1 and SOD3, respectively, in detecting nonbacterial prostatitis. The serum CA1 concentration was inversely correlated to the zinc concentration in expressed-prostatic secretions. Our findings suggest that SOD3 and CA1 are potential diagnostic markers of nonbacterial prostatitis, although further large-scale studies are required. The molecular profiles of nonbacterial prostatitis pathogenesis may lay a foundation for discovery of new therapies. © 2015 Wiley Periodicals, Inc.

  5. Hydra: a scalable proteomic search engine which utilizes the Hadoop distributed computing framework

    PubMed Central

    2012-01-01

    Background For shotgun mass spectrometry based proteomics the most computationally expensive step is in matching the spectra against an increasingly large database of sequences and their post-translational modifications with known masses. Each mass spectrometer can generate data at an astonishingly high rate, and the scope of what is searched for is continually increasing. Therefore solutions for improving our ability to perform these searches are needed. Results We present a sequence database search engine that is specifically designed to run efficiently on the Hadoop MapReduce distributed computing framework. The search engine implements the K-score algorithm, generating comparable output for the same input files as the original implementation. The scalability of the system is shown, and the architecture required for the development of such distributed processing is discussed. Conclusion The software is scalable in its ability to handle a large peptide database, numerous modifications and large numbers of spectra. Performance scales with the number of processors in the cluster, allowing throughput to expand with the available resources. PMID:23216909

  6. Hydra: a scalable proteomic search engine which utilizes the Hadoop distributed computing framework.

    PubMed

    Lewis, Steven; Csordas, Attila; Killcoyne, Sarah; Hermjakob, Henning; Hoopmann, Michael R; Moritz, Robert L; Deutsch, Eric W; Boyle, John

    2012-12-05

    For shotgun mass spectrometry based proteomics the most computationally expensive step is in matching the spectra against an increasingly large database of sequences and their post-translational modifications with known masses. Each mass spectrometer can generate data at an astonishingly high rate, and the scope of what is searched for is continually increasing. Therefore solutions for improving our ability to perform these searches are needed. We present a sequence database search engine that is specifically designed to run efficiently on the Hadoop MapReduce distributed computing framework. The search engine implements the K-score algorithm, generating comparable output for the same input files as the original implementation. The scalability of the system is shown, and the architecture required for the development of such distributed processing is discussed. The software is scalable in its ability to handle a large peptide database, numerous modifications and large numbers of spectra. Performance scales with the number of processors in the cluster, allowing throughput to expand with the available resources.

  7. Expediting SRM assay development for large-scale targeted proteomics experiments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wu, Chaochao; Shi, Tujin; Brown, Joseph N.

    2014-08-22

    Due to their high sensitivity and specificity, targeted proteomics measurements, e.g. selected reaction monitoring (SRM), are becoming increasingly popular for biological and translational applications. Selection of optimal transitions and optimization of collision energy (CE) are important assay development steps for achieving sensitive detection and accurate quantification; however, these steps can be labor-intensive, especially for large-scale applications. Herein, we explored several options for accelerating SRM assay development evaluated in the context of a relatively large set of 215 synthetic peptide targets. We first showed that HCD fragmentation is very similar to CID in triple quadrupole (QQQ) instrumentation, and by selection of top six y fragment ions from HCD spectra, >86% of top transitions optimized from direct infusion on QQQ instrument are covered. We also demonstrated that the CE calculated by existing prediction tools was less accurate for +3 precursors, and a significant increase in intensity for transitions could be obtained using a new CE prediction equation constructed from the present experimental data. Overall, our study illustrates the feasibility of expediting the development of larger numbers of high-sensitivity SRM assays through automation of transitions selection and accurate prediction of optimal CE to improve both SRM throughput and measurement quality.

  8. Optimizing of MALDI-ToF-based low-molecular-weight serum proteome pattern analysis in detection of breast cancer patients; the effect of albumin removal on classification performance.

    PubMed

    Pietrowska, M; Marczak, L; Polanska, J; Nowicka, E; Behrent, K; Tarnawski, R; Stobiecki, M; Polanski, A; Widlak, P

    2010-01-01

    Mass spectrometry-based analysis of the serum proteome allows identifying multi-peptide patterns/signatures specific for blood of cancer patients, thus having high potential value for cancer diagnostics. However, because of problems with optimization and standardization of experimental and computational design, none of identified proteome patterns/signatures was approved for diagnostics in clinical practice as yet. Here we compared two methods of serum sample preparation for mass spectrometry-based proteome pattern analysis aimed to identify biomarkers that could be used in early detection of breast cancer patients. Blood samples were collected in a group of 92 patients diagnosed at early (I and II) stages of the disease before the start of therapy, and in a group of age-matched healthy controls (104 women). Serum specimens were purified and analyzed using MALDI-ToF spectrometry, either directly or after membrane filtration (50 kDa cut-off) to remove albumin and other large serum proteins. Mass spectra of the low-molecular-weight fraction (2-10 kDa) of the serum proteome were resolved using the Gaussian mixture decomposition, and identified spectral components were used to build classifiers that differentiated samples from breast cancer patients and healthy persons. Mass spectra of complete serum and membrane-filtered albumin-depleted samples have apparently different structure and peaks specific for both types of samples could be identified. The optimal classifier built for the complete serum specimens consisted of 8 spectral components, and had 81% specificity and 72% sensitivity, while that built for the membrane-filtered samples consisted of 4 components, and had 80% specificity and 81% sensitivity. 
    We concluded that pre-processing of samples to remove albumin might be recommended before MALDI-ToF mass spectrometric analysis of the low-molecular-weight components of human serum. Keywords: albumin removal; breast cancer; clinical proteomics; mass spectrometry; pattern analysis; serum proteome.
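
    The sensitivity and specificity figures quoted for the classifiers follow from a confusion matrix. The sketch below shows that arithmetic on a small made-up set of labels; the data and the function name are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(truth, predicted):
    """Compute (sensitivity, specificity) from paired boolean labels.

    `truth[i]` is True for a patient and False for a healthy control;
    `predicted[i]` is the classifier's call for the same sample.
    """
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # patients called positive
    fn = sum(1 for t, p in pairs if t and not p)      # patients missed
    tn = sum(1 for t, p in pairs if not t and not p)  # controls called negative
    fp = sum(1 for t, p in pairs if not t and p)      # controls called positive
    return tp / (tp + fn), tn / (tn + fp)


# Made-up example: five patients followed by five controls.
truth = [True] * 5 + [False] * 5
predicted = [True, True, True, True, False, False, False, False, False, True]

sens, spec = sensitivity_specificity(truth, predicted)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```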

  9. A comprehensive and scalable database search system for metaproteomics.

    PubMed

    Chatterjee, Sandip; Stupp, Gregory S; Park, Sung Kyu Robin; Ducom, Jean-Christophe; Yates, John R; Su, Andrew I; Wolan, Dennis W

    2016-08-16

    Mass spectrometry-based shotgun proteomics experiments rely on accurate matching of experimental spectra against a database of protein sequences. Existing computational analysis methods are limited in the size of their sequence databases, which severely restricts the proteomic sequencing depth and functional analysis of highly complex samples. The growing amount of public high-throughput sequencing data will only exacerbate this problem. We designed a broadly applicable metaproteomic analysis method (ComPIL) that addresses protein database size limitations. Our approach to overcome this significant limitation in metaproteomics was to design a scalable set of sequence databases assembled for optimal library querying speeds. ComPIL was integrated with a modified version of the search engine ProLuCID (termed "Blazmass") to permit rapid matching of experimental spectra. Proof-of-principle analysis of human HEK293 lysate with a ComPIL database derived from high-quality genomic libraries was able to detect nearly all of the same peptides as a search with a human database (~500x fewer peptides in the database), with a small reduction in sensitivity. We were also able to detect proteins from the adenovirus used to immortalize these cells. We applied our method to a set of healthy human gut microbiome proteomic samples and showed a substantial increase in the number of identified peptides and proteins compared to previous metaproteomic analyses, while retaining a high degree of protein identification accuracy and allowing for a more in-depth characterization of the functional landscape of the samples. The combination of ComPIL with Blazmass allows proteomic searches to be performed with database sizes much larger than previously possible. These large database searches can be applied to complex meta-samples with unknown composition or proteomic samples where unexpected proteins may be identified. 
    The protein database, proteomic search engine, and the proteomic data files for the 5 microbiome samples characterized and discussed herein are open source and available for use and additional analysis.

  10. WholePathwayScope: a comprehensive pathway-based analysis tool for high-throughput data

    PubMed Central

    Yi, Ming; Horton, Jay D; Cohen, Jonathan C; Hobbs, Helen H; Stephens, Robert M

    2006-01-01

    Background Analysis of High Throughput (HTP) Data such as microarray and proteomics data has provided a powerful methodology to study patterns of gene regulation at genome scale. A major unresolved problem in the post-genomic era is to assemble the large amounts of data generated into a meaningful biological context. We have developed a comprehensive software tool, WholePathwayScope (WPS), for deriving biological insights from analysis of HTP data. Result WPS extracts gene lists with shared biological themes through color cue templates. WPS statistically evaluates global functional category enrichment of gene lists and pathway-level pattern enrichment of data. WPS incorporates well-known biological pathways from KEGG (Kyoto Encyclopedia of Genes and Genomes) and Biocarta, GO (Gene Ontology) terms as well as user-defined pathways or relevant gene clusters or groups, and explores gene-term relationships within the derived gene-term association networks (GTANs). WPS simultaneously compares multiple datasets within biological contexts either as pathways or as association networks. WPS also integrates Genetic Association Database and Partial MedGene Database for disease-association information. We have used this program to analyze and compare microarray and proteomics datasets derived from a variety of biological systems. Application examples demonstrated the capacity of WPS to significantly facilitate the analysis of HTP data for integrative discovery. Conclusion This tool represents a pathway-based platform for discovery integration to maximize analysis power. The tool is freely available at . PMID:16423281
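
    The abstract does not say which statistic WPS uses for functional category enrichment. A common choice for this kind of test is the one-sided hypergeometric test, sketched below as an assumption rather than as a description of WPS itself; the function name and the numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, drawn, hits):
    """One-sided hypergeometric p-value P(X >= hits).

    population: number of genes in the background
    annotated:  background genes carrying the category of interest
    drawn:      size of the gene list being tested
    hits:       genes in the list that carry the category
    """
    upper = min(annotated, drawn)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(hits, upper + 1)
    )
    return favourable / comb(population, drawn)


# Hypothetical numbers: 10 background genes, 5 in the category,
# a list of 5 genes that all fall in the category.
p = enrichment_p_value(population=10, annotated=5, drawn=5, hits=5)
print(p)
```

    In practice the p-values for many categories would also need a multiple-testing correction, which is left out of this sketch.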

  11. Progressive muscle proteome changes in a clinically relevant pig model of Duchenne muscular dystrophy.

    PubMed

    Fröhlich, Thomas; Kemter, Elisabeth; Flenkenthaler, Florian; Klymiuk, Nikolai; Otte, Kathrin A; Blutke, Andreas; Krause, Sabine; Walter, Maggie C; Wanke, Rüdiger; Wolf, Eckhard; Arnold, Georg J

    2016-09-16

    Duchenne muscular dystrophy (DMD) is caused by genetic deficiency of dystrophin and characterized by massive structural and functional changes of skeletal muscle tissue, leading to terminal muscle failure. We recently generated a novel genetically engineered pig model reflecting pathological hallmarks of human DMD better than the widely used mdx mouse. To get insight into the hierarchy of molecular derangements during DMD progression, we performed a proteome analysis of biceps femoris muscle samples from 2-day-old and 3-month-old DMD and wild-type (WT) pigs. The extent of proteome changes in DMD vs. WT muscle increased markedly with age, reflecting progression of the pathological changes. In 3-month-old DMD muscle, proteins related to muscle repair such as vimentin, nestin, desmin and tenascin C were found to be increased, whereas a large number of respiratory chain proteins were decreased in abundance in DMD muscle, indicating serious disturbances in aerobic energy production and a reduction of functional muscle tissue. The combination of proteome data for fiber type specific myosin heavy chain proteins and immunohistochemistry showed preferential degeneration of fast-twitch fiber types in DMD muscle. The stage-specific proteome changes detected in this large animal model of clinically severe muscular dystrophy provide novel molecular readouts for future treatment trials.

  12. The Response of the Root Proteome to the Synthetic Strigolactone GR24 in Arabidopsis*

    PubMed Central

    Walton, Alan; Stes, Elisabeth; Goeminne, Geert; Braem, Lukas; Vuylsteke, Marnik; Matthys, Cedrick; De Cuyper, Carolien; Staes, An; Vandenbussche, Jonathan; Boyer, François-Didier; Vanholme, Ruben; Fromentin, Justine; Boerjan, Wout; Gevaert, Kris; Goormachtig, Sofie

    2016-01-01

    Strigolactones are plant metabolites that act as phytohormones and rhizosphere signals. Whereas most research on unraveling the action mechanisms of strigolactones is focused on plant shoots, we investigated proteome adaptation during strigolactone signaling in the roots of Arabidopsis thaliana. Through large-scale, time-resolved, and quantitative proteomics, the impact of the strigolactone analog rac-GR24 was elucidated on the root proteome of the wild type and the signaling mutant more axillary growth 2 (max2). Our study revealed a clear MAX2-dependent rac-GR24 response: an increase in abundance of enzymes involved in flavonol biosynthesis, which was reduced in the max2–1 mutant. Mass spectrometry-driven metabolite profiling and thin-layer chromatography experiments demonstrated that these changes in protein expression lead to the accumulation of specific flavonols. Moreover, quantitative RT-PCR revealed that the flavonol-related protein expression profile was caused by rac-GR24-induced changes in transcript levels of the corresponding genes. This induction of flavonol production was shown to be activated by the two pure enantiomers that together make up rac-GR24. Finally, our data provide much needed clues concerning the multiple roles played by MAX2 in the roots and a comprehensive view of the rac-GR24-induced response in the root proteome. PMID:27317401

  13. Classification of Complete Proteomes of Different Organisms and Protein Sets Based on Their Protein Distributions in Terms of Some Key Attributes of Proteins

    PubMed Central

    Ma, Yue; Tuskan, Gerald A.

    2018-01-01

    The existence of complete genome sequences makes it important to develop different approaches for classification of large-scale data sets and to make extraction of biological insights easier. Here, we propose an approach for classification of complete proteomes/protein sets based on protein distributions on some basic attributes. We demonstrate the usefulness of this approach by determining protein distributions in terms of two attributes: protein lengths and protein intrinsic disorder contents (ID). The protein distributions based on L and ID are surveyed for representative proteome organisms and protein sets from the three domains of life. The two-dimensional maps (designated as fingerprints here) from the protein distribution densities in the LD space defined by ln(L) and ID are then constructed. The fingerprints for different organisms and protein sets are found to be distinct with each other, and they can therefore be used for comparative studies. As a test case, phylogenetic trees have been constructed based on the protein distribution densities in the fingerprints of proteomes of organisms without performing any protein sequence comparison and alignments. The phylogenetic trees generated are biologically meaningful, demonstrating that the protein distributions in the LD space may serve as unique phylogenetic signals of the organisms at the proteome level. PMID:29686995
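
    The fingerprint idea can be sketched as a normalized two-dimensional histogram over ln(L) and ID. The bin widths, the treatment of ID as a fraction between 0 and 1, the Euclidean distance between fingerprints, and all names below are assumptions made for illustration; the abstract does not specify them.

```python
import math
from collections import Counter


def fingerprint(proteins, l_bin=0.5, id_bin=0.1):
    """Normalized 2D histogram of proteins over (ln(length), disorder).

    `proteins` is an iterable of (length, disorder_fraction) pairs.
    Returns {(i, j): density} where densities sum to 1.
    """
    last_id_bin = int(round(1 / id_bin)) - 1
    counts = Counter()
    for length, disorder in proteins:
        i = int(math.log(length) // l_bin)
        j = min(int(disorder // id_bin), last_id_bin)  # keep ID = 1.0 in the top bin
        counts[(i, j)] += 1
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}


def fingerprint_distance(fp_a, fp_b):
    """Euclidean distance between two fingerprints (an assumed metric)."""
    cells = set(fp_a) | set(fp_b)
    return math.sqrt(sum((fp_a.get(c, 0.0) - fp_b.get(c, 0.0)) ** 2 for c in cells))


# Two tiny made-up protein sets: (length in residues, disorder fraction).
set_a = [(120, 0.05), (350, 0.25), (900, 0.62), (410, 0.22)]
set_b = [(80, 0.45), (95, 0.55), (300, 0.05)]

fp_a = fingerprint(set_a)
fp_b = fingerprint(set_b)
print(fingerprint_distance(fp_a, fp_b))
```

    A matrix of such pairwise distances could then be passed to a standard tree-building method, which is one way to obtain a tree without any sequence alignment.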

  14. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kolker, Eugene

    Our project focused primarily on analysis of different types of data produced by global high-throughput technologies, data integration of gene annotation, and gene and protein expression information, as well as on getting a better functional annotation of Shewanella genes. Specifically, four of our numerous major activities and achievements include the development of: statistical models for identification and expression proteomics, superior to currently available approaches (including our own earlier ones); approaches to improve gene annotations on the whole-organism scale; standards for annotation, transcriptomics and proteomics approaches; and generalized approaches for data integration of gene annotation, gene and protein expression information.

  15. Review of software tools for design and analysis of large scale MRM proteomic datasets.

    PubMed

    Colangelo, Christopher M; Chung, Lisa; Bruce, Can; Cheung, Kei-Hoi

    2013-06-15

    Selective or Multiple Reaction monitoring (SRM/MRM) is a liquid-chromatography (LC)/tandem-mass spectrometry (MS/MS) method that enables the quantitation of specific proteins in a sample by analyzing precursor ions and the fragment ions of their selected tryptic peptides. Instrumentation software has advanced to the point that thousands of transitions (pairs of primary and secondary m/z values) can be measured in a triple quadrupole instrument coupled to an LC, by a well-designed scheduling and selection of m/z windows. The design of a good MRM assay relies on the availability of peptide spectra from previous discovery-phase LC-MS/MS studies. The tedious aspect of manually developing and processing MRM assays involving thousands of transitions has spurred the development of software tools to automate this process. Software packages have been developed for project management, assay development, assay validation, data export, peak integration, quality assessment, and biostatistical analysis. No single tool provides a complete end-to-end solution; thus, this article reviews the current state and discusses future directions of these software tools in order to enable researchers to combine these tools for a comprehensive targeted proteomics workflow. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
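
    A transition, as described above, pairs a precursor m/z with a fragment m/z. The sketch below computes such pairs for the y ions of a peptide from standard monoisotopic residue masses. The peptide, the charge states, and the function names are chosen only for illustration, and no modifications are considered.

```python
# Monoisotopic residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def neutral_mass(sequence):
    """Monoisotopic mass of an unmodified peptide or C-terminal fragment."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mz(mass, charge):
    """m/z of a species with the given neutral mass and positive charge."""
    return (mass + charge * PROTON) / charge


def y_ion_transitions(sequence, precursor_charge=2, fragment_charge=1):
    """List (precursor m/z, y-ion label, fragment m/z), longest y ion first."""
    precursor = mz(neutral_mass(sequence), precursor_charge)
    transitions = []
    for start in range(1, len(sequence)):
        fragment = sequence[start:]  # C-terminal fragment, i.e. a y ion
        label = f"y{len(fragment)}"
        transitions.append((precursor, label, mz(neutral_mass(fragment), fragment_charge)))
    return transitions


for precursor, label, fragment in y_ion_transitions("PEPTIDE"):
    print(f"{precursor:.4f} -> {label} {fragment:.4f}")
```

    A real assay would then rank these candidates by observed intensity and interference, which is the part the reviewed software tools automate.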

  16. Lysine acetylome profiling uncovers novel histone deacetylase substrate proteins in Arabidopsis.

    PubMed

    Hartl, Markus; Füßl, Magdalena; Boersema, Paul J; Jost, Jan-Oliver; Kramer, Katharina; Bakirbas, Ahmet; Sindlinger, Julia; Plöchinger, Magdalena; Leister, Dario; Uhrig, Glen; Moorhead, Greg Bg; Cox, Jürgen; Salvucci, Michael E; Schwarzer, Dirk; Mann, Matthias; Finkemeier, Iris

    2017-10-23

    Histone deacetylases have central functions in regulating stress defenses and development in plants. However, the knowledge about the deacetylase functions is largely limited to histones, although these enzymes were found in diverse subcellular compartments. In this study, we determined the proteome-wide signatures of the RPD3/HDA1 class of histone deacetylases in Arabidopsis. Relative quantification of the changes in the lysine acetylation levels was determined on a proteome-wide scale after treatment of Arabidopsis leaves with deacetylase inhibitors apicidin and trichostatin A. We identified 91 new acetylated candidate proteins other than histones, which are potential substrates of the RPD3/HDA1-like histone deacetylases in Arabidopsis, at least 30 of which function in nucleic acid binding. Furthermore, our analysis revealed that histone deacetylase 14 (HDA14) is the first organellar-localized RPD3/HDA1 class protein found to reside in the chloroplasts and that the majority of its protein targets have functions in photosynthesis. Finally, the analysis of HDA14 loss-of-function mutants revealed that the activation state of RuBisCO is controlled by lysine acetylation of RuBisCO activase under low-light conditions. © 2017 The Authors. Published under the terms of the CC BY 4.0 license.

  17. Network Analysis of Epidermal Growth Factor Signaling Using Integrated Genomic, Proteomic and Phosphorylation Data

    PubMed Central

    Waters, Katrina M.; Liu, Tao; Quesenberry, Ryan D.; Willse, Alan R.; Bandyopadhyay, Somnath; Kathmann, Loel E.; Weber, Thomas J.; Smith, Richard D.; Wiley, H. Steven; Thrall, Brian D.

    2012-01-01

    To understand how integration of multiple data types can help decipher cellular responses at the systems level, we analyzed the mitogenic response of human mammary epithelial cells to epidermal growth factor (EGF) using whole genome microarrays, mass spectrometry-based proteomics and large-scale western blots with over 1000 antibodies. A time course analysis revealed significant differences in the expression of 3172 genes and 596 proteins, including protein phosphorylation changes measured by western blot. Integration of these disparate data types showed that each contributed qualitatively different components to the observed cell response to EGF and that varying degrees of concordance in gene expression and protein abundance measurements could be linked to specific biological processes. Networks inferred from individual data types were relatively limited, whereas networks derived from the integrated data recapitulated the known major cellular responses to EGF and exhibited more highly connected signaling nodes than networks derived from any individual dataset. While cell cycle regulatory pathways were altered as anticipated, we found the most robust response to mitogenic concentrations of EGF was induction of matrix metalloprotease cascades, highlighting the importance of the EGFR system as a regulator of the extracellular environment. These results demonstrate the value of integrating multiple levels of biological information to more accurately reconstruct networks of cellular response. PMID:22479638

  18. Proteomic study reveals a co-occurrence of gallic acid-induced apoptosis and glycolysis in B16F10 melanoma cells.

    PubMed

    Liu, Cheng; Lin, Jen-Jie; Yang, Zih-Yan; Tsai, Chi-Chu; Hsu, Jue-Liang; Wu, Yu-Jen

    2014-12-03

    Gallic acid (GA) has long been associated with a wide range of biological activities. In this study, its antitumor effect against B16F10 melanoma cells was demonstrated by MTT assay, cell migration assay, wound-healing assay, and flow cytometric analysis. GA with a concentration >200 μM shows apoptotic activity toward B16F10 cells. According to Western blotting data, overexpressions of cleaved forms of caspase-9, caspase-3, and PARP-1 and pro-apoptotic Bax and Bad, accompanied by underexpressed anti-apoptotic Bcl-2 and Bcl-xL indicate that GA induces B16F10 cell apoptosis via mitochondrial pathway. The 2-DE based comparative proteomics was further employed in B16F10 cells with and without GA treatment for a large-scale protein expression profiling. A total of 41 differential protein spots were quantified, and their identities were characterized using LC-MS/MS analysis and database matching. In addition to some regulated proteins that were associated with apoptosis, interestingly, some identified proteins involved in glycolysis such as glucokinase, α-enolase, aldolase, pyruvate kinase, and GAPDH were simultaneously up-regulated, which reveals that the GA-induced cellular apoptosis in B16 melanoma cells is associated with metabolic glycolysis.

  19. Evaluation of empirical rule of linearly correlated peptide selection (ERLPS) for proteotypic peptide-based quantitative proteomics.

    PubMed

    Liu, Kehui; Zhang, Jiyang; Fu, Bin; Xie, Hongwei; Wang, Yingchun; Qian, Xiaohong

    2014-07-01

    Precise protein quantification is essential in comparative proteomics. Currently, quantification bias is inevitable when using a proteotypic peptide-based quantitative proteomics strategy because of differences in peptide measurability. To improve quantification accuracy, we proposed an "empirical rule for linearly correlated peptide selection (ERLPS)" in quantitative proteomics in our previous work. However, a systematic evaluation of the general application of ERLPS in quantitative proteomics under diverse experimental conditions needs to be conducted. In this study, the practical workflow of ERLPS was explicitly illustrated; different experimental variables, such as different MS systems, sample complexities, sample preparations, elution gradients, matrix effects, loading amounts, and other factors were comprehensively investigated to evaluate the applicability, reproducibility, and transferability of ERLPS. The results demonstrated that ERLPS was highly reproducible and transferable within appropriate loading amounts and that linearly correlated response peptides should be selected for each specific experiment. ERLPS was applied to proteome samples from yeast to mouse and human, and in quantitative methods from label-free to O18/O16-labeled and SILAC analysis, and enabled accurate measurements for all proteotypic peptide-based quantitative proteomics over a large dynamic range. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  20. Compartment-resolved Proteomic Analysis of Mouse Aorta during Atherosclerotic Plaque Formation Reveals Osteoclast-specific Protein Expression.

    PubMed

    Wierer, Michael; Prestel, Matthias; Schiller, Herbert B; Yan, Guangyao; Schaab, Christoph; Azghandi, Sepiede; Werner, Julia; Kessler, Thorsten; Malik, Rainer; Murgia, Marta; Aherrahrou, Zouhair; Schunkert, Heribert; Dichgans, Martin; Mann, Matthias

    2018-02-01

    Atherosclerosis leads to vascular lesions that involve major rearrangements of the vascular proteome, especially of the extracellular matrix (ECM). Using single aortas from ApoE knock out mice, we quantified formation of plaques by single-run, high-resolution mass spectrometry (MS)-based proteomics. To probe localization on a proteome-wide scale we employed quantitative detergent solubility profiling. This compartment- and time-resolved resource of atherogenesis comprised 5117 proteins, 182 of which changed their expression status in response to vessel maturation and atherosclerotic plaque development. In the insoluble ECM proteome, 65 proteins significantly changed, including relevant collagens, matrix metalloproteinases and macrophage derived proteins. Among novel factors in atherosclerosis, we identified matrilin-2, the collagen IV crosslinking enzyme peroxidasin as well as the poorly characterized MAM-domain containing 2 (Mamdc2) protein as being up-regulated in the ECM during atherogenesis. Intriguingly, three subunits of the osteoclast specific V-ATPase complex were strongly increased in mature plaques with an enrichment in macrophages thus implying an active de-mineralization function. © 2018 by The American Society for Biochemistry and Molecular Biology, Inc.

  1. Compartment-resolved Proteomic Analysis of Mouse Aorta during Atherosclerotic Plaque Formation Reveals Osteoclast-specific Protein Expression*

    PubMed Central

    Wierer, Michael; Prestel, Matthias; Schiller, Herbert B.; Yan, Guangyao; Schaab, Christoph; Azghandi, Sepiede; Werner, Julia; Kessler, Thorsten; Malik, Rainer; Murgia, Marta; Aherrahrou, Zouhair; Schunkert, Heribert; Dichgans, Martin; Mann, Matthias

    2018-01-01

    Atherosclerosis leads to vascular lesions that involve major rearrangements of the vascular proteome, especially of the extracellular matrix (ECM). Using single aortas from ApoE knock out mice, we quantified formation of plaques by single-run, high-resolution mass spectrometry (MS)-based proteomics. To probe localization on a proteome-wide scale we employed quantitative detergent solubility profiling. This compartment- and time-resolved resource of atherogenesis comprised 5117 proteins, 182 of which changed their expression status in response to vessel maturation and atherosclerotic plaque development. In the insoluble ECM proteome, 65 proteins significantly changed, including relevant collagens, matrix metalloproteinases and macrophage derived proteins. Among novel factors in atherosclerosis, we identified matrilin-2, the collagen IV crosslinking enzyme peroxidasin as well as the poorly characterized MAM-domain containing 2 (Mamdc2) protein as being up-regulated in the ECM during atherogenesis. Intriguingly, three subunits of the osteoclast specific V-ATPase complex were strongly increased in mature plaques with an enrichment in macrophages thus implying an active de-mineralization function. PMID:29208753

  2. Refining comparative proteomics by spectral counting to account for shared peptides and multiple search engines

    PubMed Central

    Chen, Yao-Yi; Dasari, Surendra; Ma, Ze-Qiang; Vega-Montoto, Lorenzo J.; Li, Ming

    2013-01-01

    Spectral counting has become a widely used approach for measuring and comparing protein abundance in label-free shotgun proteomics. However, when analyzing complex samples, the ambiguity of matching between peptides and proteins greatly affects the assessment of peptide and protein inventories, differentiation, and quantification. Meanwhile, the configuration of database searching algorithms that assign peptides to MS/MS spectra may produce different results in comparative proteomic analysis. Here, we present three strategies to improve comparative proteomics through spectral counting. We show that comparing spectral counts for peptide groups rather than for protein groups forestalls problems introduced by shared peptides. We demonstrate the advantage and flexibility of this new method in two datasets. We present four models to combine four popular search engines that lead to significant gains in spectral counting differentiation. Among these models, we demonstrate a powerful vote counting model that scales well for multiple search engines. We also show that semi-tryptic searching outperforms tryptic searching for comparative proteomics. Overall, these techniques considerably improve protein differentiation on the basis of spectral count tables. PMID:22552787

  3. Refining comparative proteomics by spectral counting to account for shared peptides and multiple search engines.

    PubMed

    Chen, Yao-Yi; Dasari, Surendra; Ma, Ze-Qiang; Vega-Montoto, Lorenzo J; Li, Ming; Tabb, David L

    2012-09-01

    Spectral counting has become a widely used approach for measuring and comparing protein abundance in label-free shotgun proteomics. However, when analyzing complex samples, the ambiguity of matching between peptides and proteins greatly affects the assessment of peptide and protein inventories, differentiation, and quantification. Meanwhile, the configuration of database searching algorithms that assign peptides to MS/MS spectra may produce different results in comparative proteomic analysis. Here, we present three strategies to improve comparative proteomics through spectral counting. We show that comparing spectral counts for peptide groups rather than for protein groups forestalls problems introduced by shared peptides. We demonstrate the advantage and flexibility of this new method in two datasets. We present four models to combine four popular search engines that lead to significant gains in spectral counting differentiation. Among these models, we demonstrate a powerful vote counting model that scales well for multiple search engines. We also show that semi-tryptic searching outperforms tryptic searching for comparative proteomics. Overall, these techniques considerably improve protein differentiation on the basis of spectral count tables.

  4. LFQuant: a label-free fast quantitative analysis tool for high-resolution LC-MS/MS proteomics data.

    PubMed

    Zhang, Wei; Zhang, Jiyang; Xu, Changming; Li, Ning; Liu, Hui; Ma, Jie; Zhu, Yunping; Xie, Hongwei

    2012-12-01

    Database searching based methods for label-free quantification aim to reconstruct the peptide extracted ion chromatogram based on the identification information, which can limit the search space and thus make the data processing much faster. The random effect of the MS/MS sampling can be remedied by cross-assignment among different runs. Here, we present a new label-free fast quantitative analysis tool, LFQuant, for high-resolution LC-MS/MS proteomics data based on database searching. It is designed to accept raw data in two common formats (mzXML and Thermo RAW), and database search results from mainstream tools (MASCOT, SEQUEST, and X!Tandem), as input data. LFQuant can handle large-scale label-free data with fractionation such as SDS-PAGE and 2D LC. It is easy to use and provides handy user interfaces for data loading, parameter setting, quantitative analysis, and quantitative data visualization. LFQuant was compared with two common quantification software packages, MaxQuant and IDEAL-Q, on the replication data set and the UPS1 standard data set. The results show that LFQuant performs better than them in terms of both precision and accuracy, and consumes significantly less processing time. LFQuant is freely available under the GNU General Public License v3.0 at http://sourceforge.net/projects/lfquant/. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  5. Lipid remodeling and an altered membrane-associated proteome may drive the differential effects of EPA and DHA treatment on skeletal muscle glucose uptake and protein accretion.

    PubMed

    Jeromson, Stewart; Mackenzie, Ivor; Doherty, Mary K; Whitfield, Phillip D; Bell, Gordon; Dick, James; Shaw, Andy; Rao, Francesco V; Ashcroft, Stephen P; Philp, Andrew; Galloway, Stuart D R; Gallagher, Iain; Hamilton, D Lee

    2018-06-01

    In striated muscle, eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) have differential effects on the metabolism of glucose and of protein. We have shown that, despite similar incorporation, treatment of C2C12 myotubes (CM) with EPA but not DHA improves glucose uptake and protein accretion. We hypothesized that these differential effects of EPA and DHA may be due to divergent shifts in lipidomic profiles leading to altered proteomic profiles. We therefore carried out an assessment of the impact of treating CM with EPA and DHA on lipidomic and proteomic profiles. Fatty acid methyl esters (FAME) analysis revealed that both EPA and DHA led to similar but substantial changes in fatty acid profiles, with the exception of arachidonic acid, which was decreased only by DHA, and docosapentaenoic acid (DPA), which was increased only by EPA treatment. Global lipidomic analysis showed that EPA and DHA induced large alterations in the cellular lipid profiles and, in particular, in the phospholipid classes. Subsequent targeted analysis confirmed that the most differentially regulated species were phosphatidylcholines and phosphatidylethanolamines containing long-chain fatty acids with five (EPA treatment) or six (DHA treatment) double bonds. As these are typically membrane-associated lipid species, we hypothesized that these treatments differentially altered the membrane-associated proteome. Stable isotope labeling by amino acids in cell culture (SILAC)-based proteomics of the membrane fraction revealed significant divergence in the effects of EPA and DHA on the membrane-associated proteome. We conclude that the EPA-specific increase in polyunsaturated long-chain fatty acids in the phospholipid fraction is associated with an altered membrane-associated proteome and that these may be critical events in the metabolic remodeling induced by EPA treatment.

  6. PNAC: a protein nucleolar association classifier

    PubMed Central

    2011-01-01

    Background Although primarily known as the site of ribosome subunit production, the nucleolus is involved in numerous and diverse cellular processes. Recent large-scale proteomics projects have identified thousands of human proteins that associate with the nucleolus. However, in most cases, we know neither the fraction of each protein pool that is nucleolus-associated nor whether their association is permanent or conditional. Results To describe the dynamic localisation of proteins in the nucleolus, we investigated the extent of nucleolar association of proteins by first collating an extensively curated literature-derived dataset. This dataset then served to train a probabilistic predictor which integrates gene and protein characteristics. Unlike most previous experimental and computational studies of the nucleolar proteome that produce large static lists of nucleolar proteins regardless of their extent of nucleolar association, our predictor models the fluidity of the nucleolus by considering different classes of nucleolar-associated proteins. The new method predicts all human proteins as either nucleolar-enriched, nucleolar-nucleoplasmic, nucleolar-cytoplasmic or non-nucleolar. Leave-one-out cross validation tests reveal sensitivity values for these four classes ranging from 0.72 to 0.90 and positive predictive values ranging from 0.63 to 0.94. The overall accuracy of the classifier was measured to be 0.85 on an independent literature-based test set and 0.74 using a large independent quantitative proteomics dataset. While the three nucleolar-association groups display vastly different Gene Ontology biological process signatures and evolutionary characteristics, they collectively represent the most well characterised nucleolar functions. Conclusions Our proteome-wide classification of nucleolar association provides a novel representation of the dynamic content of the nucleolus. 
    This model of nucleolar localisation thus increases the coverage while providing accurate and specific annotations of the nucleolar proteome. It will be instrumental in better understanding the central role of the nucleolus in the cell and its interaction with other subcellular compartments. PMID:21272300
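
    The per-class sensitivity and positive predictive value (PPV) figures quoted in the abstract above follow the standard definitions: sensitivity = TP / (TP + FN) and PPV = TP / (TP + FP), computed one class at a time. The sketch below is a minimal illustration of that arithmetic only, not the PNAC implementation. The four class names come from the abstract, while the function names and the toy labels are hypothetical.

```python
from typing import Dict, List, Tuple

# Class names taken from the abstract; everything else here is illustrative.
CLASSES = [
    "nucleolar-enriched",
    "nucleolar-nucleoplasmic",
    "nucleolar-cytoplasmic",
    "non-nucleolar",
]


def per_class_metrics(
    true_labels: List[str], predicted_labels: List[str]
) -> Dict[str, Tuple[float, float]]:
    """Return {class: (sensitivity, ppv)} using one-vs-rest counts."""
    pairs = list(zip(true_labels, predicted_labels))
    metrics = {}
    for cls in CLASSES:
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        # Guard against empty denominators (class never seen or never predicted).
        sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
        ppv = tp / (tp + fp) if (tp + fp) else float("nan")
        metrics[cls] = (sensitivity, ppv)
    return metrics


def overall_accuracy(true_labels: List[str], predicted_labels: List[str]) -> float:
    """Fraction of proteins whose predicted class matches the reference class."""
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return correct / len(true_labels)


# Hypothetical toy data: six proteins with reference and predicted classes.
toy_true = [CLASSES[0], CLASSES[0], CLASSES[1], CLASSES[2], CLASSES[3], CLASSES[3]]
toy_pred = [CLASSES[0], CLASSES[1], CLASSES[1], CLASSES[2], CLASSES[3], CLASSES[0]]
toy_metrics = per_class_metrics(toy_true, toy_pred)
toy_accuracy = overall_accuracy(toy_true, toy_pred)
```

    Reporting both values per class matters for a multi-class predictor like this one, because a class can be recovered well (high sensitivity) while still attracting many false assignments (low PPV).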

  7. Computational Framework for Analysis of Prey–Prey Associations in Interaction Proteomics Identifies Novel Human Protein–Protein Interactions and Networks

    PubMed Central

    Saha, Sudipto; Dazard, Jean-Eudes; Xu, Hua; Ewing, Rob M.

    2013-01-01

    Large-scale protein–protein interaction data sets have been generated for several species including yeast and human and have enabled the identification, quantification, and prediction of cellular molecular networks. Affinity purification-mass spectrometry (AP-MS) is the preeminent methodology for large-scale analysis of protein complexes, performed by immunopurifying a specific “bait” protein and its associated “prey” proteins. The analysis and interpretation of AP-MS data sets is, however, not straightforward. In addition, although yeast AP-MS data sets are relatively comprehensive, current human AP-MS data sets only sparsely cover the human interactome. Here we develop a framework for analysis of AP-MS data sets that addresses the issues of noise, missing data, and sparsity of coverage in the context of a current, real world human AP-MS data set. Our goal is to extend and increase the density of the known human interactome by integrating bait–prey and cocomplexed preys (prey–prey associations) into networks. Our framework incorporates a score for each identified protein, as well as elements of signal processing to improve the confidence of identified protein–protein interactions. We identify many protein networks enriched in known biological processes and functions. In addition, we show that integrated bait–prey and prey–prey interactions can be used to refine network topology and extend known protein networks. PMID:22845868

  8. The Pacific Northwest National Laboratory library of bacterial and archaeal proteomic biodiversity

    DOE PAGES

    Payne, Samuel H.; Monroe, Matthew E.; Overall, Christopher C.; ...

    2015-08-18

    This dataset deposition announces the submission to public repositories of the PNNL Biodiversity Library, a large collection of global proteomics data for 112 bacterial and archaeal organisms. The data comprises 35,162 tandem mass spectrometry (MS/MS) datasets from ~10 years of research. All data has been searched, annotated and organized in a consistent manner to promote reuse by the community. Protein identifications were cross-referenced with KEGG functional annotations which allows for pathway oriented investigation. We present the data as a freely available community resource. A variety of data re-use options are described for computational modeling, proteomics assay design and bioengineering. Instrument data and analysis files are available at ProteomeXchange via the MassIVE partner repository under the identifiers PXD001860 and MSV000079053.

  9. The Pacific Northwest National Laboratory library of bacterial and archaeal proteomic biodiversity

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Payne, Samuel H.; Monroe, Matthew E.; Overall, Christopher C.

    This dataset deposition announces the submission to public repositories of the PNNL Biodiversity Library, a large collection of global proteomics data for 112 bacterial and archaeal organisms. The data comprises 35,162 tandem mass spectrometry (MS/MS) datasets from ~10 years of research. All data has been searched, annotated and organized in a consistent manner to promote reuse by the community. Protein identifications were cross-referenced with KEGG functional annotations which allows for pathway oriented investigation. We present the data as a freely available community resource. A variety of data re-use options are described for computational modeling, proteomics assay design and bioengineering. Instrument data and analysis files are available at ProteomeXchange via the MassIVE partner repository under the identifiers PXD001860 and MSV000079053.

  10. Novel "omics" approach for study of low-abundance, low-molecular-weight components of a complex biological tissue: regional differences between chorionic and basal plates of the human placenta.

    PubMed

    Kedia, Komal; Nichols, Caitlin A; Thulin, Craig D; Graves, Steven W

    2015-11-01

    Tissue proteomics has relied heavily on two-dimensional gel electrophoresis for protein separation and quantification, followed by single-protein isolation, trypsin digestion, and mass spectrometric protein identification. Such methods are predominantly used for study of high-abundance, full-length proteins. Tissue peptidomics has recently been developed but is still used to study the most highly abundant species, often resulting in observation and identification of only dozens of peptides. Tissue lipidomics is likewise new, and reported studies are limited. We have developed an "omics" approach that enables over 7,000 low-molecular-weight, low-abundance species to be surveyed and have applied this to human placental tissue. Because the placenta is believed to be involved in complications of pregnancy, its proteomic evaluation is of substantial interest. In previous research on the placental proteome, abundant, high-molecular-weight proteins have been studied. Application of large-scale, global proteomics or peptidomics to the placenta has been limited, and would be challenging owing to the anatomic complexity and broad concentration range of proteins in this tissue. In our approach, involving protein depletion, capillary liquid chromatography, and tandem mass spectrometry, we attempted to identify molecular differences between two regions of the same placenta with only slightly different cellular composition. Our analysis revealed 16 species with statistically significant differences between the two regions. Tandem mass spectrometry enabled successful sequencing, or otherwise enabled chemical characterization, of twelve of these. The successful discovery and identification of regional differences in the expression of low-abundance, low-molecular-weight biomolecules reveals the potential of our approach.

  11. Long-Gradient Separations Coupled with Selected Reaction Monitoring for Highly Sensitive, Large Scale Targeted Protein Quantification in a Single Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shi, Tujin; Fillmore, Thomas L.; Gao, Yuqian

    2013-10-01

    Long-gradient separations coupled to tandem MS were recently demonstrated to provide a deep proteome coverage for global proteomics; however, such long-gradient separations have not been explored for targeted proteomics. Herein, we investigate the potential performance of the long-gradient separations coupled with selected reaction monitoring (LG-SRM) for targeted protein quantification. Direct comparison of LG-SRM (5 h gradient) and conventional LC-SRM (45 min gradient) showed that the long-gradient separations significantly reduced background interference levels and provided an 8- to 100-fold improvement in LOQ for target proteins in human female serum. Based on at least one surrogate peptide per protein, an LOQ of 10 ng/mL was achieved for the two spiked proteins in non-depleted human serum. The LG-SRM detection of seven out of eight endogenous plasma proteins expressed at ng/mL or sub-ng/mL levels in clinical patient sera was also demonstrated. A correlation coefficient of >0.99 was observed for the results of LG-SRM and ELISA measurements for prostate-specific antigen (PSA) in selected patient sera. Further enhancement of LG-SRM sensitivity was achieved by applying front-end IgY14 immunoaffinity depletion. Besides improved sensitivity, LG-SRM offers at least 3 times higher multiplexing capacity than conventional LC-SRM due to ~3-fold increase in average peak widths for a 300-min gradient compared to a 45-min gradient. Therefore, LG-SRM holds great potential for bridging the gap between global and targeted proteomics due to its advantages in both sensitivity and multiplexing capacity.

  12. Completed | Office of Cancer Clinical Proteomics Research

    Cancer.gov

    Prior to the current Clinical Proteomic Tumor Analysis Consortium (CPTAC), previously funded initiatives associated with clinical proteomics research included: Clinical Proteomic Tumor Analysis Consortium (CPTAC 2.0); Clinical Proteomic Technologies for Cancer Initiative (CPTC); Mouse Proteomic Technologies Initiative.

  13. Quantitative proteomics in Giardia duodenalis-Achievements and challenges.

    PubMed

    Emery, Samantha J; Lacey, Ernest; Haynes, Paul A

    2016-08-01

    Giardia duodenalis (syn. G. lamblia and G. intestinalis) is a protozoan parasite of vertebrates and a major contributor to the global burden of diarrheal diseases and gastroenteritis. The publication of multiple genome sequences in the G. duodenalis species complex has provided important insights into parasite biology, and made post-genomic technologies, including proteomics, significantly more accessible. The aims of proteomics are to identify and quantify proteins present in a cell, and assign functions to them within the context of dynamic biological systems. In Giardia, proteomics in the post-genomic era has transitioned from reliance on gel-based systems to utilisation of a diverse array of techniques based on bottom-up LC-MS/MS technologies. Together, these have generated crucial foundations for subcellular proteomes, elucidated intra- and inter-assemblage isolate variation, and identified pathways and markers in differentiation, host-parasite interactions and drug resistance. However, in Giardia, proteomics remains an emerging field, with considerable shortcomings evident from the published research. These include a bias towards assemblage A, a lack of emphasis on quantitative analytical techniques, and limited information on post-translational protein modifications. Additionally, there are multiple areas of research for which proteomic data is not available to add value to published transcriptomic data. The challenge of amalgamating data in the systems biology paradigm necessitates the further generation of large, high-quality quantitative datasets to accurately model parasite biology. This review surveys the current proteomic research available for Giardia and evaluates their technical and quantitative approaches, while contextualising their biological insights into parasite pathology, isolate variation and eukaryotic evolution. 
    Finally, we propose areas of priority for the generation of future proteomic data to explore fundamental questions in Giardia, including the analysis of post-translational modifications, and the design of MS-based assays for validation of differentially expressed proteins in large datasets. Copyright © 2016 Elsevier B.V. All rights reserved.

  14. Combined analysis of transcriptome and proteome data as a tool for the identification of candidate biomarkers in renal cell carcinoma

    PubMed Central

    Seliger, Barbara; Dressler, Sven P.; Wang, Ena; Kellner, Roland; Recktenwald, Christian V.; Lottspeich, Friedrich; Marincola, Francesco M.; Baumgärtner, Maja; Atkins, Derek; Lichtenfels, Rudolf

    2012-01-01

    Results obtained from expression profiling of renal cell carcinoma using different “ome”-based approaches and comprehensive data analysis demonstrated that proteome-based technologies and cDNA microarray analyses complement each other during the discovery phase for disease-related candidate biomarkers. The integration of the respective data revealed the uniqueness and complementarity of the different technologies. While comparative cDNA microarray analyses, though restricted to upregulated targets, largely revealed genes involved in controlling gene/protein expression (19%) and signal transduction processes (13%), proteomics/PROTEOMEX-defined candidate biomarkers include enzymes of cellular metabolism (36%), transport proteins (12%) and cell motility/structural molecules (10%). Candidate biomarkers defined by proteomics and PROTEOMEX are frequently shared, whereas the sharing rate between cDNA microarray and proteome-based profiling is limited. Putative candidate biomarkers provide insights into their cellular (dys)function and their diagnostic/prognostic value but still warrant further validation in larger patient numbers. Because merely 3 candidate biomarkers were shared by all applied technologies, namely annexin A4, tubulin alpha-1A chain and ubiquitin carboxyl-terminal hydrolase L1, analysis at a single hierarchical level of biological regulation seems to provide only limited results, which emphasizes the importance and benefit of combinatorial screenings that can complement the standard clinical predictors. PMID:19235166

  15. Proteomics-based network analysis characterizes biological processes and pathways activated by preconditioned mesenchymal stem cells in cardiac repair mechanisms.

    PubMed

    Di Silvestre, Dario; Brambilla, Francesca; Scardoni, Giovanni; Brunetti, Pietro; Motta, Sara; Matteucci, Marco; Laudanna, Carlo; Recchia, Fabio A; Lionetti, Vincenzo; Mauri, Pierluigi

    2017-05-01

    We have demonstrated that intramyocardial delivery of human mesenchymal stem cells preconditioned with a hyaluronan mixed ester of butyric and retinoic acid (MSCp+) is more effective in preventing the decay of regional myocardial contractility in a swine model of myocardial infarction (MI). However, the understanding of the role of MSCp+ in proteomic remodeling of cardiac infarcted tissue is not complete. We therefore sought to perform a comprehensive analysis of the proteome of infarct remote (RZ) and border zone (BZ) of pigs treated with MSCp+ or unconditioned stem cells. Heart tissues were analyzed by MudPIT and differentially expressed proteins were selected by a label-free approach based on spectral counting. Protein profiles were evaluated by using PPI networks and their topological analysis. The proteomic remodeling was largely prevented in the MSCp+ group. Extracellular proteins involved in fibrosis were down-regulated, while energetic pathways were globally up-regulated. Cardioprotectant pathways involved in the production of keto acid metabolites were also activated. Additionally, we found that new hub proteins support the cardioprotective phenotype characterizing the left ventricular BZ treated with MSCp+. In fact, the up-regulation of angiogenic proteins NCL and RAC1 can be explained by the increase of capillary density induced by MSCp+. Our results show that angiogenic pathways appear to be uniquely positioned to integrate signaling with energetic pathways involving cardiac repair. Our findings prompt the use of proteomics-based network analysis to optimize new approaches preventing the post-ischemic proteomic remodeling that may underlie the limited self-repair ability of adult heart. Copyright © 2017 Elsevier B.V. All rights reserved.

  16. Aptamer-Based Multiplexed Proteomic Technology for Biomarker Discovery

    PubMed Central

    Gold, Larry; Ayers, Deborah; Bertino, Jennifer; Bock, Christopher; Bock, Ashley; Brody, Edward N.; Carter, Jeff; Dalby, Andrew B.; Eaton, Bruce E.; Fitzwater, Tim; Flather, Dylan; Forbes, Ashley; Foreman, Trudi; Fowler, Cate; Gawande, Bharat; Goss, Meredith; Gunn, Magda; Gupta, Shashi; Halladay, Dennis; Heil, Jim; Heilig, Joe; Hicke, Brian; Husar, Gregory; Janjic, Nebojsa; Jarvis, Thale; Jennings, Susan; Katilius, Evaldas; Keeney, Tracy R.; Kim, Nancy; Koch, Tad H.; Kraemer, Stephan; Kroiss, Luke; Le, Ngan; Levine, Daniel; Lindsey, Wes; Lollo, Bridget; Mayfield, Wes; Mehan, Mike; Mehler, Robert; Nelson, Sally K.; Nelson, Michele; Nieuwlandt, Dan; Nikrad, Malti; Ochsner, Urs; Ostroff, Rachel M.; Otis, Matt; Parker, Thomas; Pietrasiewicz, Steve; Resnicow, Daniel I.; Rohloff, John; Sanders, Glenn; Sattin, Sarah; Schneider, Daniel; Singer, Britta; Stanton, Martin; Sterkel, Alana; Stewart, Alex; Stratford, Suzanne; Vaught, Jonathan D.; Vrkljan, Mike; Walker, Jeffrey J.; Watrobka, Mike; Waugh, Sheela; Weiss, Allison; Wilcox, Sheri K.; Wolfson, Alexey; Wolk, Steven K.; Zhang, Chi; Zichi, Dom

    2010-01-01

    Background The interrogation of proteomes (“proteomics”) in a highly multiplexed and efficient manner remains a coveted and challenging goal in biology and medicine. Methodology/Principal Findings We present a new aptamer-based proteomic technology for biomarker discovery capable of simultaneously measuring thousands of proteins from small sample volumes (15 µL of serum or plasma). Our current assay measures 813 proteins with low limits of detection (1 pM median), 7 logs of overall dynamic range (∼100 fM–1 µM), and 5% median coefficient of variation. This technology is enabled by a new generation of aptamers that contain chemically modified nucleotides, which greatly expand the physicochemical diversity of the large randomized nucleic acid libraries from which the aptamers are selected. Proteins in complex matrices such as plasma are measured with a process that transforms a signature of protein concentrations into a corresponding signature of DNA aptamer concentrations, which is quantified on a DNA microarray. Our assay takes advantage of the dual nature of aptamers as both folded protein-binding entities with defined shapes and unique nucleotide sequences recognizable by specific hybridization probes. To demonstrate the utility of our proteomics biomarker discovery technology, we applied it to a clinical study of chronic kidney disease (CKD). We identified two well known CKD biomarkers as well as an additional 58 potential CKD biomarkers. These results demonstrate the potential utility of our technology to rapidly discover unique protein signatures characteristic of various disease states. Conclusions/Significance We describe a versatile and powerful tool that allows large-scale comparison of proteome profiles among discrete populations. 
    This unbiased and highly multiplexed search engine will enable the discovery of novel biomarkers in a manner that is unencumbered by our incomplete knowledge of biology, thereby helping to advance the next generation of evidence-based medicine. PMID:21165148

  17. Proteomic analysis reveals the distinct energy and protein metabolism characteristics involved in myofiber type conversion and resistance of atrophy in the extensor digitorum longus muscle of hibernating Daurian ground squirrels.

    PubMed

    Chang, Hui; Jiang, Shanfeng; Ma, Xiufeng; Peng, Xin; Zhang, Jie; Wang, Zhe; Xu, Shenhui; Wang, Huiping; Gao, Yunfang

    2018-06-01

    Previous hibernation studies demonstrated that such a natural model of skeletal muscle disuse causes limited muscle atrophy and a significant fast-to-slow fiber type shift. However, the underlying mechanism as defined in a large-scale analysis remains unclarified. Isobaric tags for relative and absolute quantification (iTRAQ) based quantitative analysis were used to examine proteomic changes in the fast extensor digitorum longus muscles (EDL) of Daurian ground squirrels (Spermophilus dauricus). Although the wet weights and fiber cross-sectional area of the EDL muscle showed no significant decrease, the percentage of slow type fiber was 61% greater (P < 0.01) in the hibernation group. Proteomics analysis identified 264 proteins that were significantly changed (ratio < 0.83 or >1.2-fold and P < 0.05) in the hibernation group, of which 23 proteins were categorized into energy production and conversion and translation and 22 proteins were categorized into ribosomal structure and biogenesis. Along with the validation by western blot, MAPKAP kinase 2, ATP5D, ACADSB, calcineurin, CSTB and EIF2S were up-regulated in the hibernation group, whereas PDK4, COX II and EIF3C were down-regulated in the hibernation group. MAPKAP kinase 2 and PDK4 were associated with glycolysis, COX II and ATP5D were associated with oxidative phosphorylation, ACADSB was associated with fatty acid metabolism, calcineurin and CSTB were associated with catabolism, and EIF2S and EIF3C were associated with anabolism. Moreover, the total proteolysis rate of EDL in the hibernation group was significantly inhibited compared with that in the pre-hibernation group. These distinct energy and protein metabolism characteristics may be involved in myofiber type conversion and resistance to atrophy in the EDL of hibernating Daurian ground squirrels. Copyright © 2018 Elsevier Inc. All rights reserved.
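
    The significance criterion stated in the abstract above (ratio < 0.83 or > 1.2, and P < 0.05) can be expressed as a simple filter. The following is a minimal sketch of that rule only, assuming each protein carries one hibernation/pre-hibernation ratio and one P value; the `ProteinResult` type, the function name, and the toy entries are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProteinResult:
    name: str        # hypothetical identifier
    ratio: float     # assumed: hibernation / pre-hibernation abundance ratio
    p_value: float   # assumed: P value from the group comparison


def classify_change(
    result: ProteinResult,
    low: float = 0.83,    # down-regulation cut-off quoted in the abstract
    high: float = 1.2,    # up-regulation cut-off quoted in the abstract
    alpha: float = 0.05,  # significance level quoted in the abstract
) -> str:
    """Label a protein 'up', 'down' or 'unchanged' under the stated cut-offs."""
    if result.p_value >= alpha:
        return "unchanged"
    if result.ratio > high:
        return "up"
    if result.ratio < low:
        return "down"
    return "unchanged"


# Hypothetical toy values chosen only to exercise each branch.
toy_results: List[ProteinResult] = [
    ProteinResult("protein_A", ratio=1.45, p_value=0.01),   # up
    ProteinResult("protein_B", ratio=0.70, p_value=0.03),   # down
    ProteinResult("protein_C", ratio=1.60, p_value=0.20),   # not significant
    ProteinResult("protein_D", ratio=1.05, p_value=0.001),  # inside ratio band
]
toy_labels = {r.name: classify_change(r) for r in toy_results}
```

    Note that the two ratio cut-offs are reciprocal (1/1.2 ≈ 0.83), so the band treats up- and down-regulation symmetrically on a log scale.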

  18. Serum Proteome Analysis for Profiling Predictive Protein Markers Associated with the Severity of Skin Lesions Induced by Ionizing Radiation.

    PubMed

    Chaze, Thibault; Hornez, Louis; Chambon, Christophe; Haddad, Iman; Vinh, Joelle; Peyrat, Jean-Philippe; Benderitter, Marc; Guipaud, Olivier

    2013-07-10

    The finding of new diagnostic and prognostic markers of local radiation injury, and particularly of the cutaneous radiation syndrome, is crucial for its medical management, in the case of both accidental exposure and radiotherapy side effects. Especially, a fast high-throughput method is still needed for triage of people accidentally exposed to ionizing radiation. In this study, we investigated the impact of localized irradiation of the skin on the early alteration of the serum proteome of mice in an effort to discover markers associated with the exposure and severity of impending damage. Using two different large-scale quantitative proteomic approaches, 2D-DIGE-MS and SELDI-TOF-MS, we performed global analyses of serum proteins collected in the clinical latency phase (days 3 and 7) from non-irradiated and locally irradiated mice exposed to high doses of 20, 40 and 80 Gy which will develop respectively erythema, moist desquamation and necrosis. Unsupervised and supervised multivariate statistical analyses (principal component analysis, partial-least square discriminant analysis and Random Forest analysis) using 2D-DIGE quantitative protein data allowed us to discriminate early between non-irradiated and irradiated animals, and between uninjured/slightly injured animals and animals that will develop severe lesions. On the other hand, despite a high number of animal replicates, PLS-DA and Random Forest analyses of SELDI-TOF-MS data failed to reveal sets of MS peaks able to discriminate between the different groups of animals. Our results show that, unlike SELDI-TOF-MS, the 2D-DIGE approach remains a powerful and promising method for the discovery of sets of proteins that could be used for the development of clinical tests for triage and the prognosis of the severity of radiation-induced skin lesions. We propose a list of 15 proteins which constitutes a set of candidate proteins for triage and prognosis of skin lesion outcomes.

  19. Serum Proteome Analysis for Profiling Predictive Protein Markers Associated with the Severity of Skin Lesions Induced by Ionizing Radiation

    PubMed Central

    Chaze, Thibault; Hornez, Louis; Chambon, Christophe; Haddad, Iman; Vinh, Joelle; Peyrat, Jean-Philippe; Benderitter, Marc; Guipaud, Olivier

    2013-01-01

    The finding of new diagnostic and prognostic markers of local radiation injury, and particularly of the cutaneous radiation syndrome, is crucial for its medical management, in the case of both accidental exposure and radiotherapy side effects. Especially, a fast high-throughput method is still needed for triage of people accidentally exposed to ionizing radiation. In this study, we investigated the impact of localized irradiation of the skin on the early alteration of the serum proteome of mice in an effort to discover markers associated with the exposure and severity of impending damage. Using two different large-scale quantitative proteomic approaches, 2D-DIGE-MS and SELDI-TOF-MS, we performed global analyses of serum proteins collected in the clinical latency phase (days 3 and 7) from non-irradiated and locally irradiated mice exposed to high doses of 20, 40 and 80 Gy which will develop respectively erythema, moist desquamation and necrosis. Unsupervised and supervised multivariate statistical analyses (principal component analysis, partial-least square discriminant analysis and Random Forest analysis) using 2D-DIGE quantitative protein data allowed us to discriminate early between non-irradiated and irradiated animals, and between uninjured/slightly injured animals and animals that will develop severe lesions. On the other hand, despite a high number of animal replicates, PLS-DA and Random Forest analyses of SELDI-TOF-MS data failed to reveal sets of MS peaks able to discriminate between the different groups of animals. Our results show that, unlike SELDI-TOF-MS, the 2D-DIGE approach remains a powerful and promising method for the discovery of sets of proteins that could be used for the development of clinical tests for triage and the prognosis of the severity of radiation-induced skin lesions. We propose a list of 15 proteins which constitutes a set of candidate proteins for triage and prognosis of skin lesion outcomes. PMID:28250398

  20. Quantitative Missense Variant Effect Prediction Using Large-Scale Mutagenesis Data.

    PubMed

    Gray, Vanessa E; Hause, Ronald J; Luebeck, Jens; Shendure, Jay; Fowler, Douglas M

    2018-01-24

    Large datasets describing the quantitative effects of mutations on protein function are becoming increasingly available. Here, we leverage these datasets to develop Envision, which predicts the magnitude of a missense variant's molecular effect. Envision combines 21,026 variant effect measurements from nine large-scale experimental mutagenesis datasets, a hitherto untapped training resource, with a supervised, stochastic gradient boosting learning algorithm. Envision outperforms other missense variant effect predictors both on large-scale mutagenesis data and on an independent test dataset comprising 2,312 TP53 variants whose effects were measured using a low-throughput approach. This dataset was never used for hyperparameter tuning or model training and thus serves as an independent validation set. Envision prediction accuracy is also more consistent across amino acids than other predictors. Finally, we demonstrate that Envision's performance improves as more large-scale mutagenesis data are incorporated. We precompute Envision predictions for every possible single amino acid variant in human, mouse, frog, zebrafish, fruit fly, worm, and yeast proteomes (https://envision.gs.washington.edu/). Copyright © 2017 Elsevier Inc. All rights reserved.

  1. Convergent Transcriptomics and Proteomics of Environmental Enrichment and Cocaine Identifies Novel Therapeutic Strategies for Addiction

    PubMed Central

    Zhang, Yafang; Crofton, Elizabeth J.; Fan, Xiuzhen; Li, Dingge; Kong, Fanping; Sinha, Mala; Luxon, Bruce A.; Spratt, Heidi M.; Lichti, Cheryl F.; Green, Thomas A.

    2016-01-01

    Transcriptomic and proteomic approaches have separately proven effective at identifying novel mechanisms affecting addiction-related behavior; however, it is difficult to prioritize the many promising leads from each approach. A convergent secondary analysis of proteomic and transcriptomic results can glean additional information to help prioritize promising leads. The current study is a secondary analysis of the convergence of recently published separate transcriptomic and proteomic analyses of nucleus accumbens (NAc) tissue from rats subjected to environmental enrichment vs. isolation and cocaine self-administration vs. saline. Multiple bioinformatics approaches (e.g. Gene Ontology (GO) analysis, Ingenuity Pathway Analysis (IPA), and Gene Set Enrichment Analysis (GSEA)) were used to interrogate these rich data sets. Although there was little correspondence between mRNA vs. protein at the individual target level, good correspondence was found at the level of gene/protein sets, particularly for the environmental enrichment manipulation. These data identify gene sets where there is a positive relationship between changes in mRNA and protein (e.g. glycolysis, ATP synthesis, translation elongation factor activity, etc.) and gene sets where there is an inverse relationship (e.g. ribosomes, Rho GTPase signaling, protein ubiquitination, etc.). Overall environmental enrichment produced better correspondence than cocaine self-administration. The individual targets contributing to mRNA and protein effects were largely not overlapping. As a whole, these results confirm that robust transcriptomic and proteomic data sets can provide similar results at the gene/protein set level even when there is little correspondence at the individual target level and little overlap in the targets contributing to the effects. PMID:27717806

  2. Advancing Clinical Proteomics via Analysis Based on Biological Complexes: A Tale of Five Paradigms.

    PubMed

    Goh, Wilson Wen Bin; Wong, Limsoon

    2016-09-02

    Despite advances in proteomic technologies, idiosyncratic data issues, for example, incomplete coverage and inconsistency, resulting in large data holes, persist. Moreover, because of naïve reliance on statistical testing and its accompanying p values, differential protein signatures identified from such proteomics data have little diagnostic power. Thus, deploying conventional analytics on proteomics data is insufficient for identifying novel drug targets or precise yet sensitive biomarkers. Complex-based analysis is a new analytical approach that has potential to resolve these issues but requires formalization. We categorize complex-based analysis into five method classes or paradigms and propose an even-handed yet comprehensive evaluation rubric based on both simulated and real data. The first four paradigms are well represented in the literature. The fifth and newest paradigm, the network-paired (NP) paradigm, represented by a method called Extremely Small SubNET (ESSNET), dominates in precision-recall and reproducibility, maintains strong performance in small sample sizes, and sensitively detects low-abundance complexes. In contrast, the commonly used over-representation analysis (ORA) and direct-group (DG) test paradigms maintain good overall precision but have severe reproducibility issues. The other two paradigms considered here are the hit-rate and rank-based network analysis paradigms; both of these have good precision-recall and reproducibility, but they do not consider low-abundance complexes. Therefore, given its strong performance, NP/ESSNET may prove to be a useful approach for improving the analytical resolution of proteomics data. Additionally, given its stability, it may also be a powerful new approach toward functional enrichment tests, much like its ORA and DG counterparts.
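
    Of the paradigms named in this abstract, over-representation analysis (ORA) is the simplest to illustrate. The sketch below uses the common hypergeometric formulation of ORA applied to a single protein complex; it is not the evaluation code from the paper, and the function name and the example counts are assumptions made for illustration.

```python
from math import comb


def ora_p_value(population, annotated, selected, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    population: number of proteins in the background
    annotated:  background proteins that belong to the complex
    selected:   proteins called differential
    overlap:    differential proteins that belong to the complex
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background proteins, a 5-member complex,
# 5 differential proteins, all 5 of which fall in the complex.
p_all_in_complex = ora_p_value(20, 5, 5, 5)
# With a required overlap of 0 the tail covers every outcome.
p_no_requirement = ora_p_value(20, 5, 5, 0)
```

    Because the test depends only on which proteins pass a significance cut-off, small changes in that list can change the result, which is one plausible reading of the reproducibility issues the abstract attributes to ORA.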

  3. Global iTRAQ-based proteomic profiling of Toxoplasma gondii oocysts during sporulation.

    PubMed

    Zhou, Chun-Xue; Zhu, Xing-Quan; Elsheikha, Hany M; He, Shuai; Li, Qian; Zhou, Dong-Hui; Suo, Xun

    2016-10-04

    Toxoplasma gondii is a medically and economically important protozoan parasite. However, the molecular mechanisms of its sporulation remain largely unknown. Here, we applied iTRAQ coupled with 2D LC-MS/MS proteomic analysis to investigate the proteomic expression profile of T. gondii oocysts during sporulation. Of the 2095 non-redundant proteins identified, 587 were identified as differentially expressed proteins (DEPs). Based on Gene Ontology enrichment and KEGG pathway analyses, the majority of these DEPs were found to be related to the metabolism of amino acids, carbon and energy. Protein interaction network analysis generated by STRING identified ATP-citrate lyase (ACL), GMP synthase, IMP dehydrogenase (IMPDH), poly (ADP-ribose) glycohydrolase (PARG), and bifunctional dihydrofolate reductase-thymidylate synthase (DHFR-TS) as the top five hubs. We also identified 25 parasite virulence factors that were expressed at relatively high levels in sporulated oocysts compared to non-sporulated oocysts, which might contribute to the infectivity of mature oocysts. Considering the importance of oocysts in the dissemination of toxoplasmosis, these findings may help in the search for protein targets with a key role in infectiousness and ecological success of oocysts, creating new opportunities for the development of better means for disease prevention. The development of new preventative interventions against T. gondii infection relies on an improved understanding of the proteome and chemical pathways of this parasite. To identify proteins required for the development of environmentally resistant and infective T. gondii oocysts, we compared the proteome of non-sporulated (immature) oocysts with the proteome of sporulated (mature, infective) oocysts. iTRAQ 2D-LC-MS/MS analysis revealed proteomic changes that distinguish non-sporulated from sporulated oocysts. Many of the differentially expressed proteins were involved in metabolic pathways and 25 virulence factors were identified as upregulated in the sporulated oocysts. This work provides the first quantitative characterization of the proteomic variations that occur in the T. gondii oocyst stage during sporulation. Copyright © 2016. Published by Elsevier B.V.

  4. Evaluation of "shotgun" proteomics for identification of biological threat agents in complex environmental matrixes: experimental simulations.

    PubMed

    Verberkmoes, Nathan C; Hervey, W Judson; Shah, Manesh; Land, Miriam; Hauser, Loren; Larimer, Frank W; Van Berkel, Gary J; Goeringer, Douglas E

    2005-02-01

    There is currently a great need for rapid detection and positive identification of biological threat agents, as well as microbial species in general, directly from complex environmental samples. This need is most urgent in the area of homeland security, but also extends into medical, environmental, and agricultural sciences. Mass-spectrometry-based analysis is one of the leading technologies in the field with a diversity of different methodologies for biothreat detection. Over the past few years, "shotgun" proteomics has become one method of choice for the rapid analysis of complex protein mixtures by mass spectrometry. Recently, it was demonstrated that this methodology is capable of distinguishing a target species against a large database of background species from a single-component sample or dual-component mixtures with relatively the same concentration. Here, we examine the potential of shotgun proteomics to analyze a target species in a background of four contaminant species. We tested the capability of a common commercial mass-spectrometry-based shotgun proteomics platform for the detection of the target species (Escherichia coli) at four different concentrations and four different time points of analysis. We also tested the effect of database size on positive identification of the four microbes used in this study by testing a small (13-species) database and a large (261-species) database. The results clearly indicated that this technology could easily identify the target species at 20% in the background mixture at a 60, 120, 180, or 240 min analysis time with the small database. The results also indicated that the target species could easily be identified at 20% or 6% but could not be identified at 0.6% or 0.06% in either a 240 min analysis or a 30 h analysis with the small database. The effects of the large database were severe on the target species where detection above the background at any concentration used in this study was impossible, though the three other microbes used in this study were clearly identified above the background when analyzed with the large database. This study points to the potential application of this technology for biological threat agent detection but highlights many areas of needed research before the technology will be useful in real world samples.

  5. Urine proteome analysis in Dent's disease shows high selective changes potentially involved in chronic renal damage.

    PubMed

    Santucci, Laura; Candiano, Giovanni; Anglani, Franca; Bruschi, Maurizio; Tosetto, Enrica; Cremasco, Daniela; Murer, Luisa; D'Ambrosio, Chiara; Scaloni, Andrea; Petretto, Andrea; Caridi, Gianluca; Rossi, Roberta; Bonanni, Alice; Ghiggeri, Gian Marco

    2016-01-01

    Definition of the urinary protein composition would represent a potential tool for diagnosis in many clinical conditions. The use of new proteomic technologies allows detection of genetic and post-translational variants that increase sensitivity of the approach but complicates comparison within a heterogeneous patient population. Overall, this limits research of urinary biomarkers. Monogenic diseases are useful models to address this issue since genetic variability is reduced among first- and second-degree relatives of the same family. We applied this concept to Dent's disease, a monogenic condition characterised by low-molecular-weight proteinuria that is inherited following an X-linked trait. Results are presented here on a combined proteomic approach (LC-mass spectrometry, Western blot and zymograms for proteases and inhibitors) to characterise urine proteins in a large family (18 members, 6 hemizygous patients, 6 carrier females, and 6 normals) with Dent's disease due to the 1070G>T mutation of CLCN5. Gene ontology analysis on more than 1000 proteins showed that several clusters of proteins characterised urine of affected patients compared to carrier females and normal subjects: proteins involved in extracellular matrix remodelling were the major group. Specific analysis on metalloproteases and their inhibitors underscored unexpected mechanisms potentially involved in renal fibrosis. Studying the members of a large family with Dent's disease sharing the same molecular defect with new-generation proteomic techniques yielded highly reproducible results that justify the conclusions. Identification in urine of proteins actively involved in interstitial matrix remodelling raises the question of active anti-fibrotic drugs in Dent's patients. Copyright © 2015 Elsevier B.V. All rights reserved.

  6. multiplierz v2.0: A Python-based ecosystem for shared access and analysis of native mass spectrometry data.

    PubMed

    Alexander, William M; Ficarro, Scott B; Adelmant, Guillaume; Marto, Jarrod A

    2017-08-01

    The continued evolution of modern mass spectrometry instrumentation and associated methods represents a critical component in efforts to decipher the molecular mechanisms which underlie normal physiology and understand how dysregulation of biological pathways contributes to human disease. The increasing scale of these experiments combined with the technological diversity of mass spectrometers presents several challenges for community-wide data access, analysis, and distribution. Here we detail a redesigned version of multiplierz, our Python software library which leverages our common application programming interface (mzAPI) for analysis and distribution of proteomic data. New features include support for a wider range of native mass spectrometry file types, interfaces to additional database search engines, compatibility with new reporting formats, and high-level tools to perform post-search proteomic analyses. A GUI desktop environment, mzDesktop, provides access to multiplierz functionality through a user friendly interface. multiplierz is available for download from: https://github.com/BlaisProteomics/multiplierz; and mzDesktop is available for download from: https://sourceforge.net/projects/multiplierz/. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  7. Alternatively Spliced Homologous Exons Have Ancient Origins and Are Highly Expressed at the Protein Level

    PubMed Central

    Abascal, Federico; Ezkurdia, Iakes; Rodriguez-Rivas, Juan; Rodriguez, Jose Manuel; del Pozo, Angela; Vázquez, Jesús; Valencia, Alfonso; Tress, Michael L.

    2015-01-01

    Alternative splicing of messenger RNA can generate a wide variety of mature RNA transcripts, and these transcripts may produce protein isoforms with diverse cellular functions. While there is much supporting evidence for the expression of alternative transcripts, the same is not true for the alternatively spliced protein products. Large-scale mass spectroscopy experiments have identified evidence of alternative splicing at the protein level, but with conflicting results. Here we carried out a rigorous analysis of the peptide evidence from eight large-scale proteomics experiments to assess the scale of alternative splicing that is detectable by high-resolution mass spectroscopy. We find fewer splice events than would be expected: we identified peptides for almost 64% of human protein coding genes, but detected just 282 splice events. This data suggests that most genes have a single dominant isoform at the protein level. Many of the alternative isoforms that we could identify were only subtly different from the main splice isoform. Very few of the splice events identified at the protein level disrupted functional domains, in stark contrast to the two thirds of splice events annotated in the human genome that would lead to the loss or damage of functional domains. The most striking result was that more than 20% of the splice isoforms we identified were generated by substituting one homologous exon for another. This is significantly more than would be expected from the frequency of these events in the genome. These homologous exon substitution events were remarkably conserved—all the homologous exons we identified evolved over 460 million years ago—and eight of the fourteen tissue-specific splice isoforms we identified were generated from homologous exons. The combination of proteomics evidence, ancient origin and tissue-specific splicing indicates that isoforms generated from homologous exons may have important cellular roles. PMID:26061177

  8. Step-by-step strategy for protein enrichment and proteome characterisation of extracellular polymeric substances in wastewater treatment systems.

    PubMed

    Silva, Ana F; Carvalho, Gilda; Soares, Renata; Coelho, Ana V; Barreto Crespo, M Teresa

    2012-08-01

    Extracellular polymeric substances (EPS) are key to biomass aggregation and settleability in wastewater treatment systems. In membrane bioreactors (MBR), EPS are an important factor as they are considered to be largely responsible for membrane fouling. Proteins were shown to be the major component of EPS produced by activated sludge and to be correlated with the properties of the sludge, like settling, hydrophobicity and cell aggregation. Previous EPS proteomic studies of activated sludge revealed several problems, like the interference of other EPS molecules in protein analysis. In this study, a successful strategy was outlined to identify the proteins from soluble and bound EPS extracted from activated sludge of a lab-scale MBR. EPS samples were first subjected to pre-concentration through lyophilisation, centrifugal ultrafiltration or concentration with a dialysis membrane coated by a highly absorbent powder of polyacrylate-polyalcohol, preceded or not by a dialysis step. The highest protein concentration factors were achieved with the highly absorbent powder method without a previous dialysis step. Four protein precipitation methods were then tested: acetone, trichloroacetic acid (TCA), perchloric acid and a commercial kit. Protein profiles were compared in 4-12% sodium dodecyl sulphate polyacrylamide gel electrophoresis gels. Both acetone and TCA should be applied for the highest coverage for soluble EPS proteins, whereas TCA was the best method for bound EPS proteins. All visible bands of selected profiles were subjected to mass spectrometry analysis. A high number of proteins (25-32 for soluble EPS and 17 for bound EPS) were identified. As a conclusion of this study, a workflow is proposed for the successful proteome characterisation of soluble and bound EPS from activated sludge samples.

  9. N-acetylcysteine with apocynin prevents hyperoxaluria-induced mitochondrial protein perturbations in nephrolithiasis.

    PubMed

    Sharma, Minu; Sud, Amit; Kaur, Tanzeer; Tandon, Chanderdeep; Singla, S K

    2016-09-01

    Diminished mitochondrial activities were deemed to play an imperative role in surged oxidative damage perceived in hyperoxaluric renal tissue. Proteomics is particularly valuable to delineate the damaging effects of oxidative stress on mitochondrial proteins. The present study was designed to apply large-scale proteomics to describe systematically how mitochondrial proteins/pathways govern the renal damage and calcium oxalate crystal adhesion in hyperoxaluria. Furthermore, the potential beneficial effects of combinatorial therapy with N-acetylcysteine (NAC) and apocynin were studied to establish its credibility in the modulation of hyperoxaluria-induced alterations in mitochondrial proteins. In an experimental setup with male Wistar rats, five groups were designed for 9 d. At the end of the experiment, 24-h urine was collected and rats were euthanized. Urinary samples were analyzed for kidney injury marker and creatinine clearance. Transmission electron microscopy revealed distorted renal mitochondria in hyperoxaluria but combinatorial therapy restored the normal mitochondrial architecture. Mitochondria were isolated from renal tissue of experimental rats, and mitochondrial membrane potential was analyzed. The two-dimensional electrophoresis (2-DE) based comparative proteomic analysis was performed on proteins isolated from renal mitochondria. The results revealed eight differentially expressed mitochondrial proteins in hyperoxaluric rats, which were identified by Matrix-assisted laser desorption/ionization time of flight/time of flight (MALDI-TOF/TOF) analysis. Identified proteins included those involved in important mitochondrial processes, e.g. antioxidant defense, energy metabolism, and electron transport chain. Therapeutic administration of NAC with apocynin significantly expunged hyperoxaluria-induced discrepancy in the renal mitochondrial proteins, bringing them closer to the controls. The results provide insights to further understand the underlying mechanisms in the development of hyperoxaluria-induced nephrolithiasis and the therapeutic relevance of the combinatorial therapy.

  10. Proteome Analyses of Strains ATCC 51142 and PCC 7822 of the Diazotrophic Cyanobacterium Cyanothece sp under Culture Conditions Resulting in Enhanced H-2 Production

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aryal, Uma K.; Callister, Stephen J.; Mishra, Sujata

    2013-02-01

    Cultures of the cyanobacterial genus Cyanothece have been shown to produce high levels of biohydrogen. These strains are diazotrophic and undergo pronounced diurnal cycles when grown under N2-fixing conditions in light-dark cycles. We seek to better understand the way in which proteins respond to these diurnal changes and we performed quantitative proteome analysis of Cyanothece ATCC 51142 and PCC 7822 grown under 8 different nutritional conditions. Nitrogenase expression was limited to N2-fixing conditions, and in the absence of glycerol, nitrogenase gene expression was linked to the dark period. However, glycerol induced expression of nitrogenase during part of the light period, together with cytochrome c oxidase (Cox), glycogen phosphorylase (Glp), and glycolytic and pentose-phosphate pathway (PPP) enzymes. This indicated that nitrogenase expression in the light was facilitated via higher respiration and glycogen breakdown. Key enzymes of the Calvin cycle were inhibited in Cyanothece ATCC 51142 in the presence of glycerol under H2 producing conditions, suggesting a competition between these sources of carbon. However, in Cyanothece PCC 7822, the Calvin cycle still played a role in cofactor recycling during H2 production. Our data comprise the first comprehensive profiling of proteome changes in Cyanothece PCC 7822, and allows an in-depth comparative analysis of major physiological and biochemical processes that influence H2-production in both the strains. Our results revealed many previously uncharacterized proteins that may play a role in nitrogenase activity and in other metabolic pathways and may provide suitable targets for genetic manipulation that would lead to improvement of large scale H2 production.

  11. Large-Scale SRM Screen of Urothelial Bladder Cancer Candidate Biomarkers in Urine.

    PubMed

    Duriez, Elodie; Masselon, Christophe D; Mesmin, Cédric; Court, Magali; Demeure, Kevin; Allory, Yves; Malats, Núria; Matondo, Mariette; Radvanyi, François; Garin, Jérôme; Domon, Bruno

    2017-04-07

    Urothelial bladder cancer is a condition associated with high recurrence and substantial morbidity and mortality. Noninvasive urinary tests that would detect bladder cancer and tumor recurrence are required to significantly improve patient care. Over the past decade, numerous bladder cancer candidate biomarkers have been identified in the context of extensive proteomics or transcriptomics studies. To translate these findings into clinically useful biomarkers, the systematic evaluation of these candidates remains the bottleneck. Such evaluation involves large-scale quantitative LC-SRM (liquid chromatography-selected reaction monitoring) measurements, targeting hundreds of signature peptides by monitoring thousands of transitions in a single analysis. The design of highly multiplexed SRM analyses is driven by several factors: throughput, robustness, selectivity and sensitivity. Because of the complexity of the samples to be analyzed, some measurements (transitions) can suffer interference from coeluting isobaric species, resulting in biased or inconsistent estimated peptide/protein levels. Thus the assessment of the quality of SRM data is critical to allow flagging of these inconsistent data. We describe an efficient and robust method to process large SRM data sets, including the processing of the raw data, the detection of low-quality measurements, the normalization of the signals for each protein, and the estimation of protein levels. Using this methodology, a variety of proteins previously associated with bladder cancer have been assessed through the analysis of urine samples from a large cohort of cancer patients and corresponding controls in an effort to establish a priority list of most promising candidates to guide subsequent clinical validation studies.
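
    The abstract does not spell out how low-quality transitions are detected. One simple heuristic, shown here purely as an assumption and not as the authors' method, is that the relative contribution of each transition to a peptide's total signal should be stable across samples, so a sample whose transition fractions deviate strongly from the cohort median may contain an interfered transition. The function name, the tolerance of 0.15 and the peak areas below are all hypothetical.

```python
from statistics import median


def flag_interfered_transitions(areas_by_sample, tolerance=0.15):
    """Return {sample: transition} for samples with one clearly deviating transition.

    areas_by_sample maps sample -> {transition: peak area}; every sample is
    assumed to report the same transitions and a non-zero total area.
    """
    # Convert peak areas into each transition's share of the peptide signal.
    fractions = {}
    for sample, areas in areas_by_sample.items():
        total = sum(areas.values())
        fractions[sample] = {t: a / total for t, a in areas.items()}

    transitions = list(next(iter(fractions.values())))
    # Median share across samples serves as the reference pattern.
    reference = {t: median(f[t] for f in fractions.values()) for t in transitions}

    flagged = {}
    for sample, f in fractions.items():
        # Interference in one transition shifts every share, so only the
        # largest deviator per sample is reported.
        worst = max(transitions, key=lambda t: abs(f[t] - reference[t]))
        if abs(f[worst] - reference[worst]) > tolerance:
            flagged[sample] = worst
    return flagged


# Made-up peak areas: sample s4 has an inflated y5 transition.
areas = {
    "s1": {"y4": 500.0, "y5": 300.0, "y6": 200.0},
    "s2": {"y4": 1000.0, "y5": 600.0, "y6": 400.0},
    "s3": {"y4": 250.0, "y5": 150.0, "y6": 100.0},
    "s4": {"y4": 500.0, "y5": 900.0, "y6": 200.0},
}
flagged = flag_interfered_transitions(areas)
```

    Working with fractions rather than raw areas makes the check independent of how much peptide each sample contains, which matters when samples differ widely in concentration.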

  12. Proteomics meets blue biotechnology: a wealth of novelties and opportunities.

    PubMed

    Hartmann, Erica M; Durighello, Emie; Pible, Olivier; Nogales, Balbina; Beltrametti, Fabrizio; Bosch, Rafael; Christie-Oleza, Joseph A; Armengaud, Jean

    2014-10-01

    Blue biotechnology, in which aquatic environments provide the inspiration for various products such as food additives, aquaculture, biosensors, green chemistry, bioenergy, and pharmaceuticals, holds enormous promise. Large-scale efforts to sequence aquatic genomes and metagenomes, as well as campaigns to isolate new organisms and culture-based screenings, are helping to push the boundaries of known organisms. Mass spectrometry-based proteomics can complement 16S gene sequencing in the effort to discover new organisms of potential relevance to blue biotechnology by facilitating the rapid screening of microbial isolates and by providing in depth profiles of the proteomes and metaproteomes of marine organisms, both model cultivable isolates and, more recently, exotic non-cultivable species and communities. Proteomics has already contributed to blue biotechnology by identifying aquatic proteins with potential applications to food fermentation, the textile industry, and biomedical drug development. In this review, we discuss historical developments in blue biotechnology, the current limitations to the known marine biosphere, and the ways in which mass spectrometry can expand that knowledge. We further speculate about directions that research in blue biotechnology will take given current and near-future technological advancements in mass spectrometry. Copyright © 2014 Elsevier B.V. All rights reserved.

  13. Generation of High-Quality SWATH® Acquisition Data for Label-free Quantitative Proteomics Studies Using TripleTOF® Mass Spectrometers

    PubMed Central

    Schilling, Birgit; Gibson, Bradford W.; Hunter, Christie L.

    2017-01-01

    Data-independent acquisition is a powerful mass spectrometry technique that enables comprehensive MS and MS/MS analysis of all detectable species, providing an information rich data file that can be mined deeply. Here, we describe how to acquire high-quality SWATH® Acquisition data to be used for large quantitative proteomic studies. We specifically focus on using variable sized Q1 windows for acquisition of MS/MS data for generating higher specificity quantitative data. PMID:28188533
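
    The abstract mentions variable sized Q1 windows without giving a rule for choosing them. A minimal sketch of one possible rule, assumed here for illustration only, is to place window edges so that each window contains roughly the same number of precursors: narrow windows where precursors are dense and wide windows where they are sparse. The function name and the m/z values are hypothetical.

```python
def variable_windows(precursor_mz, n_windows, mz_min, mz_max):
    """Split [mz_min, mz_max] into n_windows with roughly equal precursor counts."""
    mzs = sorted(m for m in precursor_mz if mz_min <= m <= mz_max)
    if n_windows < 1 or len(mzs) < n_windows:
        raise ValueError("need at least one precursor per window")

    edges = [mz_min]
    for i in range(1, n_windows):
        idx = i * len(mzs) // n_windows
        # Put the edge midway between the two precursors it separates.
        edges.append((mzs[idx - 1] + mzs[idx]) / 2)
    edges.append(mz_max)
    return list(zip(edges[:-1], edges[1:]))


# Made-up precursor m/z values, dense at low m/z and sparse at high m/z.
precursors = [405, 410, 420, 430, 450, 500, 600, 900]
windows = variable_windows(precursors, n_windows=4, mz_min=400, mz_max=1000)
widths = [hi - lo for lo, hi in windows]
```

    In this toy example the window widths grow from 15 to 450 m/z units as precursor density falls. Duplicate m/z values at an edge would yield a zero-width window, so a real design would need additional handling.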

  14. Proteomic data analysis of glioma cancer stem-cell lines based on novel nonlinear dimensional data reduction techniques

    NASA Astrophysics Data System (ADS)

    Lespinats, Sylvain; Pinker-Domenig, Katja; Wengert, Georg; Houben, Ivo; Lobbes, Marc; Stadlbauer, Andreas; Meyer-Bäse, Anke

    2016-05-01

    Glioma-derived cancer stem cells (GSCs) are tumor-initiating cells and may be refractory to radiation and chemotherapy and thus have important implications for tumor biology and therapeutics. The analysis and interpretation of large proteomic data sets requires the development of new data mining and visualization approaches. Traditional techniques are insufficient to interpret and visualize these resulting experimental data. The emphasis of this paper lies in the application of novel approaches for the visualization, clustering and projection representation to unveil hidden data structures relevant for the accurate interpretation of biological experiments. These qualitative and quantitative methods are applied to the proteomic analysis of data sets derived from the GSCs. The achieved clustering and visualization results provide a more detailed insight into the protein-level fold changes and putative upstream regulators for the GSCs. However, the extracted molecular information is insufficient for classifying GSCs and paving the way to improved therapeutics for heterogeneous glioma.

  15. Principles of proteome allocation are revealed using proteomic data and genome-scale models

    PubMed Central

    Yang, Laurence; Yurkovich, James T.; Lloyd, Colton J.; Ebrahim, Ali; Saunders, Michael A.; Palsson, Bernhard O.

    2016-01-01

    Integrating omics data to refine or make context-specific models is an active field of constraint-based modeling. Proteomics now cover over 95% of the Escherichia coli proteome by mass. Genome-scale models of Metabolism and macromolecular Expression (ME) compute proteome allocation linked to metabolism and fitness. Using proteomics data, we formulated allocation constraints for key proteome sectors in the ME model. The resulting calibrated model effectively computed the “generalist” (wild-type) E. coli proteome and phenotype across diverse growth environments. Across 15 growth conditions, prediction errors for growth rate and metabolic fluxes were 69% and 14% lower, respectively. The sector-constrained ME model thus represents a generalist ME model reflecting both growth rate maximization and “hedging” against uncertain environments and stresses, as indicated by significant enrichment of these sectors for the general stress response sigma factor σS. Finally, the sector constraints represent a general formalism for integrating omics data from any experimental condition into constraint-based ME models. The constraints can be fine-grained (individual proteins) or coarse-grained (functionally-related protein groups) as demonstrated here. This flexible formalism provides an accessible approach for narrowing the gap between the complexity captured by omics data and governing principles of proteome allocation described by systems-level models. PMID:27857205

  16. Principles of proteome allocation are revealed using proteomic data and genome-scale models

    DOE PAGES

    Yang, Laurence; Yurkovich, James T.; Lloyd, Colton J.; ...

    2016-11-18

    Integrating omics data to refine or make context-specific models is an active field of constraint-based modeling. Proteomics now cover over 95% of the Escherichia coli proteome by mass. Genome-scale models of Metabolism and macromolecular Expression (ME) compute proteome allocation linked to metabolism and fitness. Using proteomics data, we formulated allocation constraints for key proteome sectors in the ME model. The resulting calibrated model effectively computed the “generalist” (wild-type) E. coli proteome and phenotype across diverse growth environments. Across 15 growth conditions, prediction errors for growth rate and metabolic fluxes were 69% and 14% lower, respectively. The sector-constrained ME model thus represents a generalist ME model reflecting both growth rate maximization and “hedging” against uncertain environments and stresses, as indicated by significant enrichment of these sectors for the general stress response sigma factor σS. Finally, the sector constraints represent a general formalism for integrating omics data from any experimental condition into constraint-based ME models. The constraints can be fine-grained (individual proteins) or coarse-grained (functionally-related protein groups) as demonstrated here. Furthermore, this flexible formalism provides an accessible approach for narrowing the gap between the complexity captured by omics data and governing principles of proteome allocation described by systems-level models.

  17. Novel phage group infecting Lactobacillus delbrueckii subsp. lactis, as revealed by genomic and proteomic analysis of bacteriophage Ldl1.

    PubMed

    Casey, Eoghan; Mahony, Jennifer; Neve, Horst; Noben, Jean-Paul; Dal Bello, Fabio; van Sinderen, Douwe

    2015-02-01

    Ldl1 is a virulent phage infecting the dairy starter Lactobacillus delbrueckii subsp. lactis LdlS. Electron microscopy analysis revealed that this phage exhibits a large head and a long tail and bears little resemblance to other characterized phages infecting Lactobacillus delbrueckii. In vitro propagation of this phage revealed a latent period of 30 to 40 min and a burst size of 59.9 +/- 1.9 phage particles. Comparative genomic and proteomic analyses showed remarkable similarity between the genome of Ldl1 and that of Lactobacillus plantarum phage ATCC 8014-B2. The genomic and proteomic characteristics of Ldl1 demonstrate that this phage does not belong to any of the four previously recognized L. delbrueckii phage groups, necessitating the creation of a new group, called group e, thus adding to the knowledge on the diversity of phages targeting strains of this industrially important lactic acid bacterial species.

  18. Abundant Lysine Methylation and N-Terminal Acetylation in Sulfolobus islandicus Revealed by Bottom-Up and Top-Down Proteomics

    PubMed Central

    Vorontsov, Egor A.; Rensen, Elena; Prangishvili, David; Krupovic, Mart; Chamot-Rooke, Julia

    2016-01-01

    Protein post-translational methylation has been reported to occur in archaea, including members of the genus Sulfolobus, but has never been characterized on a proteome-wide scale. Among important Sulfolobus proteins carrying such modification are the chromatin proteins that have been described to be methylated on lysine side chains, resembling eukaryotic histones in that aspect. To get more insight into the extent of this modification and its dynamics during the different growth steps of the thermoacidophilic archaeon S. islandicus LAL14/1, we performed a global and deep proteomic analysis using a combination of high-throughput bottom-up and top-down approaches on a single high-resolution mass spectrometer. 1,931 methylation sites on 751 proteins were found by the bottom-up analysis, with methylation sites on 526 proteins monitored throughout three cell culture growth stages: early-exponential, mid-exponential, and stationary. The top-down analysis revealed 3,978 proteoforms arising from 681 proteins, including 292 methylated proteoforms, 85 of which were comprehensively characterized. Methylated proteoforms of the five chromatin proteins (Alba1, Alba2, Cren7, Sul7d1, Sul7d2) were fully characterized by a combination of bottom-up and top-down data. The top-down analysis also revealed an increase of methylation during cell growth for two chromatin proteins, which had not been evidenced by bottom-up. These results shed new light on the ubiquitous lysine methylation throughout the S. islandicus proteome. Furthermore, we found that S. islandicus proteins are frequently acetylated at the N terminus, following the removal of the N-terminal methionine. This study highlights the great value of combining bottom-up and top-down proteomics for obtaining an unprecedented level of accuracy in detecting differentially modified intact proteoforms. The data have been deposited to the ProteomeXchange with identifiers PXD003074 and PXD004179. PMID:27555370

  19. The “Dark Side” of Food Stuff Proteomics: The CPLL-Marshals Investigate

    PubMed Central

    Righetti, Pier Giorgio; Fasoli, Elisa; D’Amato, Alfonsina; Boschetti, Egisto

    2014-01-01

    The present review deals with analysis of the proteome of animal and plant-derived food stuff, as well as of non-alcoholic and alcoholic beverages. The survey is limited to those systems investigated with the help of combinatorial peptide ligand libraries, a most powerful technique allowing access to low- to very-low-abundance proteins, i.e., to those proteins that might characterize univocally a given biological system and, in the case of commercial food preparations, attest their genuineness or adulteration. Among animal foods the analysis of cow’s and donkey’s milk is reported, together with the proteomic composition of egg white and yolk, as well as of honey, considered as a hybrid between floral and animal origin. In terms of plant and fruits, a survey is offered of spinach, artichoke, banana, avocado, mango and lemon proteomics, considered as recalcitrant tissues in that small amounts of proteins are dispersed into a large body of plant polymers and metabolites. As examples of non-alcoholic beverages, ginger ale, coconut milk, a cola drink, almond milk and orgeat syrup are analyzed. Finally, the trace proteome of white and red wines, beer and aperitifs is reported, with the aim of tracing the industrial manipulations and herbal usage prior to their commercialization. PMID:28234315

  20. A proteomic analysis of the chromoplasts isolated from sweet orange fruits [Citrus sinensis (L.) Osbeck].

    PubMed

    Zeng, Yunliu; Pan, Zhiyong; Ding, Yuduan; Zhu, Andan; Cao, Hongbo; Xu, Qiang; Deng, Xiuxin

    2011-11-01

    Here, a comprehensive proteomic analysis of the chromoplasts purified from sweet orange using Nycodenz density gradient centrifugation is reported. A GeLC-MS/MS shotgun approach was used to identify the proteins of pooled chromoplast samples. A total of 493 proteins were identified from purified chromoplasts, of which 418 are putative plastid proteins based on in silico sequence homology and functional analyses. Based on the predicted functions of these identified plastid proteins, a large proportion (∼60%) of the chromoplast proteome of sweet orange is constituted by proteins involved in carbohydrate metabolism, amino acid/protein synthesis, and secondary metabolism. Of note, HDS (hydroxymethylbutenyl 4-diphosphate synthase), PAP (plastid-lipid-associated protein), and psHSPs (plastid small heat shock proteins) involved in the synthesis or storage of carotenoid and stress response are among the most abundant proteins identified. A comparison of chromoplast proteomes between sweet orange and tomato suggested a high level of conservation in a broad range of metabolic pathways. However, the citrus chromoplast was characterized by more extensive carotenoid synthesis, extensive amino acid synthesis without nitrogen assimilation, and evidence for lipid metabolism concerning jasmonic acid synthesis. In conclusion, this study provides an insight into the major metabolic pathways as well as some unique characteristics of the sweet orange chromoplasts at the whole proteome level.

  1. Directed proteomic analysis of the human nucleolus.

    PubMed

    Andersen, Jens S; Lyon, Carol E; Fox, Archa H; Leung, Anthony K L; Lam, Yun Wah; Steen, Hanno; Mann, Matthias; Lamond, Angus I

    2002-01-08

    The nucleolus is a subnuclear organelle containing the ribosomal RNA gene clusters and ribosome biogenesis factors. Recent studies suggest it may also have roles in RNA transport, RNA modification, and cell cycle regulation. Despite over 150 years of research into nucleoli, many aspects of their structure and function remain uncharacterized. We report a proteomic analysis of human nucleoli. Using a combination of mass spectrometry (MS) and sequence database searches, including online analysis of the draft human genome sequence, 271 proteins were identified. Over 30% of the nucleolar proteins were encoded by novel or uncharacterized genes, while the known proteins included several unexpected factors with no previously known nucleolar functions. MS analysis of nucleoli isolated from HeLa cells in which transcription had been inhibited showed that a subset of proteins was enriched. These data highlight the dynamic nature of the nucleolar proteome and show that proteins can either associate with nucleoli transiently or accumulate only under specific metabolic conditions. This extensive proteomic analysis shows that nucleoli have a surprisingly large protein complexity. The many novel factors and separate classes of proteins identified support the view that the nucleolus may perform additional functions beyond its known role in ribosome subunit biogenesis. The data also show that the protein composition of nucleoli is not static and can alter significantly in response to the metabolic state of the cell.

  2. Proteome Analysis of Peroxisomes from Etiolated Arabidopsis Seedlings Identifies a Peroxisomal Protease Involved in β-Oxidation and Development

    PubMed Central

    Quan, Sheng; Yang, Pingfang; Cassin-Ross, Gaëlle; Kaur, Navneet; Switzenberg, Robert; Aung, Kyaw; Li, Jiying; Hu, Jianping

    2013-01-01

    Plant peroxisomes are highly dynamic organelles that mediate a suite of metabolic processes crucial to development. Peroxisomes in seeds/dark-grown seedlings and in photosynthetic tissues constitute two major subtypes of plant peroxisomes, which had been postulated to contain distinct primary biochemical properties. Multiple in-depth proteomic analyses had been performed on leaf peroxisomes, yet the major makeup of peroxisomes in seeds or dark-grown seedlings remained unclear. To compare the metabolic pathways of the two dominant plant peroxisomal subtypes and discover new peroxisomal proteins that function specifically during seed germination, we performed proteomic analysis of peroxisomes from etiolated Arabidopsis (Arabidopsis thaliana) seedlings. The detection of 77 peroxisomal proteins allowed us to perform comparative analysis with the peroxisomal proteome of green leaves, which revealed a large overlap between these two primary peroxisomal variants. Subcellular targeting analysis by fluorescence microscopy validated around 10 new peroxisomal proteins in Arabidopsis. Mutant analysis suggested the role of the cysteine protease RESPONSE TO DROUGHT21A-LIKE1 in β-oxidation, seed germination, and growth. This work provides a much-needed road map of a major type of plant peroxisome and has established a basis for future investigations of peroxisomal proteolytic processes to understand their roles in development and in plant interaction with the environment. PMID:24130194

  3. Large-scale label-free comparative proteomics analysis of polo-like kinase 1 inhibition via the small-molecule inhibitor BI 6727 (Volasertib) in BRAF(V600E) mutant melanoma cells.

    PubMed

    Cholewa, Brian D; Pellitteri-Hahn, Molly C; Scarlett, Cameron O; Ahmad, Nihal

    2014-11-07

    Polo-like kinase 1 (Plk1) is a serine/threonine kinase that plays a key role during the cell cycle by regulating mitotic entry, progression, and exit. Plk1 is overexpressed in a variety of human cancers and is essential to sustained oncogenic proliferation, thus making Plk1 an attractive therapeutic target. However, the clinical efficacy of Plk1 inhibition has not emulated the preclinical success, stressing an urgent need for a better understanding of Plk1 signaling. This study addresses that need by utilizing a quantitative proteomics strategy to compare the proteome of BRAF(V600E) mutant melanoma cells following treatment with the Plk1-specific inhibitor BI 6727. Employing label-free nano-LC-MS/MS technology on a Q-exactive followed by SIEVE processing, we identified more than 20 proteins of interest, many of which have not been previously associated with Plk1 signaling. Here we report the down-regulation of multiple metabolic proteins with an associated decrease in cellular metabolism, as assessed by lactate and NAD levels. Furthermore, we have also identified the down-regulation of multiple proteasomal subunits, resulting in a significant decrease in 20S proteasome activity. Additionally, we have identified a novel association between Plk1 and p53 through heterogeneous ribonucleoprotein C1/C2 (hnRNPC), thus providing valuable insight into Plk1's role in cancer cell survival.

  4. Translational value of liquid chromatography coupled with tandem mass spectrometry-based quantitative proteomics for in vitro-in vivo extrapolation of drug metabolism and transport and considerations in selecting appropriate techniques.

    PubMed

    Al Feteisi, Hajar; Achour, Brahim; Rostami-Hodjegan, Amin; Barber, Jill

    2015-01-01

    Drug-metabolizing enzymes and transporters play an important role in drug absorption, distribution, metabolism and excretion and, consequently, they influence drug efficacy and toxicity. Quantification of drug-metabolizing enzymes and transporters in various tissues is therefore essential for comprehensive elucidation of drug absorption, distribution, metabolism and excretion. Recent advances in liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) have improved the quantification of pharmacologically relevant proteins. This report presents an overview of mass spectrometry-based methods currently used for the quantification of drug-metabolizing enzymes and drug transporters, mainly focusing on applications and cost associated with various quantitative strategies based on stable isotope-labeled standards (absolute quantification peptide standards, quantification concatemers, protein standards for absolute quantification) and label-free analysis. In mass spectrometry, there is no simple relationship between signal intensity and analyte concentration. Proteomic strategies are therefore complex and several factors need to be considered when selecting the most appropriate method for an intended application, including the number of proteins and samples. Quantitative strategies require appropriate mass spectrometry platforms, yet choice is often limited by the availability of appropriate instrumentation. Quantitative proteomics research requires specialist practical skills and there is a pressing need to dedicate more effort and investment to training personnel in this area. Large-scale multicenter collaborations are also needed to standardize quantitative strategies in order to improve physiologically based pharmacokinetic models.

  5. Preparation of the low molecular weight serum proteome for mass spectrometry analysis.

    PubMed

    Waybright, Timothy J; Chan, King C; Veenstra, Timothy D; Xiao, Zhen

    2013-01-01

    The discovery of viable biomarkers or indicators of disease states is complicated by the inherent complexity of the chosen biological specimen. Every sample, whether it is serum, plasma, urine, tissue, cells, or a host of others, contains thousands of large and small components, each interacting in multiple ways. The need to concentrate on a group of these components to narrow the focus on a potential biomarker candidate becomes, out of necessity, a priority, especially in the search for immune-related low molecular weight serum biomarkers. One such method in the field of proteomics is to divide the sample proteome into groups based on the size of the protein, analyze each group, and mine the data for statistically significant items. This chapter details a portion of this method, concentrating on a method for fractionating and analyzing the low molecular weight proteome of human serum.

  6. Optimized approaches for quantification of drug transporters in tissues and cells by MRM proteomics.

    PubMed

    Prasad, Bhagwat; Unadkat, Jashvant D

    2014-07-01

    Drug transporter expression in tissues (in vivo) usually differs from that in cell lines used to measure transporter activity (in vitro). Therefore, quantification of transporter expression in tissues and cell lines is important to develop scaling factor for in vitro to in vivo extrapolation (IVIVE) of transporter-mediated drug disposition. Since traditional immunoquantification methods are semiquantitative, targeted proteomics is now emerging as a superior method to quantify proteins, including membrane transporters. This superiority is derived from the selectivity, precision, accuracy, and speed of analysis by liquid chromatography tandem mass spectrometry (LC-MS/MS) in multiple reaction monitoring (MRM) mode. Moreover, LC-MS/MS proteomics has broader applicability because it does not require selective antibodies for individual proteins. There are a number of recent research and review papers that discuss the use of LC-MS/MS for transporter quantification. Here, we have compiled from the literature various elements of MRM proteomics to provide a comprehensive systematic strategy to quantify drug transporters. This review emphasizes practical aspects and challenges in surrogate peptide selection, peptide qualification, peptide synthesis and characterization, membrane protein isolation, protein digestion, sample preparation, LC-MS/MS parameter optimization, method validation, and sample analysis. In particular, bioinformatic tools used in method development and sample analysis are discussed in detail. Various pre-analytical and analytical sources of variability that should be considered during transporter quantification are highlighted. All these steps are illustrated using P-glycoprotein (P-gp) as a case example. Greater use of quantitative transporter proteomics will lead to a better understanding of the role of drug transporters in drug disposition.

  7. Proteomics profiling of interactome dynamics by colocalisation analysis (COLA).

    PubMed

    Mardakheh, Faraz K; Sailem, Heba Z; Kümper, Sandra; Tape, Christopher J; McCully, Ryan R; Paul, Angela; Anjomani-Virmouni, Sara; Jørgensen, Claus; Poulogiannis, George; Marshall, Christopher J; Bakal, Chris

    2016-12-20

    Localisation and protein function are intimately linked in eukaryotes, as proteins are localised to specific compartments where they come into proximity of other functionally relevant proteins. Significant co-localisation of two proteins can therefore be indicative of their functional association. We here present COLA, a proteomics based strategy coupled with a bioinformatics framework to detect protein-protein co-localisations on a global scale. COLA reveals functional interactions by matching proteins with significant similarity in their subcellular localisation signatures. The rapid nature of COLA allows mapping of interactome dynamics across different conditions or treatments with high precision.
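    The idea of matching proteins by the similarity of their subcellular localisation signatures can be sketched with a simple profile correlation. This is only an illustration under stated assumptions, not the COLA framework itself: the signatures are made-up abundance fractions across four hypothetical subcellular fractions, the names `pearson` and `colocalised_pairs` are invented here, and a fixed correlation threshold stands in for the statistical significance testing that the abstract mentions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def colocalised_pairs(profiles, threshold=0.9):
    """Return (protein_a, protein_b, r) for pairs whose profiles correlate at or above threshold.

    The threshold is an assumed stand-in for a proper significance test.
    """
    pairs = []
    for (name_a, prof_a), (name_b, prof_b) in combinations(sorted(profiles.items()), 2):
        r = pearson(prof_a, prof_b)
        if r >= threshold:
            pairs.append((name_a, name_b, r))
    return pairs


# Hypothetical signatures: fraction of each protein found in four subcellular fractions.
profiles = {
    "protA": [0.8, 0.1, 0.1, 0.0],
    "protB": [0.7, 0.2, 0.1, 0.0],
    "protC": [0.0, 0.1, 0.2, 0.7],
}
pairs = colocalised_pairs(profiles)
print([(a, b) for a, b, _ in pairs])
```

    Here only protA and protB are paired, because both are concentrated in the first fraction. Comparing the pair lists obtained under different conditions or treatments would be one simple way to look at changes in co-localisation.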

  8. Classification of Complete Proteomes of Different Organisms and Protein Sets Based on Their Protein Distributions in Terms of Some Key Attributes of Proteins

    DOE PAGES

    Guo, Hao-Bo; Ma, Yue; Tuskan, Gerald A.; ...

    2018-01-01

    The existence of complete genome sequences makes it important to develop different approaches for classification of large-scale data sets and to make extraction of biological insights easier. Here, we propose an approach for classification of complete proteomes/protein sets based on protein distributions on some basic attributes. We demonstrate the usefulness of this approach by determining protein distributions in terms of two attributes: protein lengths (L) and protein intrinsic disorder contents (ID). The protein distributions based on L and ID are surveyed for representative proteome organisms and protein sets from the three domains of life. The two-dimensional maps (designated as fingerprints here) from the protein distribution densities in the LD space defined by ln(L) and ID are then constructed. The fingerprints for different organisms and protein sets are found to be distinct from one another, and they can therefore be used for comparative studies. As a test case, phylogenetic trees have been constructed based on the protein distribution densities in the fingerprints of proteomes of organisms without performing any protein sequence comparison and alignments. The phylogenetic trees generated are biologically meaningful, demonstrating that the protein distributions in the LD space may serve as unique phylogenetic signals of the organisms at the proteome level.
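    A fingerprint of this kind can be approximated as a normalised two-dimensional histogram over ln(L) and ID. The sketch below is a minimal illustration, assuming that ID is expressed as a fraction between 0 and 1 and that the bin edges are chosen by the user; the function names, the bin edges, the example proteins, and the L1 distance between fingerprints are assumptions for illustration and are not the authors' published procedure.

```python
from bisect import bisect_right
from math import log


def bin_index(value, edges):
    """Index of the bin containing value, or None if it lies outside the edges."""
    if value < edges[0] or value > edges[-1]:
        return None
    # The right-most edge is included in the last bin.
    return min(bisect_right(edges, value) - 1, len(edges) - 2)


def fingerprint(proteins, lnl_edges, id_edges):
    """Normalised 2D histogram of (ln(length), disorder fraction) pairs."""
    grid = [[0.0] * (len(id_edges) - 1) for _ in range(len(lnl_edges) - 1)]
    total = 0
    for length, disorder in proteins:
        i = bin_index(log(length), lnl_edges)
        j = bin_index(disorder, id_edges)
        if i is None or j is None:
            continue  # skip proteins outside the chosen ranges
        grid[i][j] += 1
        total += 1
    if total:
        grid = [[count / total for count in row] for row in grid]
    return grid


def fingerprint_distance(fp_a, fp_b):
    """L1 distance between two fingerprints built on the same bins (assumed metric)."""
    return sum(abs(a - b) for row_a, row_b in zip(fp_a, fp_b) for a, b in zip(row_a, row_b))


# Made-up proteins as (length in residues, disorder fraction).
proteins = [(100, 0.10), (120, 0.15), (1000, 0.80), (50, 0.05)]
fp = fingerprint(proteins, lnl_edges=[3.0, 5.0, 7.0], id_edges=[0.0, 0.5, 1.0])
print(fp)
```

    A matrix of pairwise `fingerprint_distance` values between proteomes could then be passed to any standard distance-based tree-building method, which is one plausible route to trees that need no sequence alignment.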

  9. Proteomic Approaches to Quantify Cysteine Reversible Modifications in Aging and Neurodegenerative Diseases

    PubMed Central

    Gu, Liqing; Robinson, Renã A. S.

    2016-01-01

    Cysteine is a highly reactive amino acid and is subject to a variety of reversible post-translational modifications (PTMs), including nitrosylation, glutathionylation, palmitoylation, as well as formation of sulfenic acid and disulfides. These modifications are not only involved in normal biological activities, such as enzymatic catalysis, redox signaling and cellular homeostasis, but can also be the result of oxidative damage. Especially in aging and neurodegenerative diseases, oxidative stress leads to aberrant cysteine oxidations that affect protein structure and function, leading to neurodegeneration as well as other detrimental effects. Methods that can identify cysteine modifications by type, including the site of modification, as well as the relative stoichiometry of the modification, can be very helpful for understanding the role of the thiol proteome and redox homeostasis in the context of disease. Reversible cysteine modifications, however, are challenging to investigate because they are low in abundance, diverse, and labile, especially under endogenous conditions. Thanks to the development of redox proteomic approaches, large-scale quantification of reversible cysteine modifications is possible. These approaches cover a range of strategies to enrich, identify, and quantify reversible cysteine modifications from biological samples. This review will focus on nongel-based redox proteomics workflows that give quantitative information about cysteine PTMs and highlight how these strategies have been useful for investigating the redox thiol proteome in aging and neurodegenerative diseases. PMID:27666938

  10. Classification of Complete Proteomes of Different Organisms and Protein Sets Based on Their Protein Distributions in Terms of Some Key Attributes of Proteins

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Guo, Hao-Bo; Ma, Yue; Tuskan, Gerald A.

    The existence of complete genome sequences makes it important to develop different approaches for classification of large-scale data sets and to make extraction of biological insights easier. Here, we propose an approach for classification of complete proteomes/protein sets based on protein distributions on some basic attributes. We demonstrate the usefulness of this approach by determining protein distributions in terms of two attributes: protein lengths (L) and protein intrinsic disorder contents (ID). The protein distributions based on L and ID are surveyed for representative proteome organisms and protein sets from the three domains of life. The two-dimensional maps (designated as fingerprints here) from the protein distribution densities in the LD space defined by ln(L) and ID are then constructed. The fingerprints for different organisms and protein sets are found to be distinct from one another, and they can therefore be used for comparative studies. As a test case, phylogenetic trees have been constructed based on the protein distribution densities in the fingerprints of proteomes of organisms without performing any protein sequence comparison and alignments. The phylogenetic trees generated are biologically meaningful, demonstrating that the protein distributions in the LD space may serve as unique phylogenetic signals of the organisms at the proteome level.

  11. PATIKA: an integrated visual environment for collaborative construction and analysis of cellular pathways.

    PubMed

    Demir, E; Babur, O; Dogrusoz, U; Gursoy, A; Nisanci, G; Cetin-Atalay, R; Ozturk, M

    2002-07-01

    Availability of the sequences of entire genomes shifts scientific curiosity towards the large-scale identification of genome function, as in genome studies. In the near future, data produced about cellular processes at the molecular level will accumulate at an accelerating rate as a result of proteomics studies. In this regard, it is essential to develop tools for storing, integrating, accessing, and analyzing these data effectively. We define an ontology for a comprehensive representation of cellular events. The ontology presented here enables integration of fragmented or incomplete pathway information and supports manipulation and incorporation of the stored data, as well as multiple levels of abstraction. Based on this ontology, we present the architecture of an integrated environment named Patika (Pathway Analysis Tool for Integration and Knowledge Acquisition). Patika is composed of a server-side, scalable, object-oriented database and client-side editors to provide an integrated, multi-user environment for visualizing and manipulating networks of cellular events. This tool features automated pathway layout, functional computation support, advanced querying and a user-friendly graphical interface. We expect that Patika will be a valuable tool for rapid knowledge acquisition, interpretation of microarray-generated large-scale data, disease gene identification, and drug development. A prototype of Patika is available upon request from the authors.

  12. Large-scale protein analysis of European beech trees following four vegetation periods of twice ambient ozone exposure.

    PubMed

    Kerner, René; Delgado-Eckert, Edgar; Ernst, Dieter; Dupuy, Jean-William; Grams, Thorsten E E; Barbro Winkler, J; Lindermayr, Christian; Müller-Starck, Gerhard

    2014-09-23

    In the present study, we performed a large-scale protein analysis based on 2-DE DIGE to examine the effects of ozone on the leaves of juvenile European beech (Fagus sylvatica L.), one of the most important deciduous tree species in Central Europe. To this end, beech trees were grown under field conditions and subjected to ambient and twice ambient ozone concentrations during the vegetation periods of four consecutive years. The twice ambient ozone concentration altered the abundance of 237 protein spots, which showed relative ratios higher than 30% compared to the ambient control trees. A total of 74 protein spots were subjected to mass spectrometry identification (LC-MS/MS), followed by homology-driven searches. The differentially expressed proteins participate in key biological processes including the Calvin cycle and photosynthesis, carbon metabolism, defense- and stress-related responses, detoxification mechanisms, protein folding and degradation, and mechanisms involved in senescence. The ozone-induced responses provide evidence of a changing carbon metabolism and counteraction against increased levels of reactive oxygen species. This study provides useful information on how European beech, an economically and ecologically important tree species, reacts on the molecular level to increased ozone concentrations expected in the near future. The main emphasis in the present study was placed on identifying differentially abundant proteins after long-term ozone exposure under climatically realistic settings, rather than short-term responses or reactions under laboratory conditions. Additionally, using nursery-grown beech trees, we took into account the natural genotypic variation of this species. As such, the results presented here provide information on molecular responses to ozone in an experimental plant system at very close to natural conditions. Furthermore, this proteomic approach was supported by previous studies on the present experiment. Ultimately, the combination of this proteomic approach with several approaches including transcriptomics, analysis of non-structural carbohydrates, and morphological effects contributes to a more global picture of how beech trees react under increased ozone concentrations. Copyright © 2014. Published by Elsevier B.V.

  13. hEIDI: An Intuitive Application Tool To Organize and Treat Large-Scale Proteomics Data.

    PubMed

    Hesse, Anne-Marie; Dupierris, Véronique; Adam, Claire; Court, Magali; Barthe, Damien; Emadali, Anouk; Masselon, Christophe; Ferro, Myriam; Bruley, Christophe

    2016-10-07

    Advances in high-throughput proteomics have led to a rapid increase in the number, size, and complexity of the associated data sets. Managing and extracting reliable information from such large series of data sets require the use of dedicated software organized in a consistent pipeline to reduce, validate, exploit, and ultimately export data. The compilation of multiple mass-spectrometry-based identification and quantification results obtained in the context of a large-scale project represents a real challenge for developers of bioinformatics solutions. In response to this challenge, we developed a dedicated software suite called hEIDI to manage and combine both identifications and semiquantitative data related to multiple LC-MS/MS analyses. This paper describes how, through a user-friendly interface, hEIDI can be used to compile analyses and retrieve lists of nonredundant protein groups. Moreover, hEIDI allows direct comparison of series of analyses, on the basis of protein groups, while ensuring consistent protein inference and also computing spectral counts. hEIDI ensures that validated results are compliant with MIAPE guidelines as all information related to samples and results is stored in appropriate databases. Thanks to the database structure, validated results generated within hEIDI can be easily exported in the PRIDE XML format for subsequent publication. hEIDI can be downloaded from http://biodev.extra.cea.fr/docs/heidi .

  14. Direct Detection of Alternative Open Reading Frames Translation Products in Human Significantly Expands the Proteome

    PubMed Central

    Vanderperre, Benoît; Lucier, Jean-François; Bissonnette, Cyntia; Motard, Julie; Tremblay, Guillaume; Vanderperre, Solène; Wisztorski, Maxence; Salzet, Michel; Boisvert, François-Michel; Roucou, Xavier

    2013-01-01

    A fully mature mRNA is usually associated to a reference open reading frame encoding a single protein. Yet, mature mRNAs contain unconventional alternative open reading frames (AltORFs) located in untranslated regions (UTRs) or overlapping the reference ORFs (RefORFs) in non-canonical +2 and +3 reading frames. Although recent ribosome profiling and footprinting approaches have suggested the significant use of unconventional translation initiation sites in mammals, direct evidence of large-scale alternative protein expression at the proteome level is still lacking. To determine the contribution of alternative proteins to the human proteome, we generated a database of predicted human AltORFs revealing a new proteome mainly composed of small proteins with a median length of 57 amino acids, compared to 344 amino acids for the reference proteome. We experimentally detected a total of 1,259 alternative proteins by mass spectrometry analyses of human cell lines, tissues and fluids. In plasma and serum, alternative proteins represent up to 55% of the proteome and may be a potential unsuspected new source for biomarkers. We observed constitutive co-expression of RefORFs and AltORFs from endogenous genes and from transfected cDNAs, including tumor suppressor p53, and provide evidence that out-of-frame clones representing AltORFs are mistakenly rejected as false positive in cDNAs screening assays. Functional importance of alternative proteins is strongly supported by significant evolutionary conservation in vertebrates, invertebrates, and yeast. Our results imply that coding of multiple proteins in a single gene by the use of AltORFs may be a common feature in eukaryotes, and confirm that translation of unconventional ORFs generates an as yet unexplored proteome. PMID:23950983
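    The notion of AltORFs in the +2 and +3 reading frames can be illustrated with a naive scan of the three forward frames of a transcript. This is a minimal sketch, not the authors' prediction pipeline: the function names, the ATG-only start rule, and the `min_codons` cutoff are assumptions made for illustration.

```python
# Illustrative sketch only; not the AltORF database pipeline from the abstract.
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(mrna, min_codons=2):
    """Return (start, end) for every ATG...stop ORF in the three forward frames.

    Coordinates are 0-based and end-exclusive; the stop codon is included.
    min_codons counts coding codons (start codon included, stop excluded).
    """
    seq = mrna.upper().replace("U", "T")
    orfs = []
    for offset in range(3):
        start = None
        for i in range(offset, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


def relative_frame(ref_start, orf_start):
    """Frame of an ORF relative to the reference ORF: 1 (same), 2, or 3."""
    return (orf_start - ref_start) % 3 + 1


if __name__ == "__main__":
    transcript = "ATGGCATGCCCTAGATAA"  # toy sequence, hypothetical
    ref_start = 0
    for start, end in find_orfs(transcript):
        print(start, end, "+%d" % relative_frame(ref_start, start))
```

    In the toy transcript, the ORF starting at position 5 overlaps the reference ORF and sits in the +3 frame relative to it.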

  15. Spectrum-to-Spectrum Searching Using a Proteome-wide Spectral Library

    PubMed Central

    Yen, Chia-Yu; Houel, Stephane; Ahn, Natalie G.; Old, William M.

    2011-01-01

    The unambiguous assignment of tandem mass spectra (MS/MS) to peptide sequences remains a key unsolved problem in proteomics. Spectral library search strategies have emerged as a promising alternative for peptide identification, in which MS/MS spectra are directly compared against a reference library of confidently assigned spectra. Two problems relate to library size. First, reference spectral libraries are limited to rediscovery of previously identified peptides and are not applicable to new peptides, because of their incomplete coverage of the human proteome. Second, problems arise when searching a spectral library the size of the entire human proteome. We observed that traditional dot product scoring methods do not scale well with spectral library size, showing reduction in sensitivity when library size is increased. We show that this problem can be addressed by optimizing scoring metrics for spectrum-to-spectrum searches with large spectral libraries. MS/MS spectra for the 1.3 million predicted tryptic peptides in the human proteome are simulated using a kinetic fragmentation model (MassAnalyzer version2.1) to create a proteome-wide simulated spectral library. Searches of the simulated library increase MS/MS assignments by 24% compared with Mascot, when using probabilistic and rank based scoring methods. The proteome-wide coverage of the simulated library leads to 11% increase in unique peptide assignments, compared with parallel searches of a reference spectral library. Further improvement is attained when reference spectra and simulated spectra are combined into a hybrid spectral library, yielding 52% increased MS/MS assignments compared with Mascot searches. Our study demonstrates the advantages of using probabilistic and rank based scores to improve performance of spectrum-to-spectrum search strategies. PMID:21532008
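    For orientation, the conventional dot product score that the abstract reports as scaling poorly with library size can be sketched as a normalized dot product over binned peaks. This is a minimal sketch under stated assumptions: the 1.0 m/z bin width, the peak-list layout, and the function names are illustrative, and the probabilistic and rank-based scores favored by the authors are not reproduced here.

```python
import math


def bin_spectrum(peaks, bin_width=1.0):
    """Sum the intensities of (mz, intensity) peaks into fixed-width m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def normalized_dot_product(query, library, bin_width=1.0):
    """Cosine similarity between two binned spectra, in the range [0, 1]."""
    q = bin_spectrum(query, bin_width)
    lib = bin_spectrum(library, bin_width)
    dot = sum(value * lib.get(key, 0.0) for key, value in q.items())
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    norm_lib = math.sqrt(sum(v * v for v in lib.values()))
    if norm_q == 0.0 or norm_lib == 0.0:
        return 0.0  # an empty spectrum matches nothing
    return dot / (norm_q * norm_lib)


if __name__ == "__main__":
    query = [(100.2, 3.0), (200.7, 4.0)]      # hypothetical peaks
    reference = [(100.4, 3.0), (300.1, 4.0)]  # hypothetical peaks
    print(normalized_dot_product(query, reference))
```

    A library search would compute this score against every candidate spectrum within the precursor mass tolerance and keep the best one, which is where a growing library increases the chance of high-scoring false matches.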

  16. Identification of new intrinsic proteins in Arabidopsis plasma membrane proteome.

    PubMed

    Marmagne, Anne; Rouet, Marie-Aude; Ferro, Myriam; Rolland, Norbert; Alcon, Carine; Joyard, Jacques; Garin, Jérome; Barbier-Brygoo, Hélène; Ephritikhine, Geneviève

    2004-07-01

    Identification and characterization of anion channel genes in plants represent a goal for a better understanding of their central role in cell signaling, osmoregulation, nutrition, and metabolism. Though channel activities have been well characterized in plasma membrane by electrophysiology, the corresponding molecular entities are little documented. Indeed, the hydrophobic protein equipment of plant plasma membrane still remains largely unknown, though several proteomic approaches have been reported. To identify new putative transport systems, we developed a new proteomic strategy based on mass spectrometry analyses of a plasma membrane fraction enriched in hydrophobic proteins. We produced from Arabidopsis cell suspensions a highly purified plasma membrane fraction and characterized it in detail by immunological and enzymatic tests. Using complementary methods for the extraction of hydrophobic proteins and mass spectrometry analyses on mono-dimensional gels, about 100 proteins have been identified, 95% of which had never been found in previous proteomic studies. The inventory of the plasma membrane proteome generated by this approach contains numerous plasma membrane integral proteins, one-third displaying at least four transmembrane segments. The plasma membrane localization was confirmed for several proteins, therefore validating such proteomic strategy. An in silico analysis shows a correlation between the putative functions of the identified proteins and the expected roles for plasma membrane in transport, signaling, cellular traffic, and metabolism. This analysis also reveals 10 proteins that display structural properties compatible with transport functions and will constitute interesting targets for further functional studies.

  17. Automated Interpretation of Subcellular Patterns in Fluorescence Microscope Images for Location Proteomics

    PubMed Central

    Chen, Xiang; Velliste, Meel; Murphy, Robert F.

    2010-01-01

    Proteomics, the large scale identification and characterization of many or all proteins expressed in a given cell type, has become a major area of biological research. In addition to information on protein sequence, structure and expression levels, knowledge of a protein’s subcellular location is essential to a complete understanding of its functions. Currently subcellular location patterns are routinely determined by visual inspection of fluorescence microscope images. We review here research aimed at creating systems for automated, systematic determination of location. These employ numerical feature extraction from images, feature reduction to identify the most useful features, and various supervised learning (classification) and unsupervised learning (clustering) methods. These methods have been shown to perform significantly better than human interpretation of the same images. When coupled with technologies for tagging large numbers of proteins and high-throughput microscope systems, the computational methods reviewed here enable the new subfield of location proteomics. This subfield will make critical contributions in two related areas. First, it will provide structured, high-resolution information on location to enable Systems Biology efforts to simulate cell behavior from the gene level on up. Second, it will provide tools for Cytomics projects aimed at characterizing the behaviors of all cell types before, during and after the onset of various diseases. PMID:16752421

  18. Melanie II--a third-generation software package for analysis of two-dimensional electrophoresis images: I. Features and user interface.

    PubMed

    Appel, R D; Palagi, P M; Walther, D; Vargas, J R; Sanchez, J C; Ravier, F; Pasquali, C; Hochstrasser, D F

    1997-12-01

    Although two-dimensional electrophoresis (2-DE) computer analysis software packages have existed ever since 2-DE technology was developed, it is only now that the hardware and software technology allows large-scale studies to be performed on low-cost personal computers or workstations, and that setting up a 2-DE computer analysis system in a small laboratory is no longer considered a luxury. After a first attempt in the seventies and early eighties to develop 2-DE analysis software systems on hardware that had poor or even no graphical capabilities, followed in the late eighties by a wave of innovative software developments that were possible thanks to new graphical interface standards such as XWindows, a third generation of 2-DE analysis software packages has now come to maturity. It can be run on a variety of low-cost, general-purpose personal computers, thus making the purchase of a 2-DE analysis system easily attainable for even the smallest laboratory that is involved in proteome research. Melanie II 2-D PAGE, developed at the University Hospital of Geneva, is such a third-generation software system for 2-DE analysis. Based on unique image processing algorithms, this user-friendly object-oriented software package runs on multiple platforms, including Unix, MS-Windows 95 and NT, and Power Macintosh. It provides efficient spot detection and quantitation, state-of-the-art image comparison, statistical data analysis facilities, and is Internet-ready. Linked to proteome databases such as those available on the World Wide Web, it represents a valuable tool for the "Virtual Lab" of the post-genome area.

  19. Proteinortho: Detection of (Co-)orthologs in large-scale analysis

    PubMed Central

    2011-01-01

    Background: Orthology analysis is an important part of data analysis in many areas of bioinformatics such as comparative genomics and molecular phylogenetics. The ever-increasing flood of sequence data, and hence the rapidly increasing number of genomes that can be compared simultaneously, calls for efficient software tools as brute-force approaches with quadratic memory requirements become infeasible in practise. The rapid pace at which new data become available, furthermore, makes it desirable to compute genome-wide orthology relations for a given dataset rather than relying on relations listed in databases. Results: The program Proteinortho described here is a stand-alone tool that is geared towards large datasets and makes use of distributed computing techniques when run on multi-core hardware. It implements an extended version of the reciprocal best alignment heuristic. We apply Proteinortho to compute orthologous proteins in the complete set of all 717 eubacterial genomes available at NCBI at the beginning of 2009. We identified thirty proteins present in 99% of all bacterial proteomes. Conclusions: Proteinortho significantly reduces the required amount of memory for orthology analysis compared to existing tools, allowing such computations to be performed on off-the-shelf hardware. PMID:21526987
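    The plain reciprocal best alignment heuristic that Proteinortho extends can be sketched in a few lines. This is a minimal sketch of the basic heuristic only, not Proteinortho's extended version; the score dictionaries, protein identifiers, and function names are hypothetical.

```python
def best_hits(scores):
    """Map each query protein to its highest-scoring subject.

    scores: {(query, subject): alignment_score}
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


if __name__ == "__main__":
    # Hypothetical alignment scores between proteomes A and B.
    a_vs_b = {("a1", "b1"): 200, ("a1", "b2"): 50,
              ("a2", "b2"): 120, ("a3", "b1"): 90}
    b_vs_a = {("b1", "a1"): 210, ("b2", "a2"): 110, ("b2", "a1"): 40}
    print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

    Protein a3 is dropped in this toy example because its best hit b1 prefers a1; a strict one-to-one rule of this kind is what misses co-orthologs.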

  20. PANDORA: keyword-based analysis of protein sets by integration of annotation sources.

    PubMed

    Kaplan, Noam; Vaaknin, Avishay; Linial, Michal

    2003-10-01

    Recent advances in high-throughput methods and the application of computational tools for automatic classification of proteins have made it possible to carry out large-scale proteomic analyses. Biological analysis and interpretation of sets of proteins is a time-consuming undertaking carried out manually by experts. We have developed PANDORA (Protein ANnotation Diagram ORiented Analysis), a web-based tool that provides an automatic representation of the biological knowledge associated with any set of proteins. PANDORA uses a unique approach of keyword-based graphical analysis that focuses on detecting subsets of proteins that share unique biological properties and the intersections of such sets. PANDORA currently supports SwissProt keywords, NCBI Taxonomy, InterPro entries and the hierarchical classification terms from ENZYME, SCOP and GO databases. The integrated study of several annotation sources simultaneously allows a representation of biological relations of structure, function, cellular location, taxonomy, domains and motifs. PANDORA is also integrated into the ProtoNet system, thus allowing testing thousands of automatically generated clusters. We illustrate how PANDORA enhances the biological understanding of large, non-uniform sets of proteins originating from experimental and computational sources, without the need for prior biological knowledge on individual proteins.
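    The idea of detecting subsets of proteins that share keywords, and the intersections of such subsets, can be sketched with ordinary set operations. This is a minimal sketch and not PANDORA's implementation; the annotation dictionary, the `min_size` cutoff, and the function names are assumptions for illustration.

```python
from itertools import combinations


def keyword_sets(annotations):
    """Invert {protein: keywords} into {keyword: set of proteins}."""
    by_keyword = {}
    for protein, keywords in annotations.items():
        for keyword in keywords:
            by_keyword.setdefault(keyword, set()).add(protein)
    return by_keyword


def shared_subsets(annotations, min_size=2):
    """Return {(kw1, kw2): proteins} for keyword pairs sharing >= min_size proteins."""
    by_keyword = keyword_sets(annotations)
    result = {}
    for kw1, kw2 in combinations(sorted(by_keyword), 2):
        common = by_keyword[kw1] & by_keyword[kw2]
        if len(common) >= min_size:
            result[(kw1, kw2)] = common
    return result


if __name__ == "__main__":
    # Hypothetical protein-to-keyword annotations.
    annotations = {
        "P1": {"kinase", "membrane"},
        "P2": {"kinase", "membrane"},
        "P3": {"kinase"},
        "P4": {"membrane", "transport"},
    }
    print(shared_subsets(annotations))
```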

  1. Proteomic Cinderella: Customized analysis of bulky MS/MS data in one night.

    PubMed

    Kiseleva, Olga; Poverennaya, Ekaterina; Shargunov, Alexander; Lisitsa, Andrey

    2018-02-01

    Proteomic challenges, stirred up by the advent of high-throughput technologies, produce large amounts of MS data. Nowadays, the routine manual search does not satisfy the "speed" of modern science any longer. In our work, the necessity of single-thread analysis of bulky data emerged during interpretation of HepG2 proteome profiling results for proteoforms searching. We compared the contribution of each of the eight search engines (X!Tandem, MS-GF+, MS Amanda, MyriMatch, Comet, Tide, Andromeda, and OMSSA) integrated in an open-source graphical user interface SearchGUI ( http://searchgui.googlecode.com ) into total result of proteoforms identification and optimized set of engines working simultaneously. We also compared the results of our search combination with Mascot results using protein kit UPS2, containing 48 human proteins. We selected combination of X!Tandem, MS-GF+ and OMSSA as the most time-efficient and productive combination of search. We added homemade java-script to automatize pipeline from file picking to report generation. These settings resulted in rise of the efficiency of our customized pipeline unobtainable by manual scouting: the analysis of 192 files searched against human proteome (42153 entries) downloaded from UniProt took 11 h.

  2. In silico Pathway Activation Network Decomposition Analysis (iPANDA) as a method for biomarker development.

    PubMed

    Ozerov, Ivan V; Lezhnina, Ksenia V; Izumchenko, Evgeny; Artemov, Artem V; Medintsev, Sergey; Vanhaelen, Quentin; Aliper, Alexander; Vijg, Jan; Osipov, Andreyan N; Labat, Ivan; West, Michael D; Buzdin, Anton; Cantor, Charles R; Nikolsky, Yuri; Borisov, Nikolay; Irincheeva, Irina; Khokhlovich, Edward; Sidransky, David; Camargo, Miguel Luiz; Zhavoronkov, Alex

    2016-11-16

    Signalling pathway activation analysis is a powerful approach for extracting biologically relevant features from large-scale transcriptomic and proteomic data. However, modern pathway-based methods often fail to provide stable pathway signatures of a specific phenotype or reliable disease biomarkers. In the present study, we introduce the in silico Pathway Activation Network Decomposition Analysis (iPANDA) as a scalable robust method for biomarker identification using gene expression data. The iPANDA method combines precalculated gene coexpression data with gene importance factors based on the degree of differential gene expression and pathway topology decomposition for obtaining pathway activation scores. Using Microarray Analysis Quality Control (MAQC) data sets and pretreatment data on Taxol-based neoadjuvant breast cancer therapy from multiple sources, we demonstrate that iPANDA provides significant noise reduction in transcriptomic data and identifies highly robust sets of biologically relevant pathway signatures. We successfully apply iPANDA for stratifying breast cancer patients according to their sensitivity to neoadjuvant therapy.

  3. In silico Pathway Activation Network Decomposition Analysis (iPANDA) as a method for biomarker development

    PubMed Central

    Ozerov, Ivan V.; Lezhnina, Ksenia V.; Izumchenko, Evgeny; Artemov, Artem V.; Medintsev, Sergey; Vanhaelen, Quentin; Aliper, Alexander; Vijg, Jan; Osipov, Andreyan N.; Labat, Ivan; West, Michael D.; Buzdin, Anton; Cantor, Charles R.; Nikolsky, Yuri; Borisov, Nikolay; Irincheeva, Irina; Khokhlovich, Edward; Sidransky, David; Camargo, Miguel Luiz; Zhavoronkov, Alex

    2016-01-01

    Signalling pathway activation analysis is a powerful approach for extracting biologically relevant features from large-scale transcriptomic and proteomic data. However, modern pathway-based methods often fail to provide stable pathway signatures of a specific phenotype or reliable disease biomarkers. In the present study, we introduce the in silico Pathway Activation Network Decomposition Analysis (iPANDA) as a scalable robust method for biomarker identification using gene expression data. The iPANDA method combines precalculated gene coexpression data with gene importance factors based on the degree of differential gene expression and pathway topology decomposition for obtaining pathway activation scores. Using Microarray Analysis Quality Control (MAQC) data sets and pretreatment data on Taxol-based neoadjuvant breast cancer therapy from multiple sources, we demonstrate that iPANDA provides significant noise reduction in transcriptomic data and identifies highly robust sets of biologically relevant pathway signatures. We successfully apply iPANDA for stratifying breast cancer patients according to their sensitivity to neoadjuvant therapy. PMID:27848968

  4. Modification of the Creator recombination system for proteomics applications--improved expression by addition of splice sites.

    PubMed

    Colwill, Karen; Wells, Clark D; Elder, Kelly; Goudreault, Marilyn; Hersi, Kadija; Kulkarni, Sarang; Hardy, W Rod; Pawson, Tony; Morin, Gregg B

    2006-03-06

    Recombinational systems have been developed to rapidly shuttle Open Reading Frames (ORFs) into multiple expression vectors in order to analyze the large number of cDNAs available in the post-genomic era. In the Creator system, an ORF introduced into a donor vector can be transferred with Cre recombinase to a library of acceptor vectors optimized for different applications. Usability of the Creator system is impacted by the ability to easily manipulate DNA, the number of acceptor vectors for downstream applications, and the level of protein expression from Creator vectors. To date, we have developed over 20 novel acceptor vectors that employ a variety of promoters and epitope tags commonly employed for proteomics applications and gene function analysis. We also made several enhancements to the donor vectors including addition of different multiple cloning sites to allow shuttling from pre-existing vectors and introduction of the lacZ alpha reporter gene to allow for selection. Importantly, in order to ameliorate any effects on protein expression of the loxP site between a 5' tag and ORF, we introduced a splicing event into our expression vectors. The message produced from the resulting 'Creator Splice' vector undergoes splicing in mammalian systems to remove the loxP site. Upon analysis of our Creator Splice constructs, we discovered that protein expression levels were also significantly increased. The development of new donor and acceptor vectors has increased versatility during the cloning process and made this system compatible with a wider variety of downstream applications. The modifications introduced in our Creator Splice system were designed to remove extraneous sequences due to recombination but also aided in downstream analysis by increasing protein expression levels. As a result, we can now employ epitope tags that are detected less efficiently and reduce our assay scale to allow for higher throughput. The Creator Splice system appears to be an extremely useful tool for proteomics.

  5. Modification of the Creator recombination system for proteomics applications – improved expression by addition of splice sites

    PubMed Central

    Colwill, Karen; Wells, Clark D; Elder, Kelly; Goudreault, Marilyn; Hersi, Kadija; Kulkarni, Sarang; Hardy, W Rod; Pawson, Tony; Morin, Gregg B

    2006-01-01

    Background: Recombinational systems have been developed to rapidly shuttle Open Reading Frames (ORFs) into multiple expression vectors in order to analyze the large number of cDNAs available in the post-genomic era. In the Creator system, an ORF introduced into a donor vector can be transferred with Cre recombinase to a library of acceptor vectors optimized for different applications. Usability of the Creator system is impacted by the ability to easily manipulate DNA, the number of acceptor vectors for downstream applications, and the level of protein expression from Creator vectors. Results: To date, we have developed over 20 novel acceptor vectors that employ a variety of promoters and epitope tags commonly employed for proteomics applications and gene function analysis. We also made several enhancements to the donor vectors including addition of different multiple cloning sites to allow shuttling from pre-existing vectors and introduction of the lacZ alpha reporter gene to allow for selection. Importantly, in order to ameliorate any effects on protein expression of the loxP site between a 5' tag and ORF, we introduced a splicing event into our expression vectors. The message produced from the resulting 'Creator Splice' vector undergoes splicing in mammalian systems to remove the loxP site. Upon analysis of our Creator Splice constructs, we discovered that protein expression levels were also significantly increased. Conclusion: The development of new donor and acceptor vectors has increased versatility during the cloning process and made this system compatible with a wider variety of downstream applications. The modifications introduced in our Creator Splice system were designed to remove extraneous sequences due to recombination but also aided in downstream analysis by increasing protein expression levels. As a result, we can now employ epitope tags that are detected less efficiently and reduce our assay scale to allow for higher throughput. The Creator Splice system appears to be an extremely useful tool for proteomics. PMID:16519801

  6. Systems biology definition of the core proteome of metabolism and expression is consistent with high-throughput data.

    PubMed

    Yang, Laurence; Tan, Justin; O'Brien, Edward J; Monk, Jonathan M; Kim, Donghyuk; Li, Howard J; Charusanti, Pep; Ebrahim, Ali; Lloyd, Colton J; Yurkovich, James T; Du, Bin; Dräger, Andreas; Thomas, Alex; Sun, Yuekai; Saunders, Michael A; Palsson, Bernhard O

    2015-08-25

    Finding the minimal set of gene functions needed to sustain life is of both fundamental and practical importance. Minimal gene lists have been proposed by using comparative genomics-based core proteome definitions. A definition of a core proteome that is supported by empirical data, is understood at the systems-level, and provides a basis for computing essential cell functions is lacking. Here, we use a systems biology-based genome-scale model of metabolism and expression to define a functional core proteome consisting of 356 gene products, accounting for 44% of the Escherichia coli proteome by mass based on proteomics data. This systems biology core proteome includes 212 genes not found in previous comparative genomics-based core proteome definitions, accounts for 65% of known essential genes in E. coli, and has 78% gene function overlap with minimal genomes (Buchnera aphidicola and Mycoplasma genitalium). Based on transcriptomics data across environmental and genetic backgrounds, the systems biology core proteome is significantly enriched in nondifferentially expressed genes and depleted in differentially expressed genes. Compared with the noncore, core gene expression levels are also similar across genetic backgrounds (two times higher Spearman rank correlation) and exhibit significantly more complex transcriptional and posttranscriptional regulatory features (40% more transcription start sites per gene, 22% longer 5'UTR). Thus, genome-scale systems biology approaches rigorously identify a functional core proteome needed to support growth. This framework, validated by using high-throughput datasets, facilitates a mechanistic understanding of systems-level core proteome function through in silico models; it de facto defines a paleome.

  7. Proteome-scale human interactomics

    PubMed Central

    Luck, Katja; Sheynkman, Gloria M.; Zhang, Ivy; Vidal, Marc

    2017-01-01

    Cellular functions are mediated by complex interactome networks of physical, biochemical, and functional interactions between DNA sequences, RNA molecules, proteins, lipids, and small metabolites. A thorough understanding of cellular organization requires accurate and relatively complete models of interactome networks at proteome-scale. The recent publication of four human protein-protein interaction (PPI) maps represents a technological breakthrough and an unprecedented resource for the scientific community, heralding a new era of proteome-scale human interactomics. Our knowledge gained from these and complementary studies provides fresh insights into the opportunities and challenges when analyzing systematically generated interactome data, defines a clear roadmap towards the generation of a first reference interactome, and reveals new perspectives on the organization of cellular life. PMID:28284537

  8. Computing Prediction and Functional Analysis of Prokaryotic Propionylation.

    PubMed

    Wang, Li-Na; Shi, Shao-Ping; Wen, Ping-Ping; Zhou, Zhi-You; Qiu, Jian-Ding

    2017-11-27

    Identification and systematic analysis of candidates for protein propionylation are crucial steps for understanding its molecular mechanisms and biological functions. Although several proteome-scale methods have been performed to delineate potential propionylated proteins, the majority of lysine-propionylated substrates and their role in pathological physiology still remain largely unknown. By gathering various databases and literatures, experimental prokaryotic propionylation data were collated to be trained in a support vector machine with various features via a three-step feature selection method. A novel online tool for seeking potential lysine-propionylated sites (PropSeek) ( http://bioinfo.ncu.edu.cn/PropSeek.aspx ) was built. Independent test results of leave-one-out and n-fold cross-validation were similar to each other, showing that PropSeek is a stable and robust predictor with satisfying performance. Meanwhile, analyses of Gene Ontology, Kyoto Encyclopedia of Genes and Genomes pathways, and protein-protein interactions implied a potential role of prokaryotic propionylation in protein synthesis and metabolism.
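    The two validation schemes named in the abstract differ only in how the samples are partitioned: leave-one-out is n-fold cross-validation with as many folds as samples. This is a minimal sketch of the index splitting alone, with no classifier attached; the function names are assumptions, and the folds are contiguous and unshuffled for simplicity, which is not necessarily how PropSeek was evaluated.

```python
def n_fold_splits(n_samples, n_folds):
    """Yield (train_indices, test_indices) for contiguous, near-equal folds."""
    if not 2 <= n_folds <= n_samples:
        raise ValueError("n_folds must be between 2 and n_samples")
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)  # early folds absorb the remainder
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


def leave_one_out_splits(n_samples):
    """Leave-one-out is the special case with one sample per fold."""
    return n_fold_splits(n_samples, n_samples)


if __name__ == "__main__":
    for train, test in n_fold_splits(5, 2):
        print("train", train, "test", test)
```

    In practice the samples would be shuffled (and usually stratified by class) before splitting, so that each fold contains both modified and unmodified sites.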

  9. MannDB: A microbial annotation database for protein characterization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhou, C; Lam, M; Smith, J

    2006-05-19

    MannDB was created to meet a need for rapid, comprehensive automated protein sequence analyses to support selection of proteins suitable as targets for driving the development of reagents for pathogen or protein toxin detection. Because a large number of open-source tools were needed, it was necessary to produce a software system to scale the computations for whole-proteome analysis. Thus, we built a fully automated system for executing software tools and for storage, integration, and display of automated protein sequence analysis and annotation data. MannDB is a relational database that organizes data resulting from fully automated, high-throughput protein-sequence analyses using open-source tools. Types of analyses provided include predictions of cleavage, chemical properties, classification, features, functional assignment, post-translational modifications, motifs, antigenicity, and secondary structure. Proteomes (lists of hypothetical and known proteins) are downloaded and parsed from Genbank and then inserted into MannDB, and annotations from SwissProt are downloaded when identifiers are found in the Genbank entry or when identical sequences are identified. Currently 36 open-source tools are run against MannDB protein sequences either on local systems or by means of batch submission to external servers. In addition, BLAST against protein entries in MvirDB, our database of microbial virulence factors, is performed. A web client browser enables viewing of computational results and downloaded annotations, and a query tool enables structured and free-text search capabilities. When available, links to external databases, including MvirDB, are provided. MannDB contains whole-proteome analyses for at least one representative organism from each category of biological threat organism listed by APHIS, CDC, HHS, NIAID, USDA, USFDA, and WHO. MannDB comprises a large number of genomes and comprehensive protein sequence analyses representing organisms listed as high-priority agents on the websites of several governmental organizations concerned with bio-terrorism. MannDB provides the user with a BLAST interface for comparison of native and non-native sequences and a query tool for conveniently selecting proteins of interest. In addition, the user has access to a web-based browser that compiles comprehensive and extensive reports.

  10. Large-scale collision cross-section profiling on a travelling wave ion mobility mass spectrometer

    PubMed Central

    Lietz, Christopher B.; Yu, Qing; Li, Lingjun

    2014-01-01

    Ion mobility (IM) is a gas-phase electrophoretic method that separates ions according to charge and ion-neutral collision cross-section (CCS). Herein, we attempt to apply a travelling wave (TW) IM polyalanine calibration method to shotgun proteomics and create a large peptide CCS database. Mass spectrometry methods that utilize IM, such as HDMSE, often use high transmission voltages for sensitive analysis. However, polyalanine calibration has only been demonstrated with low voltage transmission used to prevent gas-phase activation. If polyalanine ions change conformation under higher transmission voltages used for HDMSE, the calibration may no longer be valid. Thus, we aimed to characterize the accuracy of calibration and CCS measurement under high transmission voltages on a TW IM instrument using the polyalanine calibration method and found that the additional error was not significant. We also evaluated the potential error introduced by liquid chromatography (LC)-HDMSE analysis, and found it to be insignificant as well, validating the calibration method. Finally, we demonstrated the utility of building a large-population peptide CCS database by investigating the effects of terminal lysine position, via LysC or LysN digestion, on the formation of two structural sub-families formed by triply charged ions. PMID:24845359

  11. Proteomic Profiling of Cranial (Superior) Cervical Ganglia Reveals Beta-Amyloid and Ubiquitin Proteasome System Perturbations in an Equine Multiple System Neuropathy.

    PubMed

    McGorum, Bruce C; Pirie, R Scott; Eaton, Samantha L; Keen, John A; Cumyn, Elizabeth M; Arnott, Danielle M; Chen, Wenzhang; Lamont, Douglas J; Graham, Laura C; Llavero Hurtado, Maica; Pemberton, Alan; Wishart, Thomas M

    2015-11-01

    Equine grass sickness (EGS) is an acute, predominantly fatal, multiple system neuropathy of grazing horses with reported incidence rates of ∼2%. An apparently identical disease occurs in multiple species, including but not limited to cats, dogs, and rabbits. Although the precise etiology remains unclear, ultrastructural findings have suggested that the primary lesion lies in the glycoprotein biosynthetic pathway of specific neuronal populations. The goal of this study was therefore to identify the molecular processes underpinning neurodegeneration in EGS. Here, we use a bottom-up approach beginning with the application of modern proteomic tools to the analysis of cranial (superior) cervical ganglion (CCG, a consistently affected tissue) from EGS-affected patients and appropriate control cases postmortem. In what appears to be the first application of modern proteomic tools to equine neuronal tissues and/or to an inherent neurodegenerative disease of large animals (not a model of human disease), we identified 2,311 proteins in CCG extracts, with 320 proteins increased and 186 decreased by greater than 20% relative to controls. Further examination of selected proteomic candidates by quantitative fluorescent Western blotting (QFWB) and subcellular expression profiling by immunohistochemistry highlighted a previously unreported dysregulation in proteins commonly associated with protein misfolding/aggregation responses seen in a myriad of human neurodegenerative conditions, including but not limited to amyloid precursor protein (APP), microtubule associated protein (Tau), and multiple components of the ubiquitin proteasome system (UPS). Differentially expressed proteins eligible for in silico pathway analysis clustered predominantly into the following biofunctions: (1) diseases and disorders, including neurological disease and skeletal and muscular disorders, and (2) molecular and cellular functions, including cellular assembly and organization, cell-to-cell signaling and interaction (including epinephrine, dopamine, and adrenergic signaling and receptor function), and small molecule biochemistry. Interestingly, while the biofunctions identified in this study may represent pathways underpinning EGS-induced neurodegeneration, this is also the first demonstration of potential molecular conservation (including previously unreported dysregulation of the UPS and APP) spanning the degenerative cascades from an apparently unrelated condition of large animals, to small animal models with altered neuronal vulnerability, and human neurological conditions. Importantly, this study highlights the feasibility and benefits of applying modern proteomic techniques to veterinary investigations of neurodegenerative processes in diseases of large animals. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.

  12. Plasma proteomic analysis reveals altered protein abundances in cardiovascular disease.

    PubMed

    Lygirou, Vasiliki; Latosinska, Agnieszka; Makridakis, Manousos; Mullen, William; Delles, Christian; Schanstra, Joost P; Zoidakis, Jerome; Pieske, Burkert; Mischak, Harald; Vlahou, Antonia

    2018-04-17

    Cardiovascular disease (CVD) describes the pathological conditions of the heart and blood vessels. Despite the large number of studies on CVD and its etiology, its key modulators remain largely unknown. To this end, we performed a comprehensive proteomic analysis of blood plasma, with the scope to identify disease-associated changes after placing them in the context of existing knowledge, and generate a well characterized dataset for further use in CVD multi-omics integrative analysis. LC-MS/MS was employed to analyze plasma from 32 subjects (19 cases of various CVD phenotypes and 13 controls) in two steps: discovery (13 cases and 8 controls) and test (6 cases and 5 controls) set analysis. Following label-free quantification, the detected proteins were correlated to existing plasma proteomics datasets (plasma proteome database; PPD) and functionally annotated (Cytoscape, Ingenuity Pathway Analysis). Differential expression was defined based on identification confidence (≥ 2 peptides per protein), statistical significance (Mann-Whitney p value ≤ 0.05) and a minimum of twofold change. Peptides detected in at least 50% of samples per group were considered, resulting in a total of 3796 identified proteins (838 proteins based on ≥ 2 peptides). Pathway annotation confirmed the functional relevance of the findings (representation of complement cascade, fibrin clot formation, platelet degranulation, etc.). Correlation of the relative abundance of the proteins identified in the discovery set with their reported concentrations in the PPD was significant, confirming the validity of the quantification method. The discovery set analysis revealed 100 differentially expressed proteins between cases and controls, 39 of which were verified (≥ twofold change) in the test set. 
These included proteins already studied in the context of CVD (such as apolipoprotein B, alpha-2-macroglobulin), as well as novel findings (such as low density lipoprotein receptor related protein 2 [LRP2], protein SZT2) for which a mechanism of action is suggested. This proteomic study provides a comprehensive dataset to be used for integrative and functional studies in the field. The observed protein changes reflect known CVD-related processes (e.g., lipid uptake, inflammation) and also suggest novel hypotheses for further investigation, including a potential pleiotropic role of LRP2 and links of SZT2 to CVD.
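
    The three differential-expression criteria quoted in this abstract (at least 2 peptides per protein, Mann-Whitney p value ≤ 0.05, and a minimum twofold change) can be sketched as a simple filter. This is only an illustration under stated assumptions: the abstract does not say how fold change was computed, so the ratio of group means is assumed here, the p value uses the normal approximation rather than an exact test, and the function names are hypothetical.

```python
import math

def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney p value (normal approximation, midranks, tie-corrected)."""
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                      # size of this group of tied values
        for k in range(i, j):
            ranks[k] = (i + 1 + j) / 2.0   # average of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j
    r1 = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = r1 - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return 1.0
    # Continuity-corrected z score, clamped at zero.
    z = max((abs(u1 - mean_u) - 0.5) / math.sqrt(var_u), 0.0)
    return math.erfc(z / math.sqrt(2))

def is_differential(cases, controls, n_peptides,
                    min_peptides=2, alpha=0.05, min_fold=2.0):
    """Apply the three criteria; fold change as a ratio of means is an assumption."""
    if n_peptides < min_peptides:
        return False
    mean_case = sum(cases) / len(cases)
    mean_ctrl = sum(controls) / len(controls)
    if mean_case <= 0 or mean_ctrl <= 0:
        return False
    ratio = mean_case / mean_ctrl
    fold = max(ratio, 1.0 / ratio)     # symmetric: up- or down-regulation
    return fold >= min_fold and mann_whitney_p(cases, controls) <= alpha
```

    For group sizes as small as those in the test set (6 cases and 5 controls), an exact Mann-Whitney test would normally be preferable to the normal approximation used in this sketch.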

  13. Microgravity-driven remodeling of the proteome reveals insights into molecular mechanisms and signal networks involved in response to the space flight environment.

    PubMed

    Rea, Giuseppina; Cristofaro, Francesco; Pani, Giuseppe; Pascucci, Barbara; Ghuge, Sandip A; Corsetto, Paola Antonia; Imbriani, Marcello; Visai, Livia; Rizzo, Angela M

    2016-03-30

    Space is a hostile environment characterized by high vacuum, extreme temperatures, meteoroids, space debris, ionospheric plasma, microgravity and space radiation, which all represent risks for human health. A deep understanding of the biological consequences of exposure to the space environment is required to design efficient countermeasures to minimize their negative impact on human health. Recently, proteomic approaches have received a significant amount of attention in the effort to further study microgravity-induced physiological changes. In this review, we summarize the current knowledge about the effects of microgravity on microorganisms (in particular Cupriavidus metallidurans CH34, Bacillus cereus and Rhodospirillum rubrum S1H), plants (whole plants, organs, and cell cultures), mammalian cells (endothelial cells, bone cells, chondrocytes, muscle cells, thyroid cancer cells, immune system cells) and animals (invertebrates, vertebrates and mammals). Herein, we describe their proteome's response to microgravity, focusing on proteomic discoveries and their future potential applications in space research. Space experiments and operational flight experience have identified detrimental effects on human health and performance because of exposure to weightlessness, even when currently available countermeasures are implemented. Many experimental tools and methods have been developed to study microgravity-induced physiological changes. Recently, genomic and proteomic approaches have received a significant amount of attention. This review summarizes the recent research studies of the proteome response to microgravity in microorganisms, plants, mammalian cells and animals. Current proteomic tools allow large-scale, high-throughput analyses for the detection, identification, and functional investigation of all proteomes. 
Understanding gene and/or protein expression is the key to unlocking the mechanisms behind microgravity-induced problems and to finding effective countermeasures to spaceflight-induced alterations but also for the study of diseases on earth. Future perspectives are also highlighted. Copyright © 2015 Elsevier B.V. All rights reserved.

  14. Experimental Null Method to Guide the Development of Technical Procedures and to Control False-Positive Discovery in Quantitative Proteomics.

    PubMed

    Shen, Xiaomeng; Hu, Qiang; Li, Jun; Wang, Jianmin; Qu, Jun

    2015-10-02

    Comprehensive and accurate evaluation of data quality and false-positive biomarker discovery is critical to direct the method development/optimization for quantitative proteomics, which nonetheless remains challenging largely due to the high complexity and unique features of proteomic data. Here we describe an experimental null (EN) method to address this need. Because the method experimentally measures the null distribution (either technical or biological replicates) using the same proteomic samples, the same procedures and the same batch as the case-vs-control experiment, it correctly reflects the collective effects of technical variability (e.g., variation/bias in sample preparation, LC-MS analysis, and data processing) and project-specific features (e.g., characteristics of the proteome and biological variation) on the performances of quantitative analysis. To show a proof of concept, we employed the EN method to assess the quantitative accuracy and precision and the ability to quantify subtle ratio changes between groups using different experimental and data-processing approaches and in various cellular and tissue proteomes. It was found that choices of quantitative features, sample size, experimental design, data-processing strategies, and quality of chromatographic separation can profoundly affect quantitative precision and accuracy of label-free quantification. The EN method was also demonstrated as a practical tool to determine the optimal experimental parameters and rational ratio cutoff for reliable protein quantification in specific proteomic experiments, for example, to identify the necessary number of technical/biological replicates per group that affords sufficient power for discovery. 
Furthermore, we assessed the ability of EN method to estimate levels of false-positives in the discovery of altered proteins, using two concocted sample sets mimicking proteomic profiling using technical and biological replicates, respectively, where the true-positives/negatives are known and span a wide concentration range. It was observed that the EN method correctly reflects the null distribution in a proteomic system and accurately measures false altered proteins discovery rate (FADR). In summary, the EN method provides a straightforward, practical, and accurate alternative to statistics-based approaches for the development and evaluation of proteomic experiments and can be universally adapted to various types of quantitative techniques.
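
    The idea of using a measured null distribution to pick a ratio cutoff can be illustrated with a short sketch. The abstract does not give the formula behind the false altered proteins discovery rate (FADR), so the definition below is an assumption: the number of proteins passing the cutoff in the null (replicate-vs-replicate) comparison, scaled to the size of the case-vs-control set, divided by the number passing in the case-vs-control comparison. Function names and inputs are hypothetical.

```python
import math

def count_altered(ratios, cutoff):
    """Count proteins whose fold change (up or down) reaches the cutoff."""
    threshold = math.log2(cutoff)
    return sum(1 for r in ratios if abs(math.log2(r)) >= threshold)

def estimate_fadr(null_ratios, case_ratios, cutoff):
    """Assumed FADR estimate: null hits (rescaled) over case-vs-control hits."""
    discoveries = count_altered(case_ratios, cutoff)
    if discoveries == 0:
        return 0.0
    false_hits = count_altered(null_ratios, cutoff)
    # Rescale in case the two comparisons quantified different numbers of proteins.
    scaled = false_hits * len(case_ratios) / len(null_ratios)
    return min(1.0, scaled / discoveries)

def smallest_acceptable_cutoff(null_ratios, case_ratios, candidates, max_fadr=0.05):
    """Return the lowest candidate cutoff whose estimated FADR is acceptable."""
    for cutoff in sorted(candidates):
        if estimate_fadr(null_ratios, case_ratios, cutoff) <= max_fadr:
            return cutoff
    return None
```

    In this form, a cutoff that leaves no discoveries at all also reports an FADR of zero, so in practice the number of retained proteins should be checked alongside the estimate.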

  15. Application of survival analysis methodology to the quantitative analysis of LC-MS proteomics data.

    PubMed

    Tekwe, Carmen D; Carroll, Raymond J; Dabney, Alan R

    2012-08-01

    Protein abundance in quantitative proteomics is often based on observed spectral features derived from liquid chromatography mass spectrometry (LC-MS) or LC-MS/MS experiments. Peak intensities are largely non-normal in distribution. Furthermore, LC-MS-based proteomics data frequently have large proportions of missing peak intensities due to censoring mechanisms on low-abundance spectral features. Recognizing that the observed peak intensities detected with the LC-MS method are all positive, skewed and often left-censored, we propose using survival methodology to carry out differential expression analysis of proteins. Various standard statistical techniques, including non-parametric tests such as the Kolmogorov-Smirnov and Wilcoxon-Mann-Whitney rank sum tests, and the parametric survival model and accelerated failure time (AFT) model with log-normal, log-logistic and Weibull distributions, were used to detect any differentially expressed proteins. The statistical operating characteristics of each method are explored using both real and simulated datasets. Survival methods generally have greater statistical power than standard differential expression methods when the proportion of missing protein level data is 5% or more. In particular, the AFT models we consider consistently achieve greater statistical power than standard testing procedures, with the discrepancy widening as the proportion of missing values increases. The testing procedures discussed in this article can all be performed using readily available software such as R. The R codes are provided as supplemental materials.
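
    To make the left-censoring idea concrete, the sketch below writes out the log-likelihood of a log-normal AFT model with a single two-group covariate, where a missing intensity contributes the probability of falling below a detection limit instead of a density term. It is an illustration only and not the authors' R code: the record layout, the detection limit, and the crude grid search (standing in for a proper optimizer) are all assumptions.

```python
import math

def norm_logpdf(z):
    return -0.5 * z * z - 0.5 * math.log(2 * math.pi)

def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2))

def censored_lognormal_loglik(records, beta0, beta1, sigma):
    """records: (value, is_censored, group); for censored rows, value is the limit."""
    total = 0.0
    for value, censored, group in records:
        z = (math.log(value) - (beta0 + beta1 * group)) / sigma
        if censored:
            # Left-censored: only know the intensity lies below the limit.
            total += math.log(max(norm_cdf(z), 1e-300))
        else:
            # Observed: log-normal density of the intensity.
            total += norm_logpdf(z) - math.log(sigma) - math.log(value)
    return total

def max_loglik(records, beta1_grid):
    """Crude grid search over beta0 and sigma (assumed grid, step 0.1)."""
    logs = [math.log(v) for v, _, _ in records]
    steps = int(round((max(logs) - min(logs)) / 0.1))
    beta0_grid = [min(logs) + 0.1 * i for i in range(steps + 1)]
    sigma_grid = [i / 10 for i in range(1, 21)]
    return max(censored_lognormal_loglik(records, b0, b1, s)
               for b0 in beta0_grid for b1 in beta1_grid for s in sigma_grid)

def likelihood_ratio_test(records):
    """Compare a model with a group effect to one without (1 degree of freedom)."""
    ll_null = max_loglik(records, [0.0])
    ll_full = max_loglik(records, [i / 10 for i in range(-30, 31)])
    stat = max(0.0, 2 * (ll_full - ll_null))
    p_value = math.erfc(math.sqrt(stat / 2))   # chi-square tail, 1 df
    return stat, p_value
```

    A hypothetical call would pass rows such as `(100.0, False, 0)` for an observed intensity in group 0 and `(50.0, True, 0)` for a value missing below an assumed detection limit of 50.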

  16. Proteomic analyses of host and pathogen responses during bovine mastitis.

    PubMed

    Boehmer, Jamie L

    2011-12-01

    The pursuit of biomarkers for use as clinical screening tools, measures for early detection, disease monitoring, and as a means for assessing therapeutic responses has steadily evolved in human and veterinary medicine over the past two decades. Concurrently, advances in mass spectrometry have markedly expanded proteomic capabilities for biomarker discovery. While initial mass spectrometric biomarker discovery endeavors focused primarily on the detection of modulated proteins in human tissues and fluids, recent efforts have shifted to include proteomic analyses of biological samples from food animal species. Mastitis continues to garner attention in veterinary research due mainly to affiliated financial losses and food safety concerns over antimicrobial use, but also because there are only a limited number of efficacious mastitis treatment options. Accordingly, comparative proteomic analyses of bovine milk have emerged in recent years. Efforts to prevent agricultural-related food-borne illness have likewise fueled an interest in the proteomic evaluation of several prominent strains of bacteria, including common mastitis pathogens. The interest in establishing biomarkers of the host and pathogen responses during bovine mastitis stems largely from the need to better characterize mechanisms of the disease, to identify reliable biomarkers for use as measures of early detection and drug efficacy, and to uncover potentially novel targets for the development of alternative therapeutics. The following review focuses primarily on comparative proteomic analyses conducted on healthy versus mastitic bovine milk. However, a comparison of the host defense proteome of human and bovine milk and the proteomic analysis of common veterinary pathogens are likewise introduced.

  17. A proteomic analysis of the chromoplasts isolated from sweet orange fruits [Citrus sinensis (L.) Osbeck]

    PubMed Central

    Zeng, Yunliu; Pan, Zhiyong; Ding, Yuduan; Zhu, Andan; Cao, Hongbo; Xu, Qiang; Deng, Xiuxin

    2011-01-01

    Here, a comprehensive proteomic analysis of the chromoplasts purified from sweet orange using Nycodenz density gradient centrifugation is reported. A GeLC-MS/MS shotgun approach was used to identify the proteins of pooled chromoplast samples. A total of 493 proteins were identified from purified chromoplasts, of which 418 are putative plastid proteins based on in silico sequence homology and functional analyses. Based on the predicted functions of these identified plastid proteins, a large proportion (∼60%) of the chromoplast proteome of sweet orange is constituted by proteins involved in carbohydrate metabolism, amino acid/protein synthesis, and secondary metabolism. Of note, HDS (hydroxymethylbutenyl 4-diphosphate synthase), PAP (plastid-lipid-associated protein), and psHSPs (plastid small heat shock proteins) involved in the synthesis or storage of carotenoid and stress response are among the most abundant proteins identified. A comparison of chromoplast proteomes between sweet orange and tomato suggested a high level of conservation in a broad range of metabolic pathways. However, the citrus chromoplast was characterized by more extensive carotenoid synthesis, extensive amino acid synthesis without nitrogen assimilation, and evidence for lipid metabolism concerning jasmonic acid synthesis. In conclusion, this study provides an insight into the major metabolic pathways as well as some unique characteristics of the sweet orange chromoplasts at the whole proteome level. PMID:21841170

  18. Identification of IGFBP2 and IGFBP3 As Compensatory Biomarkers for CA19-9 in Early-Stage Pancreatic Cancer Using a Combination of Antibody-Based and LC-MS/MS-Based Proteomics

    PubMed Central

    Yoneyama, Toshihiro; Ohtsuki, Sumio; Honda, Kazufumi; Kobayashi, Makoto; Iwasaki, Motoki; Uchida, Yasuo; Okusaka, Takuji; Nakamori, Shoji; Shimahara, Masashi; Ueno, Takaaki; Tsuchida, Akihiko; Sata, Naohiro; Ioka, Tatsuya; Yasunami, Yohichi; Kosuge, Tomoo; Kaneda, Takashi; Kato, Takao; Yagihara, Kazuhiro; Fujita, Shigeyuki; Huang, Wilber; Yamada, Tesshi; Tachikawa, Masanori; Terasaki, Tetsuya

    2016-01-01

    Pancreatic cancer is one of the most lethal tumors, and reliable detection of early-stage pancreatic cancer and risk diseases for pancreatic cancer is essential to improve the prognosis. As 260 genes were previously reported to be upregulated in invasive ductal adenocarcinoma of pancreas (IDACP) cells, quantification of the corresponding proteins in plasma might be useful for IDACP diagnosis. Therefore, the purpose of the present study was to identify plasma biomarkers for early detection of IDACP by using two proteomics strategies: antibody-based proteomics and liquid chromatography-tandem mass spectrometry (LC-MS/MS)-based proteomics. Among the 260 genes, we focused on 130 encoded proteins with known function for which antibodies were available. Twenty-three proteins showed values of the area under the curve (AUC) of more than 0.8 in receiver operating characteristic (ROC) analysis of reverse-phase protein array (RPPA) data of IDACP patients compared with healthy controls, and these proteins were selected as biomarker candidates. We then used our high-throughput selected reaction monitoring or multiple reaction monitoring (SRM/MRM) methodology, together with an automated sample preparation system, micro LC and auto analysis system, to quantify these candidate proteins in plasma from healthy controls and IDACP patients on a large scale. The results revealed that insulin-like growth factor-binding protein (IGFBP)2 and IGFBP3 have the ability to discriminate IDACP patients at an early stage from healthy controls, and IGFBP2 appeared to be increased in risk diseases of pancreatic malignancy, such as intraductal papillary mucinous neoplasms (IPMNs). Furthermore, diagnosis of IDACP using the combination of carbohydrate antigen 19–9 (CA19-9), IGFBP2 and IGFBP3 is significantly more effective than CA19-9 alone. This suggests that IGFBP2 and IGFBP3 may serve as compensatory biomarkers for CA19-9. 
Early diagnosis with this marker combination may improve the prognosis of IDACP patients. PMID:27579675
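
    The candidate-selection step described above (keeping proteins whose ROC AUC exceeds 0.8 when patients are compared with healthy controls) can be sketched with the rank-based definition of AUC: the fraction of patient/control pairs in which the patient value is higher, counting ties as one half. The protein names and values in any call are placeholders, and the sketch assumes that higher values indicate disease, which the abstract does not state for every candidate.

```python
def roc_auc(patient_values, control_values):
    """AUC as the probability that a patient value exceeds a control value."""
    wins = 0.0
    for p in patient_values:
        for c in control_values:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patient_values) * len(control_values))

def select_candidates(measurements, threshold=0.8):
    """measurements: {protein: (patient_values, control_values)}; keep AUC > threshold."""
    return [name for name, (patients, controls) in measurements.items()
            if roc_auc(patients, controls) > threshold]
```

    For a protein that decreases in patients, the AUC computed this way falls below 0.5, so a real screen would either flip the direction or use `max(auc, 1 - auc)`.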

  19. Functional proteomic analysis reveals the involvement of KIAA1199 in breast cancer growth, motility and invasiveness

    PubMed Central

    2014-01-01

    Background KIAA1199 is a recently identified novel gene that is up-regulated in human cancer with poor survival. Our proteomic study on signaling polarity in chemotactic cells revealed KIAA1199 as a novel protein target that may be involved in cellular chemotaxis and motility. In the present study, we examined the functional significance of KIAA1199 expression in breast cancer growth, motility and invasiveness. Methods We validated the previous microarray observation by tissue microarray immunohistochemistry using a TMA slide containing 12 breast tumor tissue cores and 12 corresponding normal tissues. We performed the shRNA-mediated knockdown of KIAA1199 in MDA-MB-231 and HS578T cells to study the role of this protein in cell proliferation, migration and apoptosis in vitro. We studied the effects of KIAA1199 knockdown in vivo in two groups of mice (n = 5). We carried out the SILAC LC-MS/MS based proteomic studies on the involvement of KIAA1199 in breast cancer. Results KIAA1199 mRNA and protein were significantly overexpressed in breast tumor specimens and cell lines as compared with non-neoplastic breast tissues from large-scale microarray and studies of breast cancer cell lines and tumors. To gain deeper insights into the novel role of KIAA1199 in breast cancer, we modulated KIAA1199 expression using shRNA-mediated knockdown in two breast cancer cell lines (MDA-MB-231 and HS578T), expressing higher levels of KIAA1199. The KIAA1199 knockdown cells showed reduced motility and cell proliferation in vitro. Moreover, when the knockdown cells were injected into the mammary fat pads of female athymic nude mice, there was a significant decrease in tumor incidence and growth. In addition, quantitative proteomic analysis revealed that knockdown of KIAA1199 in breast cancer (MDA-MB-231) cells affected a broad range of cellular functions including apoptosis, metabolism and cell motility. 
Conclusions Our findings indicate that KIAA1199 may play an important role in breast tumor growth and invasiveness, and that it may represent a novel target for biomarker development and a novel therapeutic target for breast cancer. PMID:24628760

  20. Neural Stem Cells (NSCs) and Proteomics

    PubMed Central

    Shoemaker, Lorelei D.; Kornblum, Harley I.

    2016-01-01

    Neural stem cells (NSCs) can self-renew and give rise to the major cell types of the CNS. Studies of NSCs include the investigation of primary, CNS-derived cells as well as animal and human embryonic stem cell (ESC)-derived and induced pluripotent stem cell (iPSC)-derived sources. NSCs provide a means with which to study normal neural development, neurodegeneration, and neurological disease and are clinically relevant sources for cellular repair to the damaged and diseased CNS. Proteomics studies of NSCs have the potential to delineate molecules and pathways critical for NSC biology and the means by which NSCs can participate in neural repair. In this review, we provide a background to NSC biology, including the means to obtain them and the caveats to these processes. We then focus on advances in the proteomic interrogation of NSCs. This includes the analysis of posttranslational modifications (PTMs); approaches to analyzing different proteomic compartments, such as the secretome; as well as approaches to analyzing temporal differences in the proteome to elucidate mechanisms of differentiation. We also discuss some of the methods that will undoubtedly be useful in the investigation of NSCs but which have not yet been applied to the field. While many proteomics studies of NSCs have largely catalogued the proteome or posttranslational modifications of specific cellular states, without delving into specific functions, some have led to understandings of functional processes or identified markers that could not have been identified via other means. Many challenges remain in the field, including the precise identification and standardization of NSCs used for proteomic analyses, as well as how to translate fundamental proteomics studies to functional biology. The next level of investigation will require interdisciplinary approaches, combining the skills of those interested in the biochemistry of proteomics with those interested in modulating NSC function. PMID:26494823

  1. An Analysis of the Sensitivity of Proteogenomic Mapping of Somatic Mutations and Novel Splicing Events in Cancer.

    PubMed

    Ruggles, Kelly V; Tang, Zuojian; Wang, Xuya; Grover, Himanshu; Askenazi, Manor; Teubl, Jennifer; Cao, Song; McLellan, Michael D; Clauser, Karl R; Tabb, David L; Mertins, Philipp; Slebos, Robbert; Erdmann-Gilmore, Petra; Li, Shunqiang; Gunawardena, Harsha P; Xie, Ling; Liu, Tao; Zhou, Jian-Ying; Sun, Shisheng; Hoadley, Katherine A; Perou, Charles M; Chen, Xian; Davies, Sherri R; Maher, Christopher A; Kinsinger, Christopher R; Rodland, Karen D; Zhang, Hui; Zhang, Zhen; Ding, Li; Townsend, R Reid; Rodriguez, Henry; Chan, Daniel; Smith, Richard D; Liebler, Daniel C; Carr, Steven A; Payne, Samuel; Ellis, Matthew J; Fenyő, David

    2016-03-01

    Improvements in mass spectrometry (MS)-based peptide sequencing provide a new opportunity to determine whether polymorphisms, mutations, and splice variants identified in cancer cells are translated. Herein, we apply a proteogenomic data integration tool (QUILTS) to illustrate protein variant discovery using whole genome, whole transcriptome, and global proteome datasets generated from a pair of luminal and basal-like breast-cancer-patient-derived xenografts (PDX). The sensitivity of proteogenomic analysis for single nucleotide variant (SNV) expression and novel splice junction (NSJ) detection was probed using multiple MS/MS sample process replicates defined here as an independent tandem MS experiment using identical sample material. Despite analysis of over 30 sample process replicates, only about 10% of SNVs (somatic and germline) detected by both DNA and RNA sequencing were observed as peptides. An even smaller proportion of peptides corresponding to NSJ observed by RNA sequencing were detected (<0.1%). Peptides mapping to DNA-detected SNVs without a detectable mRNA transcript were also observed, suggesting that transcriptome coverage was incomplete (∼80%). In contrast to germline variants, somatic variants were less likely to be detected at the peptide level in the basal-like tumor than in the luminal tumor, raising the possibility of differential translation or protein degradation effects. In conclusion, this large-scale proteogenomic integration allowed us to determine the degree to which mutations are translated and identify gaps in sequence coverage, thereby benchmarking current technology and progress toward whole cancer proteome and transcriptome analysis. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.

  2. Whole-proteome phylogeny of large dsDNA viruses and parvoviruses through a composition vector method related to dynamical language model

    PubMed Central

    2010-01-01

    Background The vast sequence divergence among different virus groups has presented a great challenge to alignment-based analysis of virus phylogeny. Due to the problems caused by the uncertainty in alignment, existing tools for phylogenetic analysis based on multiple alignment could not be directly applied to the whole-genome comparison and phylogenomic studies of viruses. There has been a growing interest in alignment-free methods for phylogenetic analysis using complete genome data. Among the alignment-free methods, a dynamical language (DL) method proposed by our group has successfully been applied to the phylogenetic analysis of bacteria and chloroplast genomes. Results In this paper, the DL method is used to analyze the whole-proteome phylogeny of 124 large dsDNA viruses and 30 parvoviruses, two data sets with large difference in genome size. The trees from our analyses are in good agreement to the latest classification of large dsDNA viruses and parvoviruses by the International Committee on Taxonomy of Viruses (ICTV). Conclusions The present method provides a new way for recovering the phylogeny of large dsDNA viruses and parvoviruses, and also some insights on the affiliation of a number of unclassified viruses. In comparison, some alignment-free methods such as the CV Tree method can be used for recovering the phylogeny of large dsDNA viruses, but they are not suitable for resolving the phylogeny of parvoviruses with a much smaller genome size. PMID:20565983
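
    As a generic illustration of alignment-free comparison, the sketch below builds k-mer frequency vectors from whole proteomes and compares them with a cosine-based distance. It is not the dynamical language (DL) method or the CV Tree method named in the abstract, and it omits any background-correction step such methods may apply. The choice of k, the distance, and the function names are assumptions made for the example.

```python
import math
from collections import Counter

def composition_vector(proteome, k=3):
    """Relative frequencies of overlapping k-mers across all protein sequences."""
    counts = Counter()
    for sequence in proteome:
        for i in range(len(sequence) - k + 1):
            counts[sequence[i:i + k]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}

def cosine_distance(u, v):
    """1 - cosine similarity of two sparse vectors; 1.0 if either is empty."""
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    dot = sum(x * v.get(kmer, 0.0) for kmer, x in u.items())
    return 1.0 - dot / (norm_u * norm_v)

def distance_matrix(proteomes, k=3):
    """proteomes: {name: [protein sequences]}; returns {(a, b): distance}."""
    vectors = {name: composition_vector(seqs, k) for name, seqs in proteomes.items()}
    names = sorted(vectors)
    return {(a, b): cosine_distance(vectors[a], vectors[b])
            for a in names for b in names}
```

    The resulting pairwise distances could then be passed to a standard tree-building procedure such as neighbor joining, which is outside the scope of this sketch.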

  3. Proteomic Screening of Antigenic Proteins from the Hard Tick, Haemaphysalis longicornis (Acari: Ixodidae)

    PubMed Central

    Kim, Young-Ha; Islam, Mohammad Saiful; You, Myung-Jo

    2015-01-01

    Proteomic tools allow large-scale, high-throughput analyses for the detection, identification, and functional investigation of proteome. For detection of antigens from Haemaphysalis longicornis, 1-dimensional electrophoresis (1-DE) quantitative immunoblotting technique combined with 2-dimensional electrophoresis (2-DE) immunoblotting was used for whole body proteins from unfed and partially fed female ticks. Reactivity bands and 2-DE immunoblotting were performed following 2-DE electrophoresis to identify protein spots. The proteome of the partially fed female had a larger number of lower molecular weight proteins than that of the unfed female tick. The total number of detected spots was 818 for unfed and 670 for partially fed female ticks. The 2-DE immunoblotting identified 10 antigenic spots from unfed females and 8 antigenic spots from partially fed females. Matrix Assisted Laser Desorption Ionization-Time of Flight Mass Spectrometry (MALDI-TOF) of relevant spots identified calreticulin, putative secreted WC salivary protein, and a conserved hypothetical protein from the National Center for Biotechnology Information and Swiss Prot protein sequence databases. These findings indicate that most of the whole body components of these ticks are non-immunogenic. The data reported here will provide guidance in the identification of antigenic proteins to prevent infestation and diseases transmitted by H. longicornis. PMID:25748713

  4. Identification of methyllysine peptides binding to chromobox protein homolog 6 chromodomain in the human proteome.

    PubMed

    Li, Nan; Stein, Richard S L; He, Wei; Komives, Elizabeth; Wang, Wei

    2013-10-01

    Methylation is one of the important post-translational modifications that play critical roles in regulating protein functions. Proteomic identification of this post-translational modification and understanding how it affects protein activity remain great challenges. We tackled this problem from the aspect of methylation mediating protein-protein interaction. Using the chromodomain of human chromobox protein homolog 6 as a model system, we developed a systematic approach that integrates structure modeling, bioinformatics analysis, and peptide microarray experiments to identify lysine residues that are methylated and recognized by the chromodomain in the human proteome. Given the important role of chromobox protein homolog 6 as a reader of histone modifications, it was interesting to find that the majority of its interacting partners identified via this approach function in chromatin remodeling and transcriptional regulation. Our study not only illustrates a novel angle for identifying methyllysines on a proteome-wide scale and elucidating their potential roles in regulating protein function, but also suggests possible strategies for engineering the chromodomain-peptide interface to enhance the recognition of and manipulate the signal transduction mediated by such interactions.

  5. Proteomic Challenges: Sample Preparation Techniques for Microgram-Quantity Protein Analysis from Biological Samples

    PubMed Central

    Feist, Peter; Hummon, Amanda B.

    2015-01-01

    Proteins regulate many cellular functions, and analyzing the presence and abundance of proteins in biological samples is a central focus of proteomics. The discovery and validation of biomarkers, pathways, and drug targets for various diseases can be accomplished using mass spectrometry-based proteomics. However, with mass-limited samples like tumor biopsies, it can be challenging to obtain sufficient amounts of proteins to generate high-quality mass spectrometric data. Techniques developed for macroscale quantities recover sufficient amounts of protein from milligram quantities of starting material, but sample losses become crippling with these techniques when only microgram amounts of material are available. To combat this challenge, proteomicists have developed micro-scale techniques that are compatible with decreased sample size (100 μg or lower) and still enable excellent proteome coverage. Extraction, contaminant removal, protein quantitation, and sample handling techniques for the microgram protein range are reviewed here, with an emphasis on liquid chromatography and bottom-up mass spectrometry-compatible techniques. Also, a range of biological specimens, including mammalian tissues and model cell culture systems, are discussed. PMID:25664860

  6. The effect of disease on human cardiac protein expression profiles in paired samples from right and left ventricles

    PubMed Central

    2014-01-01

    Background Cardiac diseases (e.g. coronary and valve) are associated with ventricular cellular remodeling. However, ventricular biopsies from left and right ventricles from patients with different pathologies are rare and thus little is known about disease-induced cellular remodeling in both sides of the heart and between different diseases. We hypothesized that the protein expression profiles between right and left ventricles of patients with aortic valve stenosis (AVS) and patients with coronary artery disease (CAD) are different and that the protein profile is different between the two diseases. Left and right ventricular biopsies were collected from patients with either CAD or AVS. The biopsies were processed for proteomic analysis using isobaric tandem mass tagging and analyzed by reverse phase nano-LC-MS/MS. Western blot for selected proteins showed strong correlation with proteomic analysis. Results Proteomic analysis between ventricles of the same disease (intra-disease) and between ventricles of different diseases (inter-disease) identified more than 500 proteins detected in all relevant ventricular biopsies. Comparison between ventricles and disease state was focused on proteins with relatively high fold (±1.2 fold difference) and significant (P < 0.05) differences. Intra-disease protein expression differences between left and right ventricles were largely structural for AVS patients and largely signaling/metabolism for CAD. Proteins commonly associated with hypertrophy were also different in the AVS group but with lower fold difference. Inter-disease differences between left ventricles of AVS and CAD were detected in 9 proteins. However, inter-disease differences between the right ventricles of CAD and AVS patients were associated with differences in 73 proteins. The majority of proteins which had a significant difference in one ventricle compared to the other pathology also had a similar trend in the adjacent ventricle. Conclusions This work demonstrates for the first time that left and right ventricles have a different proteome and that the difference is dependent on the type of disease. Inter-disease differential expression was more prominent for right ventricles. The finding that a protein change in one ventricle was often associated with a similar trend in the adjacent ventricle for a large number of proteins suggests cross-talk proteome remodeling between adjacent ventricles. PMID:25249829
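
    The selection rule quoted in this abstract (a fold difference of at least 1.2 in either direction, with P < 0.05) can be sketched as below. This is only an illustration under stated assumptions: the abstract does not say how the fold difference was computed, so the symmetric treatment of ratios below 1, the `Comparison` record, and all example values are hypothetical rather than the authors' procedure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comparison:
    """One protein compared between two groups (hypothetical record type)."""
    protein: str
    ratio: float    # abundance ratio, e.g. left / right ventricle; must be > 0
    p_value: float


def symmetric_fold(ratio: float) -> float:
    """Express a ratio as a fold difference >= 1, so 0.8 counts as 1.25-fold."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return ratio if ratio >= 1.0 else 1.0 / ratio


def select_differential(comparisons, min_fold=1.2, alpha=0.05):
    """Keep proteins passing both the fold cutoff and the significance cutoff."""
    return [
        c.protein
        for c in comparisons
        if symmetric_fold(c.ratio) >= min_fold and c.p_value < alpha
    ]


# Made-up example values, not data from the study.
example = [
    Comparison("protein_A", 1.50, 0.010),  # passes both cutoffs
    Comparison("protein_B", 0.80, 0.020),  # 1.25-fold lower, passes
    Comparison("protein_C", 1.10, 0.001),  # significant but fold too small
    Comparison("protein_D", 2.00, 0.200),  # large fold but not significant
]
print(select_differential(example))
```

    Applying both cutoffs together is what keeps small but statistically significant changes, and large but noisy ones, out of the reported list.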

  7. MS Data Miner: a web-based software tool to analyze, compare, and share mass spectrometry protein identifications.

    PubMed

    Dyrlund, Thomas F; Poulsen, Ebbe T; Scavenius, Carsten; Sanggaard, Kristian W; Enghild, Jan J

    2012-09-01

    Data processing and analysis of proteomics data are challenging and time consuming. In this paper, we present MS Data Miner (MDM) (http://sourceforge.net/p/msdataminer), a freely available web-based software solution aimed at minimizing the time required for the analysis, validation, data comparison, and presentation of data files generated in MS software, including Mascot (Matrix Science), Mascot Distiller (Matrix Science), and ProteinPilot (AB Sciex). The program was developed to significantly decrease the time required to process large proteomic data sets for publication. This open-source system includes a spectra validation system and an automatic screenshot generation tool for Mascot-assigned spectra. In addition, a Gene Ontology term analysis function and a tool for generating comparative Excel data reports are included. We illustrate the benefits of MDM during a proteomics study comprising more than 200 LC-MS/MS analyses recorded on an AB Sciex TripleTOF 5600, identifying more than 3000 unique proteins and 3.5 million peptides. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  8. Phylogenetic Tracings of Proteome Size Support the Gradual Accretion of Protein Structural Domains and the Early Origin of Viruses from Primordial Cells

    PubMed Central

    Nasir, Arshan; Kim, Kyung Mo; Caetano-Anollés, Gustavo

    2017-01-01

    Untangling the origin and evolution of viruses remains a challenging proposition. We recently studied the global distribution of protein domain structures in thousands of completely sequenced viral and cellular proteomes with comparative genomics, phylogenomics, and multidimensional scaling methods. A tree of life describing the evolution of proteomes revealed viruses emerging from the base of the tree as a fourth supergroup of life. A tree of domains indicated an early origin of modern viral lineages from ancient cells that co-existed with the cellular ancestors. However, it was recently argued that the rooting of our trees and the basal placement of viruses was artifactually induced by small genome (proteome) size. Here we show that these claims arise from misunderstanding and misinterpretations of cladistic methodology. Trees are reconstructed unrooted, and thus, their topologies cannot be distorted a posteriori by the rooting methodology. Tracing proteome size in trees and multidimensional views of evolutionary relationships as well as tests of leaf stability and exclusion/inclusion of taxa demonstrated that the smallest proteomes were neither attracted toward the root nor caused any topological distortions of the trees. Simulations confirmed that taxa clustering patterns were independent of proteome size and were determined by the presence of known evolutionary relatives in data matrices, highlighting the need for broader taxon sampling in phylogeny reconstruction. Instead, phylogenetic tracings of proteome size revealed a slowdown in innovation of the structural domain vocabulary and four regimes of allometric scaling that reflected a Heaps law. These regimes explained increasing economies of scale in the evolutionary growth and accretion of kernel proteome repertoires of viruses and cellular organisms that resemble growth of human languages with limited vocabulary sizes. Results reconcile dynamic and static views of domain frequency distributions that are consistent with the axiom of spatiotemporal continuity that is a tenet of evolutionary thinking. PMID:28690608
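
    For readers unfamiliar with the scaling law named in this abstract, Heaps' law in its usual general form is written out below. Reading V as the number of distinct structural domains and n as proteome size is an assumption made here for illustration; the abstract gives neither the constants nor the boundaries of the four regimes.

```latex
% Heaps' law: vocabulary size V grows sublinearly with collection size n.
% K and \beta are empirical constants; their values are not given in the abstract.
\[
  V(n) = K\,n^{\beta}, \qquad 0 < \beta < 1
  \quad\Longrightarrow\quad
  \log V(n) = \log K + \beta \log n
\]
```

    On a log-log plot each regime would then appear as a straight segment with slope \beta, and a smaller slope corresponds to a slower rate of vocabulary innovation.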

  9. One Sample, One Shot - Evaluation of sample preparation protocols for the mass spectrometric proteome analysis of human bile fluid without extensive fractionation.

    PubMed

    Megger, Dominik A; Padden, Juliet; Rosowski, Kristin; Uszkoreit, Julian; Bracht, Thilo; Eisenacher, Martin; Gerges, Christian; Neuhaus, Horst; Schumacher, Brigitte; Schlaak, Jörg F; Sitek, Barbara

    2017-02-10

    The proteome analysis of bile fluid represents a promising strategy to identify biomarker candidates for various diseases of the hepatobiliary system. However, to obtain substantive results in biomarker discovery studies, large patient cohorts need to be analyzed. Consequently, this would lead to an unmanageable number of samples to be analyzed if sample preparation protocols with extensive fractionation methods are applied. Hence, the performance of simple workflows allowing for "one sample, one shot" experiments has been evaluated in this study. In detail, sixteen different protocols involving modifications at the stages of desalting, delipidation, deglycosylation and tryptic digestion have been examined. Each method has been individually evaluated regarding various performance criteria and comparative analyses have been conducted to uncover possible complementarities. Here, the best performance in terms of proteome coverage was observed for a combination of acetone precipitation with in-gel digestion. Finally, a mapping of all obtained protein identifications with putative biomarkers for hepatocellular carcinoma (HCC) and cholangiocellular carcinoma (CCC) revealed several proteins easily detectable in bile fluid. These results can build the basis for future studies with large and well-defined patient cohorts in a more disease-related context. Human bile fluid is a proximal body fluid and is thought to be a potential source of disease markers. However, due to its biochemical composition, the proteome analysis of bile fluid still represents a challenging task and is therefore mostly conducted using extensive fractionation procedures. This in turn leads to a high number of mass spectrometric measurements for one biological sample. Considering that a high number of biological samples needs to be analyzed in biomarker discovery studies to overcome biological variability, this leads to the dilemma of an unmanageable number of necessary MS-based analyses. Hence, simple sample preparation protocols that represent a compromise between proteome coverage and simplicity are needed. In the presented study, such protocols have been evaluated regarding various technical criteria (e.g. identification rates, missed cleavages, chromatographic separation), uncovering the strengths and weaknesses of various methods. Furthermore, a cumulative bile proteome list has been generated that extends the current bile proteome catalog by 248 proteins. Finally, a mapping with putative biomarkers for hepatocellular carcinoma (HCC) and cholangiocellular carcinoma (CCC) derived from tissue-based studies revealed that several of these proteins are easily and reproducibly detectable in human bile. Therefore, the presented technical work represents a solid base for future disease-related studies. Copyright © 2016 Elsevier B.V. All rights reserved.

  10. A flexible statistical model for alignment of label-free proteomics data – incorporating ion mobility and product ion information

    PubMed Central

    2013-01-01

    Background The goal of many proteomics experiments is to determine the abundance of proteins in biological samples, and the variation thereof in various physiological conditions. High-throughput quantitative proteomics, specifically label-free LC-MS/MS, allows rapid measurement of thousands of proteins, enabling large-scale studies of various biological systems. Prior to analyzing these information-rich datasets, raw data must undergo several computational processing steps. We present a method to address one of the essential steps in proteomics data processing - the matching of peptide measurements across samples. Results We describe a novel method for label-free proteomics data alignment with the ability to incorporate previously unused aspects of the data, particularly ion mobility drift times and product ion information. We compare the results of our alignment method to PEPPeR and OpenMS, and compare alignment accuracy achieved by different versions of our method utilizing various data characteristics. Our method results in increased match recall rates and similar or improved mismatch rates compared to PEPPeR and OpenMS feature-based alignment. We also show that the inclusion of drift time and product ion information results in higher recall rates and more confident matches, without increases in error rates. Conclusions Based on the results presented here, we argue that the incorporation of ion mobility drift time and product ion information are worthy pursuits. Alignment methods should be flexible enough to utilize all available data, particularly with recent advancements in experimental separation methods. PMID:24341404
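
    To make the matching step concrete, the sketch below pairs peptide features from two runs when they agree within fixed tolerances, and shows how an optional drift-time check can resolve an ambiguous match. It is a deliberately simple greedy matcher, not the statistical model described in this abstract; the `Feature` type, the tolerance values, and the example numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Feature:
    """A peptide feature in one LC-MS run (hypothetical record type)."""
    mz: float     # mass-to-charge ratio
    rt: float     # retention time, minutes
    drift: float  # ion mobility drift time, arbitrary units


def match_features(
    run_a: List[Feature],
    run_b: List[Feature],
    mz_tol: float = 0.01,
    rt_tol: float = 0.5,
    drift_tol: Optional[float] = None,
) -> List[Tuple[int, int]]:
    """Greedily pair each feature in run_a with its closest unused candidate in run_b.

    Candidates must agree in m/z and retention time; if drift_tol is given,
    they must also agree in drift time. Returns (index_a, index_b) pairs.
    """
    matches: List[Tuple[int, int]] = []
    used = set()
    for i, a in enumerate(run_a):
        best_j, best_score = None, None
        for j, b in enumerate(run_b):
            if j in used:
                continue
            if abs(a.mz - b.mz) > mz_tol or abs(a.rt - b.rt) > rt_tol:
                continue
            if drift_tol is not None and abs(a.drift - b.drift) > drift_tol:
                continue
            # Tolerance-normalised distance in m/z and retention time only.
            score = abs(a.mz - b.mz) / mz_tol + abs(a.rt - b.rt) / rt_tol
            if best_score is None or score < best_score:
                best_j, best_score = j, score
        if best_j is not None:
            used.add(best_j)
            matches.append((i, best_j))
    return matches


# Made-up features: two candidates in run_b lie close to the single run_a feature.
run_a = [Feature(mz=500.250, rt=20.0, drift=3.00)]
run_b = [
    Feature(mz=500.252, rt=20.1, drift=5.00),  # closest in m/z and RT, wrong drift
    Feature(mz=500.254, rt=20.2, drift=3.05),  # slightly farther, drift agrees
]
print(match_features(run_a, run_b))                 # m/z and RT only
print(match_features(run_a, run_b, drift_tol=0.2))  # drift time also required
```

    In this toy case the match changes once drift time is required, which is the kind of extra discriminating information the abstract argues alignment methods should be able to use.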

  11. A flexible statistical model for alignment of label-free proteomics data--incorporating ion mobility and product ion information.

    PubMed

    Benjamin, Ashlee M; Thompson, J Will; Soderblom, Erik J; Geromanos, Scott J; Henao, Ricardo; Kraus, Virginia B; Moseley, M Arthur; Lucas, Joseph E

    2013-12-16

    The goal of many proteomics experiments is to determine the abundance of proteins in biological samples, and the variation thereof in various physiological conditions. High-throughput quantitative proteomics, specifically label-free LC-MS/MS, allows rapid measurement of thousands of proteins, enabling large-scale studies of various biological systems. Prior to analyzing these information-rich datasets, raw data must undergo several computational processing steps. We present a method to address one of the essential steps in proteomics data processing--the matching of peptide measurements across samples. We describe a novel method for label-free proteomics data alignment with the ability to incorporate previously unused aspects of the data, particularly ion mobility drift times and product ion information. We compare the results of our alignment method to PEPPeR and OpenMS, and compare alignment accuracy achieved by different versions of our method utilizing various data characteristics. Our method results in increased match recall rates and similar or improved mismatch rates compared to PEPPeR and OpenMS feature-based alignment. We also show that the inclusion of drift time and product ion information results in higher recall rates and more confident matches, without increases in error rates. Based on the results presented here, we argue that the incorporation of ion mobility drift time and product ion information are worthy pursuits. Alignment methods should be flexible enough to utilize all available data, particularly with recent advancements in experimental separation methods.

  12. Development of an open source laboratory information management system for 2-D gel electrophoresis-based proteomics workflow

    PubMed Central

    Morisawa, Hiraku; Hirota, Mikako; Toda, Tosifusa

    2006-01-01

    Background In the post-genome era, most research scientists working in the field of proteomics are confronted with difficulties in management of large volumes of data, which they are required to keep in formats suitable for subsequent data mining. Therefore, a well-developed open source laboratory information management system (LIMS) should be available for their proteomics research studies. Results We developed an open source LIMS appropriately customized for 2-D gel electrophoresis-based proteomics workflow. The main features of its design are compactness, flexibility and connectivity to public databases. It supports the handling of data imported from mass spectrometry software and 2-D gel image analysis software. The LIMS is equipped with the same input interface for 2-D gel information as a clickable map on public 2DPAGE databases. The LIMS allows researchers to follow their own experimental procedures by reviewing the illustrations of 2-D gel maps and well layouts on the digestion plates and MS sample plates. Conclusion Our new open source LIMS is now available as a basic model for proteome informatics, and is accessible for further improvement. We hope that many research scientists working in the field of proteomics will evaluate our LIMS and suggest ways in which it can be improved. PMID:17018156

  13. Sequential protein extraction as an efficient method for improved proteome coverage in larvae of Atlantic salmon (Salmo salar).

    PubMed

    Nuez-Ortín, Waldo G; Carter, Chris G; Nichols, Peter D; Wilson, Richard

    2016-07-01

    Understanding diet- and environmentally induced physiological changes in fish larvae is a major goal for the aquaculture industry. Proteomic analysis of whole fish larvae comprising multiple tissues offers considerable potential but is challenging due to the very large dynamic range of protein abundance. To extend the coverage of the larval phase of the Atlantic salmon (Salmo salar) proteome, we applied a two-step sequential extraction (SE) method, based on differential protein solubility, using a nondenaturing buffer containing 150 mM NaCl followed by a denaturing buffer containing 7 M urea and 2 M thiourea. Extracts prepared using SE and one-step direct extraction were characterized via label-free shotgun proteomics using nanoLC-MS/MS (LTQ-Orbitrap). SE partitioned the proteins into two fractions of approximately equal amounts, but with very distinct protein composition, leading to identification of ∼40% more proteins than direct extraction. This fractionation strategy enabled the most detailed characterization of the salmon larval proteome to date and provides a platform for greater understanding of physiological changes in whole fish larvae. The MS data are available via the ProteomeXchange Consortium PRIDE partner repository, dataset PXD003366. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. Proteome-Scale Human Interactomics.

    PubMed

    Luck, Katja; Sheynkman, Gloria M; Zhang, Ivy; Vidal, Marc

    2017-05-01

    Cellular functions are mediated by complex interactome networks of physical, biochemical, and functional interactions between DNA sequences, RNA molecules, proteins, lipids, and small metabolites. A thorough understanding of cellular organization requires accurate and relatively complete models of interactome networks at proteome scale. The recent publication of four human protein-protein interaction (PPI) maps represents a technological breakthrough and an unprecedented resource for the scientific community, heralding a new era of proteome-scale human interactomics. Our knowledge gained from these and complementary studies provides fresh insights into the opportunities and challenges when analyzing systematically generated interactome data, defines a clear roadmap towards the generation of a first reference interactome, and reveals new perspectives on the organization of cellular life. Copyright © 2017 Elsevier Ltd. All rights reserved.

  15. Global Proteomics Analysis of the Response to Starvation in C. elegans

    PubMed Central

    Larance, Mark; Pourkarimi, Ehsan; Wang, Bin; Brenes Murillo, Alejandro; Kent, Robert; Lamond, Angus I.; Gartner, Anton

    2015-01-01

    Periodic starvation of animals induces large shifts in metabolism but may also influence many other cellular systems and can lead to adaption to prolonged starvation conditions. To date, there is limited understanding of how starvation affects gene expression, particularly at the protein level. Here, we have used mass-spectrometry-based quantitative proteomics to identify global changes in the Caenorhabditis elegans proteome due to acute starvation of young adult animals. Measuring changes in the abundance of over 5,000 proteins, we show that acute starvation rapidly alters the levels of hundreds of proteins, many involved in central metabolic pathways, highlighting key regulatory responses. Surprisingly, we also detect changes in the abundance of chromatin-associated proteins, including specific linker histones, histone variants, and histone posttranslational modifications associated with the epigenetic control of gene expression. To maximize community access to these data, they are presented in an online searchable database, the Encyclopedia of Proteome Dynamics (http://www.peptracker.com/epd/). PMID:25963834

  16. Interaction Analysis through Proteomic Phage Display

    PubMed Central

    2014-01-01

    Phage display is a powerful technique for profiling specificities of peptide binding domains. The method is suited for the identification of high-affinity ligands with inhibitor potential when using highly diverse combinatorial peptide phage libraries. Such experiments further provide consensus motifs for genome-wide scanning of ligands of potential biological relevance. A complementary but considerably less explored approach is to display expression products of genomic DNA, cDNA, open reading frames (ORFs), or oligonucleotide libraries designed to encode defined regions of a target proteome on phage particles. One of the main applications of such proteomic libraries has been the elucidation of antibody epitopes. This review is focused on the use of proteomic phage display to uncover protein-protein interactions of potential relevance for cellular function. The method is particularly suited for the discovery of interactions between peptide binding domains and their targets. We discuss the largely unexplored potential of this method in the discovery of domain-motif interactions of potential biological relevance. PMID:25295249

  17. Monitoring Peptidase Activities in Complex Proteomes by MALDI-TOF Mass Spectrometry

    PubMed Central

    Villanueva, Josep; Nazarian, Arpi; Lawlor, Kevin; Tempst, Paul

    2009-01-01

    Measuring enzymatic activities in biological fluids is a form of activity-based proteomics and may be utilized as a means of developing disease biomarkers. Activity-based assays allow amplification of output signals, thus potentially visualizing low-abundant enzymes on a virtually transparent whole-proteome background. The protocol presented here describes a semi-quantitative in vitro assay of proteolytic activities in complex proteomes by monitoring breakdown of designer peptide-substrates using robotic extraction and a MALDI-TOF mass spectrometric read-out. Relative quantitation of the peptide metabolites is done by comparison with spiked internal standards, followed by statistical analysis of the resulting mini-peptidome. Partial automation provides reproducibility and throughput essential for comparing large sample sets. The approach may be employed for diagnostic or predictive purposes and enables profiling of 96 samples in 30 hours. It could be tailored to many diagnostic and pharmaco-dynamic purposes, as a read-out of catalytic and metabolic activities in body fluids or tissues. PMID:19617888

  18. Evaluation of Selected Binding Domains for the Analysis of Ubiquitinated Proteomes

    NASA Astrophysics Data System (ADS)

    Nakayasu, Ernesto S.; Ansong, Charles; Brown, Joseph N.; Yang, Feng; Lopez-Ferrer, Daniel; Qian, Wei-Jun; Smith, Richard D.; Adkins, Joshua N.

    2013-08-01

    Ubiquitination is an abundant post-translational modification that consists of covalent attachment of ubiquitin to lysine residues or the N-terminus of proteins. Mono- and polyubiquitination have been shown to be involved in many critical eukaryotic cellular functions and are often disrupted by intracellular bacterial pathogens. Affinity enrichment of ubiquitinated proteins enables global analysis of this key modification. In this context, the use of ubiquitin-binding domains is a promising but relatively unexplored alternative to more broadly used immunoaffinity or tagged affinity enrichment methods. In this study, we evaluated the application of eight ubiquitin-binding domains that have differing affinities for ubiquitination states. Small-scale proteomics analysis identified ~200 ubiquitinated protein candidates per ubiquitin-binding domain pull-down experiment. Results from subsequent Western blot analyses that employed anti-ubiquitin or monoclonal antibodies against polyubiquitination at lysine 48 and 63 suggest that ubiquitin-binding domains from Dsk2 and ubiquilin-1 have the broadest specificity in that they captured most types of ubiquitination, whereas the binding domain from NBR1 was more selective to polyubiquitination. These data demonstrate that with optimized purification conditions, ubiquitin-binding domains can be an alternative tool for proteomic applications. This approach is especially promising for the analysis of tissues or cells resistant to transfection, for which the overexpression of tagged ubiquitin is a major hurdle.

  19. Elucidating Proteoform Families from Proteoform Intact-Mass and Lysine-Count Measurements

    PubMed Central

    2016-01-01

    Proteomics is presently dominated by the “bottom-up” strategy, in which proteins are enzymatically digested into peptides for mass spectrometric identification. Although this approach is highly effective at identifying large numbers of proteins present in complex samples, the digestion into peptides renders it impossible to identify the proteoforms from which they were derived. We present here a powerful new strategy for the identification of proteoforms and the elucidation of proteoform families (groups of related proteoforms) from the experimental determination of the accurate proteoform mass and number of lysine residues contained. Accurate proteoform masses are determined by standard LC–MS analysis of undigested protein mixtures in an Orbitrap mass spectrometer, and the lysine count is determined using the NeuCode isotopic tagging method. We demonstrate the approach in analysis of the yeast proteome, revealing 8637 unique proteoforms and 1178 proteoform families. The elucidation of proteoforms and proteoform families afforded here provides an unprecedented new perspective upon proteome complexity and dynamics. PMID:26941048
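
    The idea of combining an accurate intact mass with a lysine count can be sketched as a two-constraint lookup against a list of candidate proteoforms. The sketch below is an illustration only: the 10 ppm tolerance, the `find_candidates` helper, and the toy database (whose masses are invented and do not correspond to the toy sequences) are assumptions, not the authors' software or parameters.

```python
from typing import Dict, List, Tuple


def ppm_error(observed: float, theoretical: float) -> float:
    """Absolute mass error in parts per million."""
    return abs(observed - theoretical) / theoretical * 1e6


def find_candidates(
    observed_mass: float,
    observed_lysines: int,
    database: Dict[str, Tuple[float, str]],
    ppm_tol: float = 10.0,
) -> List[str]:
    """Return database entries matching both the intact mass and the lysine count.

    database maps a proteoform name to (theoretical mass in Da, amino acid sequence).
    """
    hits = []
    for name, (theoretical_mass, sequence) in database.items():
        mass_ok = ppm_error(observed_mass, theoretical_mass) <= ppm_tol
        lysines_ok = sequence.count("K") == observed_lysines
        if mass_ok and lysines_ok:
            hits.append(name)
    return hits


# Toy database: masses are invented for illustration and are NOT the real
# masses of these short sequences.
toy_db = {
    "proteoform_A": (10000.00, "MKKAK"),  # 3 lysines
    "proteoform_B": (10000.05, "MKAAK"),  # 2 lysines, nearly the same mass
    "proteoform_C": (10500.00, "MKKAK"),  # 3 lysines, different mass
}
print(find_candidates(10000.02, 3, toy_db))
```

    Mass alone cannot separate the first two toy entries at this tolerance; the lysine count is what removes the ambiguity, which is the role the abstract assigns to the NeuCode-derived count.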

  20. Novel Phage Group Infecting Lactobacillus delbrueckii subsp. lactis, as Revealed by Genomic and Proteomic Analysis of Bacteriophage Ldl1

    PubMed Central

    Casey, Eoghan; Mahony, Jennifer; Neve, Horst; Noben, Jean-Paul; Dal Bello, Fabio

    2014-01-01

    Ldl1 is a virulent phage infecting the dairy starter Lactobacillus delbrueckii subsp. lactis LdlS. Electron microscopy analysis revealed that this phage exhibits a large head and a long tail and bears little resemblance to other characterized phages infecting Lactobacillus delbrueckii. In vitro propagation of this phage revealed a latent period of 30 to 40 min and a burst size of 59.9 ± 1.9 phage particles. Comparative genomic and proteomic analyses showed remarkable similarity between the genome of Ldl1 and that of Lactobacillus plantarum phage ATCC 8014-B2. The genomic and proteomic characteristics of Ldl1 demonstrate that this phage does not belong to any of the four previously recognized L. delbrueckii phage groups, necessitating the creation of a new group, called group e, thus adding to the knowledge on the diversity of phages targeting strains of this industrially important lactic acid bacterial species. PMID:25501478

  1. Targeted proteomic assays for quantitation of proteins identified by proteogenomic analysis of ovarian cancer

    DOE PAGES

    Song, Ehwang; Gao, Yuqian; Wu, Chaochao; ...

    2017-07-19

    Here, mass spectrometry (MS) based targeted proteomic methods such as selected reaction monitoring (SRM) are becoming the method of choice for preclinical verification of candidate protein biomarkers. The Clinical Proteomic Tumor Analysis Consortium (CPTAC) of the National Cancer Institute has investigated the standardization and analytical validation of the SRM assays and demonstrated robust analytical performance on different instruments across different laboratories. An Assay Portal has also been established by CPTAC to provide the research community a resource consisting of a large set of targeted MS-based assays, and a depository to share assays publicly, provided that assays meet the guidelines proposed by CPTAC. Herein, we report 98 SRM assays covering 70 candidate protein biomarkers previously reported as associated with ovarian cancer that have been thoroughly characterized according to the CPTAC Assay Characterization Guidance Document. The experiments, methods and results for characterizing these SRM assays for their MS response, repeatability, selectivity, stability, and reproducible detection of endogenous analytes are described in detail.

  2. Sequential Extraction Results in Improved Proteome Profiling of Medicinal Plant Pinellia ternata Tubers, Which Contain Large Amounts of High-Abundance Proteins

    PubMed Central

    An, SuFang; Gong, FangPing; Wang, Wei

    2012-01-01

    Pinellia ternata tuber is one of the well-known Chinese traditional medicines. In order to understand the pharmacological properties of tuber proteins, it is necessary to perform proteome analysis of P. ternata tubers. However, a few high-abundance proteins (HAPs), mainly mannose-binding lectin (agglutinin), exist in aggregates of various sizes in the tubers and seriously interfere with proteome profiling by two-dimensional electrophoresis (2-DE). Therefore, selective depletion of these HAPs is a prerequisite for enhanced proteome analysis of P. ternata tubers. Based on differential protein solubility, we developed a novel protocol involving two sequential extractions for depletion of some HAPs and prefractionation of tuber proteins prior to 2-DE. The first extraction using 10% acetic acid selectively extracted acid-soluble HAPs and the second extraction using the SDS-containing buffer extracted remaining acid-insoluble proteins. After application of the protocol, 2-DE profiles of P. ternata tuber proteins were greatly improved and more protein spots were detected, especially low-abundance proteins. Moreover, the subunit composition of P. ternata lectin was analyzed by electrophoresis. Native lectin consists of two hydrogen-bonded subunits (11 kDa and 25 kDa) and the 11 kDa subunit was a glycoprotein. Subsequently, major HAPs in the tubers were analyzed by mass spectrometry, with nine protein spots being identified as lectin isoforms. The methodology was easy to perform and required no specialized apparatus. It would be useful for proteome analysis of other tuber plants of Araceae. PMID:23185632

  3. Sequential extraction results in improved proteome profiling of medicinal plant Pinellia ternata tubers, which contain large amounts of high-abundance proteins.

    PubMed

    Wu, Xiaolin; Xiong, Erhui; An, Sufang; Gong, Fangping; Wang, Wei

    2012-01-01

    Pinellia ternata tuber is one of the well-known Chinese traditional medicines. In order to understand the pharmacological properties of tuber proteins, it is necessary to perform proteome analysis of P. ternata tubers. However, a few high-abundance proteins (HAPs), mainly mannose-binding lectin (agglutinin), exist in aggregates of various sizes in the tubers and seriously interfere with proteome profiling by two-dimensional electrophoresis (2-DE). Therefore, selective depletion of these HAPs is a prerequisite for enhanced proteome analysis of P. ternata tubers. Based on differential protein solubility, we developed a novel protocol involving two sequential extractions for depletion of some HAPs and prefractionation of tuber proteins prior to 2-DE. The first extraction using 10% acetic acid selectively extracted acid-soluble HAPs and the second extraction using the SDS-containing buffer extracted remaining acid-insoluble proteins. After application of the protocol, 2-DE profiles of P. ternata tuber proteins were greatly improved and more protein spots were detected, especially low-abundance proteins. Moreover, the subunit composition of P. ternata lectin was analyzed by electrophoresis. Native lectin consists of two hydrogen-bonded subunits (11 kDa and 25 kDa) and the 11 kDa subunit was a glycoprotein. Subsequently, major HAPs in the tubers were analyzed by mass spectrometry, with nine protein spots being identified as lectin isoforms. The methodology was easy to perform and required no specialized apparatus. It would be useful for proteome analysis of other tuber plants of Araceae.

  4. Genomics pipelines and data integration: challenges and opportunities in the research setting

    PubMed Central

    Davis-Turak, Jeremy; Courtney, Sean M.; Hazard, E. Starr; Glen, W. Bailey; da Silveira, Willian; Wesselman, Timothy; Harbin, Larry P.; Wolf, Bethany J.; Chung, Dongjun; Hardiman, Gary

    2017-01-01

    Introduction The emergence and mass utilization of high-throughput (HT) technologies, including sequencing technologies (genomics) and mass spectrometry (proteomics, metabolomics, lipids), has allowed geneticists, biologists, and biostatisticians to bridge the gap between genotype and phenotype on a massive scale. These new technologies have brought rapid advances in our understanding of cell biology, evolutionary history, microbial environments, and are increasingly providing new insights and applications towards clinical care and personalized medicine. Areas covered The very success of this industry also translates into daunting big data challenges for researchers and institutions that extend beyond the traditional academic focus of algorithms and tools. The main obstacles revolve around analysis provenance, data management of massive datasets, ease of use of software, interpretability and reproducibility of results. Expert Commentary The authors review the challenges associated with implementing bioinformatics best practices in a large-scale setting, and highlight the opportunity for establishing bioinformatics pipelines that incorporate data tracking and auditing, enabling greater consistency and reproducibility for basic research, translational or clinical settings. PMID:28092471

  5. Genomics pipelines and data integration: challenges and opportunities in the research setting.

    PubMed

    Davis-Turak, Jeremy; Courtney, Sean M; Hazard, E Starr; Glen, W Bailey; da Silveira, Willian A; Wesselman, Timothy; Harbin, Larry P; Wolf, Bethany J; Chung, Dongjun; Hardiman, Gary

    2017-03-01

    The emergence and mass utilization of high-throughput (HT) technologies, including sequencing technologies (genomics) and mass spectrometry (proteomics, metabolomics, lipids), has allowed geneticists, biologists, and biostatisticians to bridge the gap between genotype and phenotype on a massive scale. These new technologies have brought rapid advances in our understanding of cell biology, evolutionary history, microbial environments, and are increasingly providing new insights and applications towards clinical care and personalized medicine. Areas covered: The very success of this industry also translates into daunting big data challenges for researchers and institutions that extend beyond the traditional academic focus of algorithms and tools. The main obstacles revolve around analysis provenance, data management of massive datasets, ease of use of software, interpretability and reproducibility of results. Expert commentary: The authors review the challenges associated with implementing bioinformatics best practices in a large-scale setting, and highlight the opportunity for establishing bioinformatics pipelines that incorporate data tracking and auditing, enabling greater consistency and reproducibility for basic research, translational or clinical settings.

  6. Organellar proteomics reveals hundreds of novel nuclear proteins in the malaria parasite Plasmodium falciparum

    PubMed Central

    2012-01-01

    Background: The post-genomic era of malaria research provided unprecedented insights into the biology of Plasmodium parasites. Due to the large evolutionary distance to model eukaryotes, however, we lack a profound understanding of many processes in Plasmodium biology. One example is the cell nucleus, which controls the parasite genome in a development- and cell cycle-specific manner through mostly unknown mechanisms. To study this important organelle in detail, we conducted an integrative analysis of the P. falciparum nuclear proteome. Results: We combined high accuracy mass spectrometry and bioinformatic approaches to present for the first time an experimentally determined core nuclear proteome for P. falciparum. Besides a large number of factors implicated in known nuclear processes, one-third of all detected proteins carry no functional annotation, including many phylum- or genus-specific factors. Importantly, extensive experimental validation using 30 transgenic cell lines confirmed the high specificity of this inventory, and revealed distinct nuclear localization patterns of hitherto uncharacterized proteins. Further, our detailed analysis identified novel protein domains potentially implicated in gene transcription pathways, and sheds important new light on nuclear compartments and processes including regulatory complexes, the nucleolus, nuclear pores, and nuclear import pathways. Conclusion: Our study provides comprehensive new insight into the biology of the Plasmodium nucleus and will serve as an important platform for dissecting general and parasite-specific nuclear processes in malaria parasites. Moreover, as the first nuclear proteome characterized in any protist organism, it will provide an important resource for studying evolutionary aspects of nuclear biology. PMID:23181666

  7. Fast and Accurate Protein False Discovery Rates on Large-Scale Proteomics Data Sets with Percolator 3.0

    NASA Astrophysics Data System (ADS)

    The, Matthew; MacCoss, Michael J.; Noble, William S.; Käll, Lukas

    2016-11-01

    Percolator is a widely used software tool that increases yield in shotgun proteomics experiments and assigns reliable statistical confidence measures, such as q values and posterior error probabilities, to peptides and peptide-spectrum matches (PSMs) from such experiments. Percolator's processing speed has been sufficient for typical data sets consisting of hundreds of thousands of PSMs. With our new scalable approach, we can now also analyze millions of PSMs in a matter of minutes on a commodity computer. Furthermore, with the increasing awareness for the need for reliable statistics on the protein level, we compared several easy-to-understand protein inference methods and implemented the best-performing method—grouping proteins by their corresponding sets of theoretical peptides and then considering only the best-scoring peptide for each protein—in the Percolator package. We used Percolator 3.0 to analyze the data from a recent study of the draft human proteome containing 25 million spectra (PM:24870542). The source code and Ubuntu, Windows, MacOS, and Fedora binary packages are available from http://percolator.ms/ under an Apache 2.0 license.
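    The protein inference rule described in this abstract (group proteins by their sets of theoretical peptides, then score each group by its best-scoring peptide) can be illustrated with a minimal Python sketch. This is not Percolator's code or API: the function name `group_and_score`, the toy peptide sequences, and the scores are all hypothetical, and the sketch assumes that a higher score is better and that groups with no observed peptide are simply dropped.

```python
from collections import defaultdict


def group_and_score(protein_peptides, peptide_scores):
    """Group proteins with identical theoretical peptide sets and score
    each group by its single best-scoring observed peptide.

    protein_peptides: dict mapping protein id -> iterable of theoretical peptides
    peptide_scores:   dict mapping observed peptide -> score (higher is better)
    """
    # Proteins that cannot be told apart by their peptides share one group.
    groups = defaultdict(list)
    for protein, peptides in protein_peptides.items():
        groups[frozenset(peptides)].append(protein)

    ranked = []
    for peptides, proteins in groups.items():
        observed = [peptide_scores[p] for p in peptides if p in peptide_scores]
        if observed:  # assumption: groups without evidence are not reported
            ranked.append((sorted(proteins), max(observed)))

    # Best-supported groups first.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Hypothetical toy input: P1 and P2 are indistinguishable, P4 is unobserved.
protein_peptides = {
    "P1": ["AAK", "CCR"],
    "P2": ["AAK", "CCR"],
    "P3": ["DDK"],
    "P4": ["EEK"],
}
peptide_scores = {"AAK": 2.5, "CCR": 1.0, "DDK": 3.1}
ranked = group_and_score(protein_peptides, peptide_scores)
```

    In a real analysis the ranked groups would then feed a target-decoy estimate of protein-level false discovery rates; that step is omitted here.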

  8. Fast and Accurate Protein False Discovery Rates on Large-Scale Proteomics Data Sets with Percolator 3.0.

    PubMed

    The, Matthew; MacCoss, Michael J; Noble, William S; Käll, Lukas

    2016-11-01

    Percolator is a widely used software tool that increases yield in shotgun proteomics experiments and assigns reliable statistical confidence measures, such as q values and posterior error probabilities, to peptides and peptide-spectrum matches (PSMs) from such experiments. Percolator's processing speed has been sufficient for typical data sets consisting of hundreds of thousands of PSMs. With our new scalable approach, we can now also analyze millions of PSMs in a matter of minutes on a commodity computer. Furthermore, with the increasing awareness for the need for reliable statistics on the protein level, we compared several easy-to-understand protein inference methods and implemented the best-performing method (grouping proteins by their corresponding sets of theoretical peptides and then considering only the best-scoring peptide for each protein) in the Percolator package. We used Percolator 3.0 to analyze the data from a recent study of the draft human proteome containing 25 million spectra (PM:24870542). The source code and Ubuntu, Windows, MacOS, and Fedora binary packages are available from http://percolator.ms/ under an Apache 2.0 license.

  9. Disclosure of the differences of Mesorhizobium loti under the free-living and symbiotic conditions by comparative proteome analysis without bacteroid isolation.

    PubMed

    Tatsukami, Yohei; Nambu, Mami; Morisaka, Hironobu; Kuroda, Kouichi; Ueda, Mitsuyoshi

    2013-07-31

    Rhizobia are symbiotic nitrogen-fixing soil bacteria that show a symbiotic relationship with their host legume. Rhizobia have 2 different physiological conditions: a free-living condition in soil, and a symbiotic nitrogen-fixing condition in the nodule. The lifestyle of rhizobia remains largely unknown, although genome and transcriptome analyses have been carried out. To clarify the lifestyle of bacteria, proteome analysis is necessary because the protein profile directly reflects in vivo reactions of the organisms. In proteome analysis, high separation performance is required to analyze complex biological samples. Therefore, we used a liquid chromatography-tandem mass spectrometry system, equipped with a long monolithic silica capillary column, which is superior to conventional columns. In this study, we compared the protein profile of Mesorhizobium loti MAFF303099 under free-living condition to that of symbiotic conditions by using small amounts of crude extracts. We identified 1,533 and 847 proteins for M. loti under free-living and symbiotic conditions, respectively. Pathway analysis by Kyoto Encyclopedia of Genes and Genomes (KEGG) revealed that many of the enzymes involved in the central carbon metabolic pathway were commonly detected under both conditions. The proteins encoded in the symbiosis island, the transmissible chromosomal region that includes the genes that are highly upregulated under the symbiotic condition, were uniquely detected under the symbiotic condition. The features of the symbiotic condition that have been reported by transcriptome analysis were confirmed at the protein level by proteome analysis. In addition, the genes of the proteins involved in cell surface structure were repressed under the symbiotic nitrogen-fixing condition. Furthermore, farnesyl pyrophosphate (FPP) was found to be biosynthesized only in rhizobia under the symbiotic condition. 
    The obtained protein profile appeared to reflect the difference in phenotypes under the free-living and symbiotic conditions. In addition, KEGG pathway analysis revealed that the cell surface structure of rhizobia was largely different under each condition, and surprisingly, rhizobia might provide FPP to the host as a source of secondary metabolism. M. loti changed its metabolism and cell surface structure in accordance with the surrounding conditions.

  10. Disclosure of the differences of Mesorhizobium loti under the free-living and symbiotic conditions by comparative proteome analysis without bacteroid isolation

    PubMed Central

    2013-01-01

    Background: Rhizobia are symbiotic nitrogen-fixing soil bacteria that show a symbiotic relationship with their host legume. Rhizobia have 2 different physiological conditions: a free-living condition in soil, and a symbiotic nitrogen-fixing condition in the nodule. The lifestyle of rhizobia remains largely unknown, although genome and transcriptome analyses have been carried out. To clarify the lifestyle of bacteria, proteome analysis is necessary because the protein profile directly reflects in vivo reactions of the organisms. In proteome analysis, high separation performance is required to analyze complex biological samples. Therefore, we used a liquid chromatography-tandem mass spectrometry system, equipped with a long monolithic silica capillary column, which is superior to conventional columns. In this study, we compared the protein profile of Mesorhizobium loti MAFF303099 under free-living condition to that of symbiotic conditions by using small amounts of crude extracts. Result: We identified 1,533 and 847 proteins for M. loti under free-living and symbiotic conditions, respectively. Pathway analysis by Kyoto Encyclopedia of Genes and Genomes (KEGG) revealed that many of the enzymes involved in the central carbon metabolic pathway were commonly detected under both conditions. The proteins encoded in the symbiosis island, the transmissible chromosomal region that includes the genes that are highly upregulated under the symbiotic condition, were uniquely detected under the symbiotic condition. The features of the symbiotic condition that have been reported by transcriptome analysis were confirmed at the protein level by proteome analysis. In addition, the genes of the proteins involved in cell surface structure were repressed under the symbiotic nitrogen-fixing condition. Furthermore, farnesyl pyrophosphate (FPP) was found to be biosynthesized only in rhizobia under the symbiotic condition.
    Conclusion: The obtained protein profile appeared to reflect the difference in phenotypes under the free-living and symbiotic conditions. In addition, KEGG pathway analysis revealed that the cell surface structure of rhizobia was largely different under each condition, and surprisingly, rhizobia might provide FPP to the host as a source of secondary metabolism. M. loti changed its metabolism and cell surface structure in accordance with the surrounding conditions. PMID:23898917

  11. A Statistical Selection Strategy for Normalization Procedures in LC-MS Proteomics Experiments through Dataset Dependent Ranking of Normalization Scaling Factors

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Webb-Robertson, Bobbie-Jo M.; Matzke, Melissa M.; Jacobs, Jon M.

    2011-12-01

    Quantification of LC-MS peak intensities assigned during peptide identification in a typical comparative proteomics experiment will deviate from run to run of the instrument due to both technical and biological variation. Thus, normalization of peak intensities across an LC-MS proteomics dataset is a fundamental step in pre-processing. However, the downstream analysis of LC-MS proteomics data can be dramatically affected by the normalization method selected. Current normalization procedures for LC-MS proteomics data are presented in the context of normalization values derived from subsets of the full collection of identified peptides. The distribution of these normalization values is unknown a priori. If they are not independent from the biological factors associated with the experiment, the normalization process can introduce bias into the data, which will affect downstream statistical biomarker discovery. We present a novel approach to evaluate normalization strategies, where a normalization strategy includes the peptide selection component associated with the derivation of normalization values. Our approach evaluates the effect of normalization on the between-group variance structure in order to identify candidate normalization strategies that improve the structure of the data without introducing bias into the normalized peak intensities.
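    To make the idea of normalization values derived from a peptide subset concrete, the following Python sketch computes one scaling factor per run as the median log-intensity over a chosen subset and subtracts it from that run. It is only an illustration under stated assumptions: median centering is a common generic choice, not necessarily one of the strategies ranked in the paper, and the helper `factor_gap` is a crude hypothetical stand-in for the authors' between-group variance evaluation. All names and numbers are invented.

```python
from statistics import mean, median


def normalization_factors(log_intensities, peptide_subset):
    """One factor per run: median log-intensity over the selected peptides."""
    return {
        run: median(values[p] for p in peptide_subset if p in values)
        for run, values in log_intensities.items()
    }


def normalize(log_intensities, factors):
    """Subtract each run's factor from all of its log-intensities."""
    return {
        run: {pep: value - factors[run] for pep, value in values.items()}
        for run, values in log_intensities.items()
    }


def factor_gap(factors, groups, group_a, group_b):
    """Difference in mean factor between two groups (hypothetical check).

    A large gap suggests the factors track the biological grouping, in
    which case normalizing with them could remove or distort real signal.
    """
    a = [f for run, f in factors.items() if groups[run] == group_a]
    b = [f for run, f in factors.items() if groups[run] == group_b]
    return mean(b) - mean(a)


# Invented toy data: two runs, three peptides, log-scale intensities.
runs = {
    "a1": {"p1": 10.0, "p2": 12.0, "p3": 14.0},
    "b1": {"p1": 11.0, "p2": 13.0, "p3": 15.0},
}
groups = {"a1": "control", "b1": "treated"}

factors = normalization_factors(runs, ["p1", "p2", "p3"])
normalized = normalize(runs, factors)
gap = factor_gap(factors, groups, "control", "treated")
```

    Different peptide subsets give different factors, which is why the abstract treats peptide selection as part of the normalization strategy being evaluated.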

  12. Proteome-wide identification of predominant subcellular protein localizations in a bacterial model organism

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stekhoven, Daniel J.; Omasits, Ulrich; Quebatte, Maxime

    2014-03-01

    Proteomics data provide unique insights into biological systems, including the predominant subcellular localization (SCL) of proteins, which can reveal important clues about their functions. Here we analyzed data of a complete prokaryotic proteome expressed under two conditions mimicking interaction of the emerging pathogen Bartonella henselae with its mammalian host. Normalized spectral count data from cytoplasmic, total membrane, inner and outer membrane fractions allowed us to identify the predominant SCL for 82% of the identified proteins. The spectral count proportion of total membrane versus cytoplasmic fractions indicated the propensity of cytoplasmic proteins to co-fractionate with the inner membrane, and enabled us to distinguish cytoplasmic, peripheral inner membrane and bona fide inner membrane proteins. Principal component analysis and k-nearest neighbor classification training on selected marker proteins or predominantly localized proteins, allowed us to determine an extensive catalog of at least 74 expressed outer membrane proteins, and to extend the SCL assignment to 94% of the identified proteins, including 18% where in silico methods gave no prediction. Suitable experimental proteomics data combined with straightforward computational approaches can thus identify the predominant SCL on a proteome-wide scale. Finally, we present a conceptual approach to identify proteins potentially changing their SCL in a condition-dependent fashion.

  13. High-Throughput Analysis of Age-Dependent Protein Changes in Layer II/III of the Human Orbitofrontal Cortex

    NASA Astrophysics Data System (ADS)

    Kapadia, Fenika

    Studies on the orbitofrontal cortex (OFC) during normal aging have shown a decline in cognitive functions, a loss of spines/synapses in layer III and gene expression changes related to neural communication. Biological changes during the course of normal aging are summarized into 9 hallmarks based on aging in peripheral tissue. Whether these hallmarks apply to non-dividing brain tissue is not known. Therefore, we opted to perform large-scale proteomic profiling of the OFC layer II/III during normal aging from 15 young and 18 old male subjects. MaxQuant was utilized for label-free quantification and statistical analysis by the Random Intercept Model (RIM) identified 118 differentially expressed (DE) age-related proteins. Altered neural communication was the most represented hallmark of aging (54% of DE proteins), highlighting the importance of communication in the brain. Functional analysis showed enrichment in GABA/glutamate signaling and pro-inflammatory responses. The former may contribute to alterations in excitation/inhibition, leading to cognitive decline during aging.

  14. An in-depth snake venom proteopeptidome characterization: Benchmarking Bothrops jararaca.

    PubMed

    Nicolau, Carolina A; Carvalho, Paulo C; Junqueira-de-Azevedo, Inácio L M; Teixeira-Ferreira, André; Junqueira, Magno; Perales, Jonas; Neves-Ferreira, Ana Gisele C; Valente, Richard H

    2017-01-16

    A large-scale proteomic approach was devised to advance the understanding of venom composition. Bothrops jararaca venom was fractionated by OFFGEL followed by chromatography, generating peptidic and proteic fractions. The latter was submitted to trypsin digestion. Both fractions were separately analyzed by reversed-phase nanochromatography coupled to high resolution mass spectrometry. This strategy allowed deeper and joint characterizations of the peptidome and proteome (proteopeptidome) of this venom. Our results lead to the identification of 46 protein classes (with several uniquely assigned proteins per class) comprising eight high-abundance bona fide venom components, and 38 additional classes in smaller quantities. This last category included previously described B. jararaca venom proteins, common Elapidae venom constituents (cobra venom factor and three-finger toxin), and proteins typically encountered in lysosomes, cellular membranes and blood plasma. Furthermore, this report is the most complete snake venom peptidome described so far, both in number of peptides and in variety of unique proteins that could have originated them. It is hypothesized that such diversity could enclose cryptides, whose bioactivities would contribute to envenomation in yet undetermined ways. Finally, we propose that the broad range screening of B. jararaca peptidome will facilitate the discovery of bioactive molecules, eventually leading to valuable therapeutical agents. Our proteopeptidomic strategy yielded unprecedented insights into the remarkable diversity of B. jararaca venom composition, both at the peptide and protein levels. These results bring a substantial contribution to the actual pursuit of large-scale protein-level assignment in snake venomics. 
    The detection of typical elapid venom components in a Viperidae venom reinforces our view that the use of this approach (hand-in-hand with transcriptomic and genomic data) for venom proteomic analysis, at the specimen level, can greatly contribute to venom toxin evolution studies. Furthermore, data were generated in support of a previous hypothesis that venom gland secretory vesicles are specialized forms of lysosomes. Two testable hypotheses also emerge from the results of this work. The first is that a nucleobindin-2-derived protein could lead to prey disorientation during envenomation, aiding in its capture by the snake. The second is that the venom's peptidome might contain a population of cryptides, whose biological activities could lead to the development of new therapeutical agents. Copyright © 2016 Elsevier B.V. All rights reserved.

  15. A simple protocol for protein extraction of recalcitrant fruit tissues suitable for 2-DE and MS analysis.

    PubMed

    Song, Jun; Braun, Gordon; Bevis, Eric; Doncaster, Kristen

    2006-08-01

    Fruit tissues are considered recalcitrant plant tissue for proteomic analysis. Three phenol-free protein extraction procedures for 2-DE were compared and evaluated on apple fruit proteins. Extraction incorporating hot SDS buffer, combined with TCA/acetone precipitation, was found to be the most effective protocol. The results from SDS-PAGE and 2-DE analysis showed high quality proteins. More than 500 apple polypeptides were separated on a small scale 2-DE gel. The successful protocol was further tested on banana fruit, in which 504 and 386 proteins were detected in peel and flesh tissues, respectively. To demonstrate the quality of the extracted proteins, several protein spots from apple and banana peels were cut from 2-DE gels, analyzed by MS and tentatively identified. The protocol described in this study is a simple procedure which could be routinely used in proteomic studies of many types of recalcitrant fruit tissues.

  16. An Integrated Proteomics and Bioinformatics Approach Reveals the Anti-inflammatory Mechanism of Carnosic Acid

    PubMed Central

    Wang, Li-Chao; Wei, Wen-Hui; Zhang, Xiao-Wen; Liu, Dan; Zeng, Ke-Wu; Tu, Peng-Fei

    2018-01-01

    Drastic macrophage activation triggered by exogenous infection or endogenous stresses is thought to be implicated in the pathogenesis of various inflammatory diseases. Carnosic acid (CA), a natural phenolic diterpene extracted from the Salvia officinalis plant, has been reported to possess anti-inflammatory activity. However, its role in macrophage activation, as well as the potential molecular mechanism, is largely unexplored. In the current study, we sought to elucidate the anti-inflammatory property of CA using an integrated approach based on unbiased proteomics and bioinformatics analysis. CA significantly inhibited the robust increase of nitric oxide and TNF-α, downregulated COX2 protein expression, and lowered the transcriptional level of inflammatory genes including Nos2, Tnfα, Cox2, and Mcp1 in LPS-stimulated RAW264.7 cells, a murine cell line model of peritoneal macrophages. The LC-MS/MS-based shotgun proteomics analysis showed that CA negatively regulated 217 LPS-elicited proteins which were involved in multiple inflammatory processes including MAPK, nuclear factor (NF)-κB, and FoxO signaling pathways. A further molecular biology analysis revealed that CA effectively inactivated IKKβ/IκB-α/NF-κB, ERK/JNK/p38 MAPKs, and FoxO1/3 signaling pathways. Collectively, our findings demonstrate the role of CA in regulating the inflammatory response and provide some insights into the proteomics-guided pharmacological mechanism study of natural products. PMID:29713284

  17. Proteomic Analysis of ABCA1-Null Macrophages Reveals a Role for Stomatin-Like Protein-2 in Raft Composition and Toll-Like Receptor Signaling.

    PubMed

    Chowdhury, Saiful M; Zhu, Xuewei; Aloor, Jim J; Azzam, Kathleen M; Gabor, Kristin A; Ge, William; Addo, Kezia A; Tomer, Kenneth B; Parks, John S; Fessler, Michael B

    2015-07-01

    Lipid raft membrane microdomains organize signaling by many prototypical receptors, including the Toll-like receptors (TLRs) of the innate immune system. Raft-localization of proteins is widely thought to be regulated by raft cholesterol levels, but this is largely on the basis of studies that have manipulated cell cholesterol using crude and poorly specific chemical tools, such as β-cyclodextrins. To date, there has been no proteome-scale investigation of whether endogenous regulators of intracellular cholesterol trafficking, such as the ATP binding cassette (ABC)A1 lipid efflux transporter, regulate targeting of proteins to rafts. Abca1(-/-) macrophages have cholesterol-laden rafts that have been reported to contain increased levels of select proteins, including TLR4, the lipopolysaccharide receptor. Here, using quantitative proteomic profiling, we identified 383 proteins in raft isolates from Abca1(+/+) and Abca1(-/-) macrophages. ABCA1 deletion induced wide-ranging changes to the raft proteome. Remarkably, many of these changes were similar to those seen in Abca1(+/+) macrophages after lipopolysaccharide exposure. Stomatin-like protein (SLP)-2, a member of the stomatin-prohibitin-flotillin-HflK/C family of membrane scaffolding proteins, was robustly and specifically increased in Abca1(-/-) rafts. Pursuing SLP-2 function, we found that rafts of SLP-2-silenced macrophages had markedly abnormal composition. SLP-2 silencing did not compromise ABCA1-dependent cholesterol efflux but reduced macrophage responsiveness to multiple TLR ligands. This was associated with reduced raft levels of the TLR co-receptor, CD14, and defective lipopolysaccharide-induced recruitment of the common TLR adaptor, MyD88, to rafts. Taken together, we show that the lipid transporter ABCA1 regulates the protein repertoire of rafts and identify SLP-2 as an ABCA1-dependent regulator of raft composition and of the innate immune response. 
    © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.

  18. Proteomic Analysis of ABCA1-Null Macrophages Reveals a Role for Stomatin-Like Protein-2 in Raft Composition and Toll-Like Receptor Signaling*

    PubMed Central

    Chowdhury, Saiful M.; Zhu, Xuewei; Aloor, Jim J.; Azzam, Kathleen M.; Gabor, Kristin A.; Ge, William; Addo, Kezia A.; Tomer, Kenneth B.; Parks, John S.; Fessler, Michael B.

    2015-01-01

    Lipid raft membrane microdomains organize signaling by many prototypical receptors, including the Toll-like receptors (TLRs) of the innate immune system. Raft-localization of proteins is widely thought to be regulated by raft cholesterol levels, but this is largely on the basis of studies that have manipulated cell cholesterol using crude and poorly specific chemical tools, such as β-cyclodextrins. To date, there has been no proteome-scale investigation of whether endogenous regulators of intracellular cholesterol trafficking, such as the ATP binding cassette (ABC)A1 lipid efflux transporter, regulate targeting of proteins to rafts. Abca1−/− macrophages have cholesterol-laden rafts that have been reported to contain increased levels of select proteins, including TLR4, the lipopolysaccharide receptor. Here, using quantitative proteomic profiling, we identified 383 proteins in raft isolates from Abca1+/+ and Abca1−/− macrophages. ABCA1 deletion induced wide-ranging changes to the raft proteome. Remarkably, many of these changes were similar to those seen in Abca1+/+ macrophages after lipopolysaccharide exposure. Stomatin-like protein (SLP)-2, a member of the stomatin-prohibitin-flotillin-HflK/C family of membrane scaffolding proteins, was robustly and specifically increased in Abca1−/− rafts. Pursuing SLP-2 function, we found that rafts of SLP-2-silenced macrophages had markedly abnormal composition. SLP-2 silencing did not compromise ABCA1-dependent cholesterol efflux but reduced macrophage responsiveness to multiple TLR ligands. This was associated with reduced raft levels of the TLR co-receptor, CD14, and defective lipopolysaccharide-induced recruitment of the common TLR adaptor, MyD88, to rafts. Taken together, we show that the lipid transporter ABCA1 regulates the protein repertoire of rafts and identify SLP-2 as an ABCA1-dependent regulator of raft composition and of the innate immune response. PMID:25910759

  19. Proteome-wide identification of predominant subcellular protein localizations in a bacterial model organism.

    PubMed

    Stekhoven, Daniel J; Omasits, Ulrich; Quebatte, Maxime; Dehio, Christoph; Ahrens, Christian H

    2014-03-17

    Proteomics data provide unique insights into biological systems, including the predominant subcellular localization (SCL) of proteins, which can reveal important clues about their functions. Here we analyzed data of a complete prokaryotic proteome expressed under two conditions mimicking interaction of the emerging pathogen Bartonella henselae with its mammalian host. Normalized spectral count data from cytoplasmic, total membrane, inner and outer membrane fractions allowed us to identify the predominant SCL for 82% of the identified proteins. The spectral count proportion of total membrane versus cytoplasmic fractions indicated the propensity of cytoplasmic proteins to co-fractionate with the inner membrane, and enabled us to distinguish cytoplasmic, peripheral inner membrane and bona fide inner membrane proteins. Principal component analysis and k-nearest neighbor classification training on selected marker proteins or predominantly localized proteins, allowed us to determine an extensive catalog of at least 74 expressed outer membrane proteins, and to extend the SCL assignment to 94% of the identified proteins, including 18% where in silico methods gave no prediction. Suitable experimental proteomics data combined with straightforward computational approaches can thus identify the predominant SCL on a proteome-wide scale. Finally, we present a conceptual approach to identify proteins potentially changing their SCL in a condition-dependent fashion. The work presented here describes the first prokaryotic proteome-wide subcellular localization (SCL) dataset for the emerging pathogen B. henselae (Bhen). The study indicates that suitable subcellular fractionation experiments combined with straight-forward computational analysis approaches assessing the proportion of spectral counts observed in different subcellular fractions are powerful for determining the predominant SCL of a large percentage of the experimentally observed proteins. 
    This includes numerous cases where in silico prediction methods do not provide any prediction. Avoiding a treatment with harsh conditions, cytoplasmic proteins tend to co-fractionate with proteins of the inner membrane fraction, indicative of close functional interactions. The spectral count proportion (SCP) of total membrane versus cytoplasmic fractions allowed us to obtain a good indication about the relative proximity of individual protein complex members to the inner membrane. Using principal component analysis and k-nearest neighbor approaches, we were able to extend the percentage of proteins with a predominant experimental localization to over 90% of all expressed proteins and identified a set of at least 74 outer membrane (OM) proteins. In general, OM proteins represent a rich source of candidates for the development of urgently needed new therapeutics to combat the resurgence of infectious disease and multi-drug resistant bacteria. Finally, by comparing the data from two infection biology relevant conditions, we conceptually explore methods to identify and visualize potential candidates that may partially change their SCL in these different conditions. The data are made available to researchers as a SCL compendium for Bhen and as an assistance in further improving in silico SCL prediction algorithms. Copyright © 2014 Elsevier B.V. All rights reserved.
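    The abstract does not give the formula for the spectral count proportion, so the short Python sketch below makes an explicit assumption: SCP is taken as the total-membrane count divided by the sum of the total-membrane and cytoplasmic counts, so values near 1 suggest membrane association and values near 0 suggest a cytoplasmic protein. The function name, the protein labels, and the counts are hypothetical.

```python
def spectral_count_proportion(tm_count, cyt_count):
    """Assumed definition: TM / (TM + Cyt) on normalized spectral counts.

    Returns None when the protein was seen in neither fraction, since the
    proportion is undefined in that case.
    """
    total = tm_count + cyt_count
    if total == 0:
        return None
    return tm_count / total


# Invented normalized spectral counts: (total membrane, cytoplasmic).
counts = {
    "protA": (30, 10),  # mostly membrane-associated
    "protB": (2, 18),   # mostly cytoplasmic
    "protC": (0, 0),    # not observed in either fraction
}
scp = {name: spectral_count_proportion(tm, cyt) for name, (tm, cyt) in counts.items()}
```

    Proportions like these, computed across several fractions, are the kind of per-protein features that principal component analysis or a k-nearest neighbor classifier trained on marker proteins could then use.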

  20. Functional proteomics within the genus Lactobacillus.

    PubMed

    De Angelis, Maria; Calasso, Maria; Cavallo, Noemi; Di Cagno, Raffaella; Gobbetti, Marco

    2016-03-01

    Lactobacillus strains are mainly used for the manufacture of fermented dairy, sourdough, meat, and vegetable foods or used as probiotics. Under optimal processing conditions, Lactobacillus strains contribute to food functionality through their enzyme portfolio and the release of metabolites. An extensive genomic diversity analysis was conducted to elucidate the core features of the genus Lactobacillus, and to provide a better comprehension of niche adaptation of the strains. However, proteomics is an indispensable "omics" science to elucidate the proteome diversity, and the mechanisms of regulation and adaptation of Lactobacillus strains. This review focuses on the novel and comprehensive knowledge of functional proteomics and metaproteomics of Lactobacillus species. A large list of proteomic case studies of different Lactobacillus species is provided to illustrate the adaptability of the main metabolic pathways (e.g., carbohydrate transport and metabolism, pyruvate metabolism, proteolytic system, amino acid metabolism, and protein synthesis) to various life conditions. These investigations have highlighted that lactobacilli modulate the level of a complex panel of proteins to grow/survive in different ecological niches. In addition to the general regulation and stress response, specific metabolic pathways can be switched on and off, modifying the behavior of the strains. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  1. The MaxQuant computational platform for mass spectrometry-based shotgun proteomics.

    PubMed

    Tyanova, Stefka; Temu, Tikira; Cox, Juergen

    2016-12-01

    MaxQuant is one of the most frequently used platforms for mass-spectrometry (MS)-based proteomics data analysis. Since its first release in 2008, it has grown substantially in functionality and can be used in conjunction with more MS platforms. Here we present an updated protocol covering the most important basic computational workflows, including those designed for quantitative label-free proteomics, MS1-level labeling and isobaric labeling techniques. This protocol presents a complete description of the parameters used in MaxQuant, as well as of the configuration options of its integrated search engine, Andromeda. This protocol update describes an adaptation of an existing protocol that substantially modifies the technique. Important concepts of shotgun proteomics and their implementation in MaxQuant are briefly reviewed, including different quantification strategies and the control of false-discovery rates (FDRs), as well as the analysis of post-translational modifications (PTMs). The MaxQuant output tables, which contain information about quantification of proteins and PTMs, are explained in detail. Furthermore, we provide a short version of the workflow that is applicable to data sets with simple and standard experimental designs. The MaxQuant algorithms are efficiently parallelized on multiple processors and scale well from desktop computers to servers with many cores. The software is written in C# and is freely available at http://www.maxquant.org.

  2. Notice of Pre-Application Webinar (RFA-CA-15-021, RFA-CA-15-022, RFA-CA-15-023) | Office of Cancer Clinical Proteomics Research

    Cancer.gov

    The National Cancer Institute will hold a public pre-application webinar on Friday, December 11 at 12:00 p.m. (EST) for the Funding Opportunity Announcements (FOAs) RFA-CA-15-021 entitled “Proteome Characterization Centers for Clinical Proteomic Tumor Analysis Consortium (U24)”, RFA-CA-15-022 entitled “Proteogenomic Translational Research Centers for Clinical Proteomic Tumor Analysis Consortium (U01)”, and RFA-CA-15-023 entitled “Proteogenomic Data Analysis Centers for Clinical Proteomic Tumor Analysis Consortium (U24)”.

  3. Evaluation of the salivary proteome as a surrogate tissue for systems biology approaches to understanding appetite.

    PubMed

    Harden, Charlotte J; Perez-Carrion, Kristine; Babakordi, Zara; Plummer, Sue F; Hepburn, Natalie; Barker, Margo E; Wright, Phillip C; Evans, Caroline A; Corfe, Bernard M

    2012-06-06

    Current measurement of appetite depends upon tools that are either subjective (visual analogue scales), or invasive (blood). Saliva is increasingly recognised as a valuable resource for biomarker analysis. Proteomics workflows may provide alternative means for the assessment of appetitive response. The study aimed to assess the potential value of the salivary proteome to detect novel biomarkers of appetite using an iTRAQ-based workflow. Diurnal variation of salivary protein concentrations was assessed. A randomised, controlled, crossover study examined the effects on the salivary proteome of isocaloric doses of various long chain fatty acid (LCFA) oil emulsions compared to no treatment (NT). Fasted males provided saliva samples before and following NT or dosing with LCFA emulsions. The oil component of the DHA emulsion contained predominantly docosahexaenoic acid and the oil component of OA contained predominantly oleic acid. Several proteins were present in significantly (p<0.05) different quantities in saliva samples taken following treatments compared to fasting samples. DHA caused alterations in thioredoxin and serpin B4 relative to OA and NT. A further study evaluated energy intake (EI) in response to LCFA in conjunction with subjective appetite scoring. DHA was associated with significantly lower EI relative to NT and OA (p=0.039). The collective data suggest investigation of salivary proteome may be of value in appetitive response. This article is part of a Special Issue entitled: Proteomics: The clinical link. Copyright © 2011 Elsevier B.V. All rights reserved.

  4. A Proteomic Workflow Using High-Throughput De Novo Sequencing Towards Complementation of Genome Information for Improved Comparative Crop Science.

    PubMed

    Turetschek, Reinhard; Lyon, David; Desalegn, Getinet; Kaul, Hans-Peter; Wienkoop, Stefanie

    2016-01-01

    The proteomic study of non-model organisms, such as many crop plants, is challenging due to the lack of comprehensive genome information. Changing environmental conditions require the study and selection of adapted cultivars. Mutations, inherent to cultivars, hamper protein identification and thus considerably complicate the qualitative and quantitative comparison in large-scale systems biology approaches. With this workflow, cultivar-specific mutations are detected from high-throughput comparative MS analyses, by extracting sequence polymorphisms with de novo sequencing. Stringent criteria are suggested to filter for high-confidence mutations. Subsequently, these polymorphisms complement the initially used database, which is ready to use with any preferred database search algorithm. In our example, we thereby identified 26 specific mutations in two cultivars of Pisum sativum and achieved an increased number (17%) of peptide spectrum matches.
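
    The idea of extracting sequence polymorphisms from de novo peptides can be illustrated with a minimal sketch. It is not the authors' workflow: the function name `single_substitutions`, the exactly-one-mismatch rule, and the example sequences are assumptions made for illustration. Isoleucine and leucine are treated as equal because they have the same mass and cannot be told apart by a standard de novo read.

```python
def single_substitutions(de_novo_peptide, protein):
    """Return (position, database_residue, observed_residue) for every
    alignment of the peptide to the protein that differs at exactly one
    residue. Positions are 0-based indices into the protein sequence."""
    # I and L are isobaric, so compare them as the same letter.
    def normalize(seq):
        return seq.replace("I", "L")

    pep, prot = normalize(de_novo_peptide), normalize(protein)
    n = len(pep)
    results = []
    for offset in range(len(prot) - n + 1):
        window = prot[offset:offset + n]
        mismatches = [i for i in range(n) if window[i] != pep[i]]
        if len(mismatches) == 1:
            i = mismatches[0]
            results.append((offset + i, protein[offset + i], de_novo_peptide[i]))
    return results


# Hypothetical database protein and de novo peptide read.
candidates = single_substitutions("LVDGR", "MASTKLVEGR")
print(candidates)
```

    A candidate found this way would still need the stringent filtering the abstract mentions before a variant sequence is appended to the search database.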

  5. Characterization of the canine urinary proteome.

    PubMed

    Brandt, Laura E; Ehrhart, E J; Scherman, Hataichanok; Olver, Christine S; Bohn, Andrea A; Prenni, Jessica E

    2014-06-01

    Urine is an attractive biofluid for biomarker discovery as it is easy and minimally invasive to obtain. While numerous studies have focused on the characterization of human urine, much less research has focused on canine urine. The objectives of this study were to characterize the universal canine urinary proteome (both soluble and exosomal), to determine the overlap between the canine proteome and a representative human urinary proteome study, to generate a resource for future canine studies, and to determine the suitability of the dog as a large animal model for human diseases. The soluble and exosomal fractions of normal canine urine were characterized using liquid chromatography tandem mass spectrometry (LC-MS/MS). Biological Networks Gene Ontology (BiNGO) software was utilized to assign the canine urinary proteome to respective Gene Ontology categories, such as Cellular Component, Molecular Function, and Biological Process. Over 500 proteins were confidently identified in normal canine urine. Gene Ontology analysis revealed that exosomal proteins were largely derived from an intracellular location, while soluble proteins included both extracellular and membrane proteins. Exosome proteins were assigned to metabolic processes and localization, while soluble proteins were primarily annotated to specific localization processes. Several proteins identified in normal canine urine have previously been identified in human urine where these proteins are related to various extrarenal and renal diseases. The results of this study illustrate the potential of the dog as an animal model for human disease states and provide the framework for future studies of canine renal diseases. © 2014 American Society for Veterinary Clinical Pathology and European Society for Veterinary Clinical Pathology.

  6. Glycoproteins Enrichment and LC-MS/MS Glycoproteomics in Central Nervous System Applications.

    PubMed

    Zhu, Rui; Song, Ehwang; Hussein, Ahmed; Kobeissy, Firas H; Mechref, Yehia

    2017-01-01

    Proteins and glycoproteins play important biological roles in the central nervous system (CNS). Qualitative and quantitative evaluation of protein and glycoprotein expression in the CNS is critical to reveal the inherent biomolecular mechanism of CNS diseases. This chapter describes proteomic and glycoproteomic approaches based on liquid chromatography/tandem mass spectrometry (LC-MS or LC-MS/MS) for the qualitative and quantitative assessment of proteins and glycoproteins expressed in the CNS. Proteins and glycoproteins, extracted by a mass spectrometry friendly surfactant from CNS samples, were subjected to enzymatic (tryptic) digestion and three down-stream analyses: (1) a nano LC system coupled with a high-resolution MS instrument to achieve qualitative proteomic profile, (2) a nano LC system combined with a triple quadrupole MS to quantify identified proteins, and (3) glycoprotein enrichment prior to LC-MS/MS analysis. Enrichment techniques can be applied to improve coverage of low-abundance glycopeptides/glycoproteins. An example described in this chapter is hydrophilic interaction liquid chromatographic (HILIC) enrichment to capture glycopeptides, allowing efficient removal of peptides. The combination of three LC-MS/MS-based approaches is capable of the investigation of large-scale proteins and glycoproteins from the CNS with an in-depth coverage, thus offering a full view of protein and glycoprotein changes in the CNS.

  7. Maximizing the sensitivity and reliability of peptide identification in large-scale proteomic experiments by harnessing multiple search engines.

    PubMed

    Yu, Wen; Taylor, J Alex; Davis, Michael T; Bonilla, Leo E; Lee, Kimberly A; Auger, Paul L; Farnsworth, Chris C; Welcher, Andrew A; Patterson, Scott D

    2010-03-01

    Despite recent advances in qualitative proteomics, the automatic identification of peptides with optimal sensitivity and accuracy remains a difficult goal. To address this deficiency, a novel algorithm, Multiple Search Engines, Normalization and Consensus is described. The method employs six search engines and a re-scoring engine to search MS/MS spectra against protein and decoy sequences. After the peptide hits from each engine are normalized to error rates estimated from the decoy hits, peptide assignments are then deduced using a minimum consensus model. These assignments are produced in a series of progressively relaxed false-discovery rates, thus enabling a comprehensive interpretation of the data set. Additionally, the estimated false-discovery rate was found to have good concordance with the observed false-positive rate calculated from known identities. Benchmarking against standard protein data sets (ISBv1, sPRG2006) and their published analysis demonstrated that the Multiple Search Engines, Normalization and Consensus algorithm consistently achieved significantly higher sensitivity in peptide identifications, which led to increased or more robust protein identifications in all data sets compared with prior methods. The sensitivity and the false-positive rate of peptide identification exhibit an inverse-proportional and linear relationship with the number of participating search engines.
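
    The two ingredients named in this abstract, decoy-based error estimation and a minimum consensus across engines, can be sketched as follows. This is a simplified illustration under stated assumptions, not the published algorithm: the function names, the decoys-over-targets FDR estimate, the `min_engines` parameter, and the toy data are all hypothetical.

```python
from collections import Counter


def decoy_fdr(hits, threshold):
    """Estimate the false-discovery rate at a score threshold as
    (decoy hits passing) / (target hits passing).
    `hits` is a list of (score, is_decoy) pairs from one search engine."""
    targets = sum(1 for score, is_decoy in hits if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in hits if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


def consensus_peptides(engine_results, min_engines=2):
    """Keep, for each spectrum, the most frequently reported peptide,
    provided at least `min_engines` engines agree on it.
    `engine_results` maps engine name -> {spectrum_id: peptide}."""
    votes = {}
    for assignments in engine_results.values():
        for spectrum_id, peptide in assignments.items():
            votes.setdefault(spectrum_id, Counter())[peptide] += 1
    consensus = {}
    for spectrum_id, counter in votes.items():
        peptide, count = counter.most_common(1)[0]
        if count >= min_engines:
            consensus[spectrum_id] = peptide
    return consensus


# Toy data: three hypothetical engines, two spectra.
engines = {
    "engine_a": {"s1": "PEPTIDEK", "s2": "AAAK"},
    "engine_b": {"s1": "PEPTIDEK", "s2": "CCCK"},
    "engine_c": {"s1": "PEPTIDER"},
}
print(consensus_peptides(engines))
```

    Raising `min_engines` trades sensitivity for fewer false positives, which is the kind of trade-off the abstract relates to the number of participating engines.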

  8. Identification of Proteins Related to Epigenetic Regulation in the Malignant Transformation of Aberrant Karyotypic Human Embryonic Stem Cells by Quantitative Proteomics

    PubMed Central

    Sun, Yi; Yang, Yixuan; Zeng, Sicong; Tan, Yueqiu; Lu, Guangxiu; Lin, Ge

    2014-01-01

    Previous reports have demonstrated that human embryonic stem cells (hESCs) tend to develop genomic alterations and progress to a malignant state during long-term in vitro culture. This raises concerns of the clinical safety in using cultured hESCs. However, transformed hESCs might serve as an excellent model to determine the process of embryonic stem cell transition. In this study, iTRAQ-based tandem mass spectrometry was used to quantify normal and aberrant karyotypic hESCs proteins from simple to more complex karyotypic abnormalities. We identified and quantified 2583 proteins, and found that the expression levels of 316 proteins that represented at least 23 functional molecular groups were significantly different in both normal and abnormal hESCs. Dysregulated protein expression in epigenetic regulation was further verified in six pairs of hESC lines in early and late passage. In summary, this study is the first large-scale quantitative proteomic analysis of the malignant transformation of aberrant karyotypic hESCs. The data generated should serve as a useful reference of stem cell-derived tumor progression. Increased expression of both HDAC2 and CTNNB1 is detected as early as the pre-neoplastic stage, and might serve as prognostic markers in the malignant transformation of hESCs. PMID:24465727

  9. Generation and analyses of human synthetic antibody libraries and their application for protein microarrays.

    PubMed

    Säll, Anna; Walle, Maria; Wingren, Christer; Müller, Susanne; Nyman, Tomas; Vala, Andrea; Ohlin, Mats; Borrebaeck, Carl A K; Persson, Helena

    2016-10-01

    Antibody-based proteomics offers distinct advantages in the analysis of complex samples for discovery and validation of biomarkers associated with disease. However, its large-scale implementation requires tools and technologies that allow development of suitable antibody or antibody fragments in a high-throughput manner. To address this we designed and constructed two human synthetic antibody fragment (scFv) libraries denoted HelL-11 and HelL-13. By the use of phage display technology, in total 466 unique scFv antibodies specific for 114 different antigens were generated. The specificities of these antibodies were analyzed in a variety of immunochemical assays and a subset was further evaluated for functionality in protein microarray applications. This high-throughput approach demonstrates the ability to rapidly generate a wealth of reagents not only for proteome research, but potentially also for diagnostics and therapeutics. In addition, this work provides a great example of how a synthetic approach can be used to optimize library designs. By having precise control of the diversity introduced into the antigen-binding sites, synthetic libraries offer increased understanding of how different diversity contributes to antibody binding reactivity and stability, thereby providing the key to future library optimization. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  10. Toward Repurposing Metformin as a Precision Anti-Cancer Therapy Using Structural Systems Pharmacology

    PubMed Central

    Hart, Thomas; Dider, Shihab; Han, Weiwei; Xu, Hua; Zhao, Zhongming; Xie, Lei

    2016-01-01

    Metformin, a drug prescribed to treat type-2 diabetes, exhibits anti-cancer effects in a portion of patients, but the direct molecular and genetic interactions leading to this pleiotropic effect have not yet been fully explored. To repurpose metformin as a precision anti-cancer therapy, we have developed a novel structural systems pharmacology approach to elucidate metformin’s molecular basis and genetic biomarkers of action. We integrated structural proteome-scale drug target identification with network biology analysis by combining structural genomic, functional genomic, and interactomic data. Through searching the human structural proteome, we identified twenty putative metformin binding targets and their interaction models. We experimentally verified the interactions between metformin and our top-ranked kinase targets. Notably, kinases, particularly SGK1 and EGFR, were identified as key molecular targets of metformin. Subsequently, we linked these putative binding targets to genes that do not directly bind to metformin but whose expressions are altered by metformin through protein-protein interactions, and identified network biomarkers of phenotypic response to metformin. The molecular targets and the key nodes in genetic networks are largely consistent with the existing experimental evidence. Their interactions can be affected by the observed cancer mutations. This study will shed new light on repurposing metformin for safe, effective, personalized therapies. PMID:26841718

  11. Machine Learning-based Classification of Diffuse Large B-cell Lymphoma Patients by Their Protein Expression Profiles.

    PubMed

    Deeb, Sally J; Tyanova, Stefka; Hummel, Michael; Schmidt-Supprian, Marc; Cox, Juergen; Mann, Matthias

    2015-11-01

    Characterization of tumors at the molecular level has improved our knowledge of cancer causation and progression. Proteomic analysis of their signaling pathways promises to enhance our understanding of cancer aberrations at the functional level, but this requires accurate and robust tools. Here, we develop a state-of-the-art quantitative mass spectrometric pipeline to characterize formalin-fixed paraffin-embedded tissues of patients with closely related subtypes of diffuse large B-cell lymphoma. We combined a super-SILAC approach with label-free quantification (hybrid LFQ) to address situations where the protein is absent in the super-SILAC standard but present in the patient samples. Shotgun proteomic analysis on a quadrupole Orbitrap quantified almost 9,000 tumor proteins in 20 patients. The quantitative accuracy of our approach allowed the segregation of diffuse large B-cell lymphoma patients according to their cell of origin using both their global protein expression patterns and the 55-protein signature obtained previously from patient-derived cell lines (Deeb, S. J., D'Souza, R. C., Cox, J., Schmidt-Supprian, M., and Mann, M. (2012) Mol. Cell. Proteomics 11, 77-89). Expression levels of individual segregation-driving proteins as well as categories such as extracellular matrix proteins behaved consistently with known trends between the subtypes. We used machine learning (support vector machines) to extract candidate proteins with the highest segregating power. A panel of four proteins (PALD1, MME, TNFAIP8, and TBC1D4) is predicted to classify patients with low error rates. Highly ranked proteins from the support vector analysis revealed differential expression of core signaling molecules between the subtypes, elucidating aspects of their pathobiology. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.

  12. Proteomic Analysis Reveals Differences in Tolerance to Acid Rain in Two Broad-Leaf Tree Species, Liquidambar formosana and Schima superba

    PubMed Central

    Wang, Chao; Liu, Ting-Wu; Chalifour, Annie; Chen, Juan; Shen, Zhi-Jun; Liu, Xiang; Wang, Wen-Hua; Zheng, Hai-Lei

    2014-01-01

    Acid rain (AR) is a serious environmental issue inducing harmful impacts on plant growth and development. It has been reported that Liquidambar formosana, considered as an AR-sensitive tree species, was largely injured by AR, compared with Schima superba, an AR-tolerant tree species. To clarify the different responses of these two species to AR, a comparative proteomic analysis was conducted in this study. More than 1000 protein spots were reproducibly detected on two-dimensional electrophoresis gels. Among them, 74 protein spots from L. formosana gels and 34 protein spots from S. superba gels showed significant changes in their abundances under AR stress. In both L. formosana and S. superba, the majority of proteins with more than 2-fold changes were involved in photosynthesis and energy production, followed by material metabolism, stress and defense, transcription, post-translational modification, and signal transduction. In contrast with L. formosana, no hormone response-related protein was found in S. superba. Moreover, the changes of proteins involved in photosynthesis, starch synthesis, and translation were distinctly different between L. formosana and S. superba. Protein expression analysis of three proteins (ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit, ascorbate peroxidase and glutathione-S-transferase) by Western blot was well correlated with the results of proteomics. In conclusion, our study provides new insights into AR stress responses in woody plants and clarifies the differences in strategies to cope with AR between L. formosana and S. superba. PMID:25025692

  13. Comparative Proteomic Analysis of the Graft Unions in Hickory (Carya cathayensis) Provides Insights into Response Mechanisms to Grafting Process

    PubMed Central

    Xu, Dongbin; Yuan, Huwei; Tong, Yafei; Zhao, Liang; Qiu, Lingling; Guo, Wenbin; Shen, Chenjia; Liu, Hongjia; Yan, Daoliang; Zheng, Bingsong

    2017-01-01

    Hickory (Carya cathayensis), a tree with high nutritional and economic value, is widely cultivated in China. Grafting greatly reduces the juvenile phase length and makes the large-scale cultivation of hickory possible. To reveal the response mechanisms of this species to grafting, we employed a proteomics-based approach to identify differentially expressed proteins in the graft unions during the grafting process. Our study identified 3723 proteins, of which 2518 were quantified. A total of 710 differentially expressed proteins (DEPs) were quantified and these were involved in various molecular functional and biological processes. Among these DEPs, 341 were up-regulated and 369 were down-regulated at 7 days after grafting compared with the control. Four auxin-related proteins were down-regulated, which was in agreement with the transcription levels of their encoding genes. The Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis showed that the ‘Flavonoid biosynthesis’ pathway and ‘starch and sucrose metabolism’ were both significantly up-regulated. Interestingly, five flavonoid biosynthesis-related proteins, a flavanone 3-hydroxylase, a cinnamate 4-hydroxylase, a dihydroflavonol-4-reductase, a chalcone synthase, and a chalcone isomerase, were significantly up-regulated. Further experiments verified a significant increase in the total flavonoid contents in scions, which suggests that graft union formation may activate flavonoid biosynthesis to increase the content of a series of downstream secondary metabolites. This comprehensive analysis provides fundamental information on the candidate proteins and secondary metabolism pathways involved in the grafting process for hickory. PMID:28496455

  14. Global Relative Quantification with Liquid Chromatography–Matrix-assisted Laser Desorption Ionization Time-of-flight (LC-MALDI-TOF)—Cross–validation with LTQ-Orbitrap Proves Reliability and Reveals Complementary Ionization Preferences

    PubMed Central

    Hessling, Bernd; Büttner, Knut; Hecker, Michael; Becher, Dörte

    2013-01-01

    Quantitative LC-MALDI is an underrepresented method, especially in large-scale experiments. The additional fractionation step that is needed for most MALDI-TOF-TOF instruments, the comparatively long analysis time, and the very limited number of established software tools for the data analysis render LC-MALDI a niche application for large quantitative analyses beside the widespread LC–electrospray ionization workflows. Here, we used LC-MALDI in a relative quantification analysis of Staphylococcus aureus for the first time on a proteome-wide scale. Samples were analyzed in parallel with an LTQ-Orbitrap, which allowed cross-validation with a well-established workflow. With nearly 850 proteins identified in the cytosolic fraction and quantitative data for more than 550 proteins obtained with the MASCOT Distiller software, we were able to prove that LC-MALDI is able to process highly complex samples. The good correlation of quantities determined via this method and the LTQ-Orbitrap workflow confirmed the high reliability of our LC-MALDI approach for global quantification analysis. Because the existing literature reports differences for MALDI and electrospray ionization preferences and the respective experimental work was limited by technical or methodological constraints, we systematically compared biochemical attributes of peptides identified with either instrument. This genome-wide, comprehensive study revealed biases toward certain peptide properties for both MALDI-TOF-TOF- and LTQ-Orbitrap-based approaches. These biases are based on almost 13,000 peptides and result in a general complementarity of the two approaches that should be exploited in future experiments. PMID:23788530

  15. Global relative quantification with liquid chromatography-matrix-assisted laser desorption ionization time-of-flight (LC-MALDI-TOF)--cross-validation with LTQ-Orbitrap proves reliability and reveals complementary ionization preferences.

    PubMed

    Hessling, Bernd; Büttner, Knut; Hecker, Michael; Becher, Dörte

    2013-10-01

    Quantitative LC-MALDI is an underrepresented method, especially in large-scale experiments. The additional fractionation step that is needed for most MALDI-TOF-TOF instruments, the comparatively long analysis time, and the very limited number of established software tools for the data analysis render LC-MALDI a niche application for large quantitative analyses beside the widespread LC-electrospray ionization workflows. Here, we used LC-MALDI in a relative quantification analysis of Staphylococcus aureus for the first time on a proteome-wide scale. Samples were analyzed in parallel with an LTQ-Orbitrap, which allowed cross-validation with a well-established workflow. With nearly 850 proteins identified in the cytosolic fraction and quantitative data for more than 550 proteins obtained with the MASCOT Distiller software, we were able to prove that LC-MALDI is able to process highly complex samples. The good correlation of quantities determined via this method and the LTQ-Orbitrap workflow confirmed the high reliability of our LC-MALDI approach for global quantification analysis. Because the existing literature reports differences for MALDI and electrospray ionization preferences and the respective experimental work was limited by technical or methodological constraints, we systematically compared biochemical attributes of peptides identified with either instrument. This genome-wide, comprehensive study revealed biases toward certain peptide properties for both MALDI-TOF-TOF- and LTQ-Orbitrap-based approaches. These biases are based on almost 13,000 peptides and result in a general complementarity of the two approaches that should be exploited in future experiments.

  16. PeptideDepot: flexible relational database for visual analysis of quantitative proteomic data and integration of existing protein information.

    PubMed

    Yu, Kebing; Salomon, Arthur R

    2009-12-01

    Recently, dramatic progress has been achieved in expanding the sensitivity, resolution, mass accuracy, and scan rate of mass spectrometers able to fragment and identify peptides through MS/MS. Unfortunately, this enhanced ability to acquire proteomic data has not been accompanied by a concomitant increase in the availability of flexible tools allowing users to rapidly assimilate, explore, and analyze this data and adapt to various experimental workflows with minimal user intervention. Here we fill this critical gap by providing a flexible relational database called PeptideDepot for organization of expansive proteomic data sets, collation of proteomic data with available protein information resources, and visual comparison of multiple quantitative proteomic experiments. Our software design, built upon the synergistic combination of a MySQL database for safe warehousing of proteomic data with a FileMaker-driven graphical user interface for flexible adaptation to diverse workflows, enables proteomic end-users to directly tailor the presentation of proteomic data to the unique analysis requirements of the individual proteomics lab. PeptideDepot may be deployed as an independent software tool or integrated directly with our high throughput autonomous proteomic pipeline used in the automated acquisition and post-acquisition analysis of proteomic data.

  17. Proteomics research in India: an update.

    PubMed

    Reddy, Panga Jaipal; Atak, Apurva; Ghantasala, Saicharan; Kumar, Saurabh; Gupta, Shabarni; Prasad, T S Keshava; Zingde, Surekha M; Srivastava, Sanjeeva

    2015-09-08

    After a successful completion of the Human Genome Project, deciphering the mystery surrounding the human proteome posed a major challenge. Despite not being largely involved in the Human Genome Project, the Indian scientific community contributed towards proteomic research along with the global community. Currently, more than 76 research/academic institutes and nearly 145 research labs are involved in core proteomic research across India. The Indian researchers have been major contributors in drafting the "human proteome map" along with international efforts. In addition to this, virtual proteomics labs, proteomics courses and remote triggered proteomics labs have helped to overcome the limitations of proteomics education posed due to expensive lab infrastructure. The establishment of Proteomics Society, India (PSI) has created a platform for the Indian proteomic researchers to share ideas, research collaborations and conduct annual conferences and workshops. Indian proteomic research is really moving forward with the global proteomics community in a quest to solve the mysteries of proteomics. A draft map of the human proteome enhances the enthusiasm among intellectuals to promote proteomic research in India to the world. This article is part of a Special Issue entitled: Proteomics in India. Copyright © 2015 Elsevier B.V. All rights reserved.

  18. The secrets of Oriental panacea: Panax ginseng.

    PubMed

    Colzani, Mara; Altomare, Alessandra; Caliendo, Matteo; Aldini, Giancarlo; Righetti, Pier Giorgio; Fasoli, Elisa

    2016-01-01

    The Panax ginseng root proteome has been investigated via capture with combinatorial peptide ligand libraries (CPLL) at three different pH values. Proteomic characterization by SDS-PAGE and nLC–MS/MS analysis, via LTQ-Orbitrap XL, led to the identification of a total of 207 expressed proteins. This quite large number of identifications was achieved by consulting two different plant databases: P. ginseng and Arabidopsis thaliana. The major groups of identified proteins were associated with structural species (19.2%), oxidoreductase (19.5%), dehydrogenases (7.6%) and synthases (9.0%). For the first time, an exploration of protein–protein interactions was performed by merging all recognized proteins and building an interactomic map, characterized by 196 nodes and 1554 interactions. Finally, a peptidomic analysis was developed combining different in-silico enzymatic digestions to simulate the human gastrointestinal process: from 661 generated peptides, 95 were identified as possible bioactives and in particular 6 of them were characterized by antimicrobial activity. The present report offers new insight for future investigations focused on elucidation of biological properties of P. ginseng proteome and peptidome. Ginseng is a traditional oriental herbal remedy whose use is widespread throughout the world owing to its numerous pharmacological effects. However, the exact mechanism of action of ginseng components, both ginsenosides and proteins, is still unidentified. So the common use of ginseng requires strict investigations to assess both its efficiency and its safety. Although many reports have been published regarding the pharmacological effects of ginseng, little is known about the biochemical pathways of the root. Proteomics analysis could be useful to elucidate the physiological pathways.
In this manuscript, an integrated approach to proteomics and peptidomics will usher in exploration of Panax ginseng proteins and proteolytic peptides, obtained by in-silico gastrointestinal digestion, characterized by antimicrobial action. The present research would pave the way for better knowledge of metabolic functions connected with ginseng proteome and provide new information needed to better understand the antimicrobial activity of P. ginseng.
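
    An in-silico enzymatic digestion of the kind mentioned in this abstract can be sketched in a few lines. The abstract does not say which enzymes or cleavage rules were used, so the example below is an assumption: it applies the commonly cited trypsin rule (cleave after K or R unless the next residue is P), and the function name, the `missed_cleavages` parameter, and the input sequence are hypothetical.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return peptides from an in-silico trypsin digest.
    Rule assumed here: cleave C-terminal to K or R, except before P.
    Peptides spanning up to `missed_cleavages` uncleaved sites are included."""
    if not sequence:
        return []
    # Collect cleavage boundaries as indices into the sequence.
    sites = [0]
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(sequence))

    peptides = []
    for start in range(len(sites) - 1):
        last = min(start + 2 + missed_cleavages, len(sites))
        for end in range(start + 1, last):
            peptides.append(sequence[sites[start]:sites[end]])
    return peptides


# Hypothetical input sequence, not taken from the study.
print(tryptic_digest("MKRPAAKGG"))
print(tryptic_digest("MKRPAAKGG", missed_cleavages=1))
```

    Simulating a gastrointestinal process would chain several enzymes with their own rules; the same boundary-collecting structure could be reused with a different cleavage test per enzyme.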

  19. Global Analysis of Palmitoylated Proteins in Toxoplasma gondii.

    PubMed

    Foe, Ian T; Child, Matthew A; Majmudar, Jaimeen D; Krishnamurthy, Shruthi; van der Linden, Wouter A; Ward, Gary E; Martin, Brent R; Bogyo, Matthew

    2015-10-14

    Post-translational modifications (PTMs) such as palmitoylation are critical for the lytic cycle of the protozoan parasite Toxoplasma gondii. While palmitoylation is involved in invasion, motility, and cell morphology, the proteins that utilize this PTM remain largely unknown. Using a chemical proteomic approach, we report a comprehensive analysis of palmitoylated proteins in T. gondii, identifying a total of 282 proteins, including cytosolic, membrane-associated, and transmembrane proteins. From this large set of palmitoylated targets, we validate palmitoylation of proteins involved in motility (myosin light chain 1, myosin A), cell morphology (PhIL1), and host cell invasion (apical membrane antigen 1, AMA1). Further studies reveal that blocking AMA1 palmitoylation enhances the release of AMA1 and other invasion-related proteins from apical secretory organelles, suggesting a previously unrecognized role for AMA1. These findings suggest that palmitoylation is ubiquitous throughout the T. gondii proteome and reveal insights into the biology of this important human pathogen. Copyright © 2015 Elsevier Inc. All rights reserved.

  20. Proteomic analysis of the dorsal and ventral hippocampus of rats maintained on a high fat and refined sugar diet.

    PubMed

    Francis, Heather M; Mirzaei, Mehdi; Pardey, Margery C; Haynes, Paul A; Cornish, Jennifer L

    2013-10-01

    The typical Western diet, rich in high saturated fat and refined sugar (HFS), has been shown to increase cognitive decline with aging and Alzheimer's disease, and to affect cognitive functions that are dependent on the hippocampus, including memory processes and reversal learning. To investigate neurophysiological changes underlying these impairments, we employed a proteomic approach to identify differentially expressed proteins in the rat dorsal and ventral hippocampus following maintenance on an HFS diet. Rats maintained on the HFS diet for 8 weeks were impaired on a novel object recognition task that assesses memory and on a Morris Water Maze task assessing reversal learning. Quantitative label-free shotgun proteomic analysis was conducted on biological triplicates for each group. For the dorsal hippocampus, 59 proteins were upregulated and 36 downregulated in the HFS group compared to controls. Pathway analysis revealed changes to proteins involved in molecular transport and cellular and molecular signaling, and changes to signaling pathways including calcium signaling, citrate cycle, and oxidative phosphorylation. For the ventral hippocampus, 25 proteins were upregulated and 27 downregulated in HFS fed rats. Differentially expressed proteins were involved in cell-to-cell signaling and interaction, and cellular and molecular function. Changes to signaling pathways included protein ubiquitination, ubiquinone biosynthesis, oxidative phosphorylation, and mitochondrial dysfunction. This is the first shotgun proteomics study to examine protein changes in the hippocampus following long-term consumption of an HFS diet, identifying changes to a large number of proteins including those involved in synaptic plasticity and energy metabolism. All MS data have been deposited in the ProteomeXchange with identifier PXD000028. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  1. Elevated host lipid metabolism revealed by iTRAQ-based quantitative proteomic analysis of cerebrospinal fluid of tuberculous meningitis patients

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mu, Jun; Institute of Neuroscience and the Collaborative Innovation Center for Brain Science, Chongqing Medical University, Chongqing; Chongqing Key Laboratory of Neurobiology, Chongqing

    Purpose: Tuberculous meningitis (TBM) remains one of the most deadly infectious diseases. The pathogen interacts with the host immune system, the process of which is largely unknown. Various cellular processes of Mycobacterium tuberculosis (MTB) center around lipid metabolism. To determine the lipid metabolism related proteins, a quantitative proteomic study was performed here to identify differential proteins in the cerebrospinal fluid (CSF) obtained from TBM patients (n = 12) and healthy controls (n = 12). Methods: CSF samples were desalted, concentrated, labelled with isobaric tags for relative and absolute quantitation (iTRAQ™), and analyzed by multi-dimensional liquid chromatography-tandem mass spectrometry (LC-MS/MS). Gene ontology and proteomic phenotyping analysis of the differential proteins were conducted using Database for Annotation, Visualization, and Integrated Discovery (DAVID) Bioinformatics Resources. ApoE and ApoB were selected for validation by ELISA. Results: Proteomic phenotyping showed that the 4 differential proteins were involved in lipid metabolism. ELISA showed significantly increased ApoB levels in TBM subjects compared to healthy controls. Area under the receiver operating characteristic curve analysis demonstrated ApoB levels could distinguish TBM subjects from healthy controls and viral meningitis subjects with 89.3% sensitivity and 92% specificity. Conclusions: CSF lipid metabolism dysregulation, especially elevated expression of ApoB, gives insights into the pathogenesis of TBM. Further evaluation of these findings in larger studies including anti-tuberculosis medicated and unmedicated patient cohorts with other central nervous system infectious diseases is required for successful clinical translation. - Highlights: • The first proteomic study on the cerebrospinal fluid of tuberculous meningitis patients using iTRAQ. • Identify 4 differential proteins involved in the lipid metabolism.
• Elevated expression of ApoB gives insights into the pathogenesis of TBM.

  2. Proteomic profiling reveals dopaminergic regulation of progenitor cell functions of goldfish radial glial cells in vitro.

    PubMed

    Xing, Lei; Martyniuk, Christopher J; Esau, Crystal; Da Fonte, Dillon F; Trudeau, Vance L

    2016-07-20

    Radial glial cells (RGCs) are stem-like cells found in the developing and adult central nervous system. They function as both a scaffold to guide neuron migration and as progenitor cells that support neurogenesis. Our previous study revealed a close anatomical relationship between dopamine neurons and RGCs in the telencephalon of female goldfish. In this study, label-free proteomics was used to identify the proteins in a primary RGC culture and to determine the proteome response to the selective dopamine D1 receptor agonist SKF 38393 (10μM), in order to better understand dopaminergic regulation of RGCs. A total of 689 unique proteins were identified in the RGCs and these were classified into biological and pathological pathways. Proteins such as nucleolin (6.9-fold) and ependymin related protein 1 (4.9-fold) were increased in abundance while proteins triosephosphate isomerase (10-fold) and phosphoglycerate dehydrogenase (5-fold) were decreased in abundance. Pathway analysis revealed that proteins that consistently changed in abundance across biological replicates were related to small molecules such as ATP, lipids and steroids, hormones, glucose, cyclic AMP and Ca(2+). Sub-network enrichment analysis suggested that estrogen receptor signaling, among other transcription factors, is regulated by D1 receptor activation. This suggests that these signaling pathways are correlated to dopaminergic regulation of radial glial cell functions. Most proteins down-regulated by SKF 38393 were involved in cell cycle/proliferation, growth, death, and survival, which suggests that dopamine inhibits the progenitor-related processes of radial glial cells. Examples of differently expressed proteins including triosephosphate isomerase, nucleolin, phosphoglycerate dehydrogenase and capping protein (actin filament) muscle Z-line beta were validated by qPCR and western blot, which were consistent with MS/MS data in the direction of change. 
This is the first study to characterize the RGC proteome on a large scale in a vertebrate species. These data provide novel insight into glial protein networks that are associated with neuroendocrine function and neurogenesis in the teleost brain. While the role of radial glial cells in organizing brain structure and neurogenesis has been well studied, protein profiling experiments in this unique cell type have not been conducted. This study is the first to profile the proteome of goldfish radial glial cells in culture and to study the regulation of progenitor functions of radial glial cells by the neurotransmitter dopamine. This study provides the foundation for molecular network analysis in fish radial glial cells, and identifies cellular processes and signaling pathways in these cells with roles in neurogenesis and neuroendocrine function. Lastly, this study begins to characterize signatures and biomarkers for specific neuroendocrine and neurogenesis disruptors. Copyright © 2016 Elsevier B.V. All rights reserved.

  3. Mining Missing Membrane Proteins by High-pH Reverse Phase StageTip Fractionation and Multiple Reaction Monitoring Mass Spectrometry

    PubMed Central

    Kitata, Reta Birhanu; Dimayacyac-Esleta, Baby Rorielyn T.; Choong, Wai-Kok; Tsai, Chia-Feng; Lin, Tai-Du; Tsou, Chih-Chiang; Weng, Shao-Hsing; Chen, Yi-Ju; Yang, Pan-Chyr; Arco, Susan D.; Nesvizhskii, Alexey I.; Sung, Ting-Yi; Chen, Yu-Ju

    2016-01-01

    Despite significant efforts in the past decade towards complete mapping of the human proteome, 3564 proteins (neXtProt, 09-2014) are still “missing proteins”. Over one-third of these missing proteins are annotated as membrane proteins, owing to their relatively challenging accessibility with standard shotgun proteomics. Using non-small cell lung cancer (NSCLC) as a model study, we aim to mine missing proteins from the disease-associated membrane proteome, which may be still largely under-represented. To increase identification coverage, we employed Hp-RP StageTip pre-fractionation of membrane-enriched samples from 11 NSCLC cell lines. Analysis of membrane samples from 20 pairs of tumor and adjacent normal lung tissue was incorporated to include physiologically expressed membrane proteins. Using multiple search engines (X!Tandem, Comet and Mascot) and stringent evaluation of FDR (MAYU and PeptideShaker), we identified 7702 proteins (66% membrane proteins) and 178 missing proteins (74 membrane proteins) with PSM-, peptide-, and protein-level FDR of 1%. Through multiple reaction monitoring (MRM) using synthetic peptides, we provided additional evidence for 8 missing proteins including 7 with transmembrane helix domains (TMH). This study demonstrates that mining missing proteins focused on the cancer membrane sub-proteome can greatly contribute to mapping the whole human proteome. All data were deposited into ProteomeXchange with the identifier PXD002224. PMID:26202522

  4. Comparative quantitative proteomics analysis of the ABA response of roots of drought-sensitive and drought-tolerant wheat varieties identifies proteomic signatures of drought adaptability.

    PubMed

    Alvarez, Sophie; Roy Choudhury, Swarup; Pandey, Sona

    2014-03-07

    Wheat is one of the most highly cultivated cereals in the world. Like other cultivated crops, wheat production is significantly affected by abiotic stresses such as drought. Multiple wheat varieties suitable for different geographical regions of the world have been developed that are adapted to different environmental conditions; however, the molecular basis of such adaptations remains unknown in most cases. We have compared the quantitative proteomics profile of the roots of two different wheat varieties, Nesser (drought-tolerant) and Opata (drought-sensitive), in the absence and presence of abscisic acid (ABA, as a proxy for drought). A labeling LC-based quantitative proteomics approach using iTRAQ was applied to elucidate the changes in protein abundance levels. Quantitative differences in protein levels were analyzed for the evaluation of inherent differences between the two varieties as well as the overall and variety-specific effect of ABA on the root proteome. This study reveals the most elaborate ABA-responsive root proteome identified to date in wheat. A large number of proteins exhibited inherently different expression levels between Nesser and Opata. Additionally, significantly higher numbers of proteins were ABA-responsive in Nesser roots compared with Opata roots. Furthermore, several proteins showed variety-specific regulation by ABA, suggesting their role in drought adaptation.

  5. Comparative Proteomics Analysis Reveals L-Arginine Activates Ethanol Degradation Pathways in HepG2 Cells.

    PubMed

    Yan, Guokai; Lestari, Retno; Long, Baisheng; Fan, Qiwen; Wang, Zhichang; Guo, Xiaozhen; Yu, Jie; Hu, Jun; Yang, Xingya; Chen, Changqing; Liu, Lu; Li, Xiuzhi; Purnomoadi, Agung; Achmadi, Joelal; Yan, Xianghua

    2016-03-17

    L-Arginine (Arg) is a versatile amino acid that plays crucial roles in a wide range of physiological and pathological processes. In this study, to investigate the alterations induced by Arg supplementation at the proteome scale, an isobaric tags for relative and absolute quantification (iTRAQ) based proteomic approach was employed to comparatively characterize the differentially expressed proteins between Arg deprivation (Ctrl) and Arg supplementation (+Arg) treated human liver hepatocellular carcinoma (HepG2) cells. A total of 21 proteins were identified as differentially expressed, and all 21 were up-regulated by Arg supplementation. Six amino acid metabolism-related proteins, mostly metabolic enzymes, showed differential expression. Intriguingly, Ingenuity Pathway Analysis (IPA) based pathway analysis suggested that the three ethanol degradation pathways were significantly altered between Ctrl and +Arg. Western blotting and enzymatic activity assays validated that the key enzymes ADH1C, ALDH1A1, and ALDH2, which are mainly involved in ethanol degradation pathways, were highly differentially expressed and activated between Ctrl and +Arg in HepG2 cells. Furthermore, 10 mM Arg significantly attenuated the cytotoxicity induced by 100 mM ethanol treatment (P < 0.0001). This study is the first to reveal that Arg activates ethanol degradation pathways in HepG2 cells.

  6. MetReS, an Efficient Database for Genomic Applications.

    PubMed

    Vilaplana, Jordi; Alves, Rui; Solsona, Francesc; Mateo, Jordi; Teixidó, Ivan; Pifarré, Marc

    2018-02-01

    MetReS (Metabolic Reconstruction Server) is a genomic database that is shared between two software applications that address important biological problems. Biblio-MetReS is a data-mining tool that enables the reconstruction of molecular networks based on automated text-mining analysis of published scientific literature. Homol-MetReS allows functional (re)annotation of proteomes, to properly identify both the individual proteins involved in the processes of interest and their function. The main goal of this work was to identify the areas where the performance of the MetReS database could be improved and to test whether this improvement would scale to larger datasets and more complex types of analysis. The study started with a relational database, MySQL, which is the current database server used by the applications. We also tested the performance of an alternative data-handling framework, Apache Hadoop, which is currently used for large-scale data processing. We found that this data-handling framework is likely to greatly improve the efficiency of the MetReS applications as the dataset and the processing needs increase by several orders of magnitude, as is expected to happen in the near future.

  7. A Large-Scale Quantitative Proteomic Approach to Identifying Sulfur Mustard-Induced Protein Phosphorylation Cascades

    DTIC Science & Technology

    2010-01-01

    snapshot of SM-induced toxicity. Over the past few years, innovations in systems biology and biotechnology have led to important advances in our under...perturbations. SILAC has been used to study tumor metastasis (3, 4), focal adhesion-associated proteins, growth factor signaling, and insulin regulation (5...stained with colloidal Coomassie blue. After it was destained, the gel lane was excised into six regions, and each region was cut into 1 mm cubes

  8. Proteome screening of pleural effusions identifies IL1A as a diagnostic biomarker for non-small cell lung cancer.

    PubMed

    Li, Yuanyuan; Lian, Hengning; Jia, Qingzhu; Wan, Ying

    2015-02-06

    Non-small cell lung cancer (NSCLC) is a common malignant disease, and in ~10-20% of patients, pleural effusion is the first symptom. The pleural effusion proteome contains information on pulmonary disease that directly or indirectly reflects pathophysiological status. However, the proteome of pleural effusion in NSCLC patients is not well understood, nor is the variability in protein composition between malignant and benign pleural effusions. Here, we investigated the different proteins in pleural effusions from NSCLC and tuberculosis (TB) patients by using nano-scale liquid chromatography-tandem mass spectrometry (nLC-MS/MS) analysis. In total, 363 proteins were identified in the NSCLC pleural effusion proteome with a low false discovery rate (<1%), and 199 proteins were unique to NSCLC. The proteins in the NSCLC patients' pleural effusion were involved in cell adhesion, proteolysis, and cell migration. Furthermore, interleukin 1 alpha (IL1A), a protein that regulates tumor growth, angiogenesis, and metastasis, was significantly more abundant in the NSCLC group compared to the TB group, a finding that was validated with an ELISA assay. Copyright © 2014 Elsevier Inc. All rights reserved.

  9. CPTAC Releases Largest-Ever Breast Cancer Proteome Dataset from Previously Genome Characterized Tumors | Office of Cancer Clinical Proteomics Research

    Cancer.gov

    National Cancer Institute (NCI) Clinical Proteomic Tumor Analysis Consortium (CPTAC) scientists have released a dataset of proteins and phosphopeptides identified through deep proteomic and phosphoproteomic analysis of breast tumor samples, previously genomically analyzed by The Cancer Genome Atlas (TCGA).

  10. Impact of cryopreservation on bull semen proteome.

    PubMed

    Westfalewicz, B; Dietrich, M A; Ciereszko, A

    2015-11-01

    Cryopreservation of bull spermatozoa is a well-established technique, allowing artificial insemination of cattle on a commercial scale. However, the extent of proteome changes in seminal plasma and spermatozoa during cryopreservation is not yet fully known. The objective of this study was to compare the proteomes of fresh, equilibrated, and cryopreserved bull semen (spermatozoa and seminal plasma) to establish the changes in semen proteins during the cryopreservation process. Semen was collected from 6 mature Holstein Friesian bulls. After sample processing, comparative analysis and identification of proteins was performed using 2-dimensional difference in-gel electrophoresis coupled with matrix-assisted laser desorption/ionization mass spectrometry. Analysis of spermatozoa extracts revealed that 25 identified protein spots, representing 16 proteins, underwent significant (P < 0.05) changes in abundance due to equilibration and cryopreservation. Eighteen protein spots decreased in abundance, 5 protein spots increased in abundance, and 2 protein spots showed different, specific patterns of abundance changes. Analysis of seminal fluid containing seminal plasma showed that 6 identified protein spots, representing 4 proteins, underwent significant (P < 0.05) changes in abundance due to equilibration and cryopreservation. Two protein spots increased in abundance and 4 decreased in abundance. Semen extending and equilibration seem to be responsible for a significant portion of the proteome changes related to cryopreservation technology. Most sperm proteins affected by equilibration and cryopreservation are membrane bound, and loss of those proteins may reduce natural spermatozoa coating. Further research is needed to unravel the mechanisms of the particular protein changes described in this study and establish the relationship between those changes and sperm quality.

  11. CPTAC Releases Largest-Ever Ovarian Cancer Proteome Dataset from Previously Genome Characterized Tumors | Office of Cancer Clinical Proteomics Research

    Cancer.gov

    National Cancer Institute (NCI) Clinical Proteomic Tumor Analysis Consortium (CPTAC) scientists have just released a comprehensive dataset of the proteomic analysis of high grade serous ovarian tumor samples, previously genomically analyzed by The Cancer Genome Atlas (TCGA). This is one of the largest public datasets covering the proteome, phosphoproteome and glycoproteome with complementary deep genomic sequencing data on the same tumor.

  12. Neural Stem Cells (NSCs) and Proteomics.

    PubMed

    Shoemaker, Lorelei D; Kornblum, Harley I

    2016-02-01

    Neural stem cells (NSCs) can self-renew and give rise to the major cell types of the CNS. Studies of NSCs include the investigation of primary, CNS-derived cells as well as animal and human embryonic stem cell (ESC)-derived and induced pluripotent stem cell (iPSC)-derived sources. NSCs provide a means with which to study normal neural development, neurodegeneration, and neurological disease and are clinically relevant sources for cellular repair to the damaged and diseased CNS. Proteomics studies of NSCs have the potential to delineate molecules and pathways critical for NSC biology and the means by which NSCs can participate in neural repair. In this review, we provide a background to NSC biology, including the means to obtain them and the caveats to these processes. We then focus on advances in the proteomic interrogation of NSCs. This includes the analysis of posttranslational modifications (PTMs); approaches to analyzing different proteomic compartments, such as the secretome; as well as approaches to analyzing temporal differences in the proteome to elucidate mechanisms of differentiation. We also discuss some of the methods that will undoubtedly be useful in the investigation of NSCs but which have not yet been applied to the field. While many proteomics studies of NSCs have largely catalogued the proteome or posttranslational modifications of specific cellular states, without delving into specific functions, some have led to understandings of functional processes or identified markers that could not have been identified via other means. Many challenges remain in the field, including the precise identification and standardization of NSCs used for proteomic analyses, as well as how to translate fundamental proteomics studies to functional biology. The next level of investigation will require interdisciplinary approaches, combining the skills of those interested in the biochemistry of proteomics with those interested in modulating NSC function.
© 2016 by The American Society for Biochemistry and Molecular Biology, Inc.

  13. Software Analysis of Uncorrelated MS1 Peaks for Discovery of Post-Translational Modifications.

    PubMed

    Pascal, Bruce D; West, Graham M; Scharager-Tapia, Catherina; Flefil, Ricardo; Moroni, Tina; Martinez-Acedo, Pablo; Griffin, Patrick R; Carvalloza, Anthony C

    2015-12-01

    The goal in proteomics to identify all peptides in a complex mixture has been largely addressed using various LC MS/MS approaches, such as data dependent acquisition, SRM/MRM, and data independent acquisition instrumentation. Despite these developments, many peptides remain unsequenced, often due to low abundance, poor fragmentation patterns, or data analysis difficulties. Many of the unidentified peptides exhibit strong evidence in high resolution MS(1) data and are frequently post-translationally modified, playing a significant role in biological processes. Proteomics Workbench (PWB) software was developed to automate the detection and visualization of all possible peptides in MS(1) data, reveal candidate peptides not initially identified, and build inclusion lists for subsequent MS(2) analysis to uncover new identifications. We used this software on existing data on the autophagy regulating kinase Ulk1 as a proof of concept for this method, as we had already manually identified a number of phosphorylation sites (Dorsey, F. C. et al., J. Proteome Res. 8(11), 5253-5263 (2009)). PWB found all previously identified sites of phosphorylation. The software has been made freely available at http://www.proteomicsworkbench.com.

  14. A novel proteomics sample preparation method for secretome analysis of Hypocrea jecorina growing on insoluble substrates.

    PubMed

    Bengtsson, Oskar; Arntzen, Magnus Ø; Mathiesen, Geir; Skaugen, Morten; Eijsink, Vincent G H

    2016-01-10

    Analysis of the secretomes of filamentous fungi growing on insoluble lignocellulosic substrates is of major current interest because of the industrial potential of secreted fungal enzymes. Importantly, such studies can help identify key enzymes from a large arsenal of bioinformatically detected candidates in fungal genomes. We describe a simple, plate-based method to analyze the secretome of Hypocrea jecorina growing on insoluble substrates that allows harsh sample preparation methods promoting desorption, and subsequent identification, of substrate-bound proteins, while minimizing contamination with non-secreted proteins from leaking or lysed cells. The validity of the method was demonstrated by comparative secretome analysis of wild-type H. jecorina strain QM6a growing on bagasse, birch wood, spruce wood or pure cellulose, using label-free quantification. The proteomic data thus obtained were consistent with existing data from transcriptomics and proteomics studies and revealed clear differences in the responses to complex lignocellulosic substrates and the response to pure cellulose. This easy method is likely to be generally applicable to filamentous fungi and to other microorganisms growing on insoluble substrates. Copyright © 2015 Elsevier B.V. All rights reserved.

  15. [Progress in omics research of Aspergillus niger].

    PubMed

    Sui, Yufei; Ouyang, Liming; Lu, Hongzhong; Zhuang, Yingping; Zhang, Siliang

    2016-08-25

    Aspergillus niger, an important industrial fermentation strain, is widely applied in the production of organic acids and industrial enzymes. With the development of diverse omics technologies, genome, transcriptome, proteome and metabolome data for A. niger are increasing continuously, signaling the arrival of the big-data era for research on the A. niger fermentation process. Data analysis ranging from single omics and multi-omics comparison to multi-omics integration based on the genome-scale metabolic network model has greatly extended the in-depth and systematic understanding of the efficient production mechanism of A. niger. It also makes possible the rational global optimization of strain performance by genetic modification and process regulation. We reviewed and summarized progress in omics research of A. niger, and proposed directions for the development of omics research on this cell factory.

  16. Evaluation of selected binding domains for the analysis of ubiquitinated proteomes

    PubMed Central

    Nakayasu, Ernesto S.; Ansong, Charles; Brown, Joseph N.; Yang, Feng; Lopez-Ferrer, Daniel; Qian, Wei-Jun; Smith, Richard D.; Adkins, Joshua N.

    2013-01-01

    Ubiquitination is an abundant post-translational modification that consists of covalent attachment of ubiquitin to lysine residues or the N-terminus of proteins. Mono and polyubiquitination have been shown to be involved in many critical eukaryotic cellular functions and are often disrupted by intracellular bacterial pathogens. Affinity enrichment of ubiquitinated proteins enables global analysis of this key modification. In this context, the use of ubiquitin-binding domains is a promising, but relatively unexplored alternative to more broadly used immunoaffinity or tagged affinity enrichment methods. In this study, we evaluated the application of eight ubiquitin-binding domains that have differing affinities for ubiquitination states. Small-scale proteomics analysis identified ∼200 ubiquitinated protein candidates per ubiquitin-binding domain pull-down experiment. Results from subsequent Western blot analyses that employed anti-ubiquitin or monoclonal antibodies against polyubiquitination at lysine 48 and 63 suggest that ubiquitin-binding domains from Dsk2 and ubiquilin-1 have the broadest specificity in that they captured most types of ubiquitination, whereas the binding domain from NBR1 was more selective to polyubiquitination. These data demonstrate that with optimized purification conditions, ubiquitin-binding domains can be an alternative tool for proteomic applications. This approach is especially promising for the analysis of tissues or cells resistant to transfection, for which the overexpression of tagged ubiquitin is a major hurdle. PMID:23649778

  17. Quantitative proteomics reveals the central changes of wheat in response to powdery mildew.

    PubMed

    Fu, Ying; Zhang, Hong; Mandal, Siddikun Nabi; Wang, Changyou; Chen, Chunhuan; Ji, Wanquan

    2016-01-01

    Powdery mildew (Pm), caused by Blumeria graminis f. sp. tritici (Bgt), is one of the most important crop diseases, causing severe economic losses to wheat production worldwide. However, there are few reports about the proteomic response to Bgt infection in resistant wheat. Hence, quantitative proteomic analysis of N9134, a resistant wheat line, was performed to explore the molecular mechanism of wheat in defense against Bgt. Comparing the leaf proteins of Bgt-inoculated N9134 with those of mock-inoculated controls, a total of 2182 protein-species were quantified by iTRAQ at 24, 48 and 72 h postinoculation (hpi) with Bgt, of which 394 showed differential accumulation. These differentially accumulated protein-species (DAPs) mainly included pathogenesis-related (PR) polypeptides, oxidative stress responsive proteins and components involved in primary metabolic pathways. KEGG enrichment analysis showed that phenylpropanoid biosynthesis, phenylalanine metabolism and photosynthesis-antenna proteins were the key pathways in response to Bgt infection. InterProScan 5 and the Gibbs Motif Sampler clustered the 394 DAPs into eight conserved motifs, which shared leucine repeats and histidine sites in the sequence motifs. Moreover, eight separate protein-protein interaction (PPI) networks were predicted from the STRING database. This study provides a powerful platform for further exploration of the molecular mechanism underlying resistant wheat responding to Bgt. Powdery mildew, caused by Blumeria graminis f. sp. tritici (Bgt), is a destructive pathogenic disease in wheat-producing regions worldwide, resulting in severe yield reductions. Although many resistant wheat varieties have been cultivated, there are few reports about the proteomic response to Bgt infection in resistant wheat. Therefore, an iTRAQ-based quantitative proteomic analysis of a resistant wheat line (N9134) in response to Bgt infection has been performed.
This paper provides new insights into the underlying molecular mechanism of wheat in response to Bgt. The proteomic analysis can significantly narrow the field of potential defense-related protein-species, and helps to identify the critical or effector proteins under Bgt infection more precisely. Taken together, large amounts of high-throughput data provide a powerful platform for further exploration of the molecular mechanism of wheat-Bgt interactions. Copyright © 2015 Elsevier B.V. All rights reserved.

  18. PeptideDepot: Flexible Relational Database for Visual Analysis of Quantitative Proteomic Data and Integration of Existing Protein Information

    PubMed Central

    Yu, Kebing; Salomon, Arthur R.

    2010-01-01

    Recently, dramatic progress has been achieved in expanding the sensitivity, resolution, mass accuracy, and scan rate of mass spectrometers able to fragment and identify peptides through tandem mass spectrometry (MS/MS). Unfortunately, this enhanced ability to acquire proteomic data has not been accompanied by a concomitant increase in the availability of flexible tools allowing users to rapidly assimilate, explore, and analyze this data and adapt to a variety of experimental workflows with minimal user intervention. Here we fill this critical gap by providing a flexible relational database called PeptideDepot for organization of expansive proteomic data sets, collation of proteomic data with available protein information resources, and visual comparison of multiple quantitative proteomic experiments. Our software design, built upon the synergistic combination of a MySQL database for safe warehousing of proteomic data with a FileMaker-driven graphical user interface for flexible adaptation to diverse workflows, enables proteomic end-users to directly tailor the presentation of proteomic data to the unique analysis requirements of the individual proteomics lab. PeptideDepot may be deployed as an independent software tool or integrated directly with our High Throughput Autonomous Proteomic Pipeline (HTAPP) used in the automated acquisition and post-acquisition analysis of proteomic data. PMID:19834895

  19. Proteomic profiling of isogenic primary and metastatic medulloblastoma cell lines reveals differential expression of key metastatic factors.

    PubMed

    Gu, Shuo; Chen, Kai; Yin, Minzhi; Wu, Zhixiang; Wu, Yeming

    2017-05-08

    Medulloblastoma is the most common malignant brain tumor in children. Around 30% of medulloblastoma patients are diagnosed with metastasis, which often results in a poor prognosis. Unfortunately, molecular mechanisms of medulloblastoma metastasis remain largely unknown. In this study, we employed the recently developed deep proteome analysis approach to quantitatively profile the expression of >10,000 proteins from CHLA-01-MED and CHLA-01R-MED isogenic cell lines derived from the primary and metastatic tumor of the same patient diagnosed with a group IV medulloblastoma. Using statistical analysis, we identified ~1400 significantly altered proteins between the primary and metastatic cell lines including known factors such as placental growth factor (PLGF), LIM homeobox 1 (LHX1) and prominin 1 (PROM1), as well as the negative regulator secreted protein acidic and cysteine rich (SPARC). Additional transwell experiments and immunohistochemical analysis of clinical medulloblastoma samples implicated yes-associated protein 1 (YAP1) as a potential key factor contributing to metastasis. Taken together, our data broadly define the metastasis-relevant regulated proteome and provide a valuable resource for further investigating potential mechanisms of medulloblastoma metastasis. This study represented the first deep proteome analysis of metastatic medulloblastomas and provided a valuable candidate list of altered proteins in metastatic medulloblastomas. The primary data suggested YAP1 as a potential driver for the metastasis of medulloblastoma. These results open up numerous avenues for further investigating the underlying mechanisms of medulloblastoma metastasis and improving the prognosis of medulloblastoma patients. Copyright © 2017 Elsevier B.V. All rights reserved.

  20. Proteomic analysis of laser-captured paraffin-embedded tissues: a molecular portrait of head and neck cancer progression.

    PubMed

    Patel, Vyomesh; Hood, Brian L; Molinolo, Alfredo A; Lee, Norman H; Conrads, Thomas P; Braisted, John C; Krizman, David B; Veenstra, Timothy D; Gutkind, J Silvio

    2008-02-15

    Squamous cell carcinoma of the head and neck (HNSCC), the sixth most prevalent cancer among men worldwide, is associated with poor prognosis, which has improved only marginally over the past three decades. A proteomic analysis of HNSCC lesions may help identify novel molecular targets for the early detection, prevention, and treatment of HNSCC. Laser capture microdissection was combined with recently developed techniques for protein extraction from formalin-fixed paraffin-embedded (FFPE) tissues and a novel proteomics platform. Approximately 20,000 cells procured from FFPE tissue sections of normal oral epithelium and well, moderately, and poorly differentiated HNSCC were processed for mass spectrometry and bioinformatic analysis. A large number of proteins expressed in normal oral epithelium and HNSCC, including cytokeratins, intermediate filaments, differentiation markers, and proteins involved in stem cell maintenance, signal transduction, migration, cell cycle regulation, growth and angiogenesis, matrix degradation, and proteins with tumor suppressive and oncogenic potential, were readily detected. Of interest, the relative expression of many of these molecules followed a distinct pattern in normal squamous epithelia and well, moderately, and poorly differentiated HNSCC tumor tissues. Representative proteins were further validated using immunohistochemical studies in HNSCC tissue sections and tissue microarrays. The ability to combine laser capture microdissection and in-depth proteomic analysis of FFPE tissues provided a wealth of information regarding the nature of the proteins expressed in normal squamous epithelium and during HNSCC progression, which may allow the development of novel biomarkers of diagnostic and prognostic value and the identification of novel targets for therapeutic intervention in HNSCC.

  1. Building high-quality assay libraries for targeted analysis of SWATH MS data.

    PubMed

    Schubert, Olga T; Gillet, Ludovic C; Collins, Ben C; Navarro, Pedro; Rosenberger, George; Wolski, Witold E; Lam, Henry; Amodei, Dario; Mallick, Parag; MacLean, Brendan; Aebersold, Ruedi

    2015-03-01

    Targeted proteomics by selected/multiple reaction monitoring (S/MRM) or, on a larger scale, by SWATH (sequential window acquisition of all theoretical spectra) MS (mass spectrometry) typically relies on spectral reference libraries for peptide identification. Quality and coverage of these libraries are therefore of crucial importance for the performance of the methods. Here we present a detailed protocol that has been successfully used to build high-quality, extensive reference libraries supporting targeted proteomics by SWATH MS. We describe each step of the process, including data acquisition by discovery proteomics, assertion of peptide-spectrum matches (PSMs), generation of consensus spectra and compilation of MS coordinates that uniquely define each targeted peptide. Crucial steps such as false discovery rate (FDR) control, retention time normalization and handling of post-translationally modified peptides are detailed. Finally, we show how to use the library to extract SWATH data with the open-source software Skyline. The protocol takes 2-3 d to complete, depending on the extent of the library and the computational resources available.
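
    The retention time normalization step mentioned above can be illustrated with a minimal sketch. It assumes a simple linear mapping from measured retention times onto a run-independent scale, fitted on a few reference peptides; the abstract does not specify the model used in the protocol, and the peptide names and all numbers below are made-up illustration values.

```python
# Sketch: linear retention-time (RT) normalization against reference peptides.
# Assumption: a straight-line fit is adequate; names and values are hypothetical.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Reference peptide -> (measured RT in this run, in minutes; assigned normalized RT).
reference = {
    "REFPEP_A": (10.0, 0.0),
    "REFPEP_B": (20.0, 25.0),
    "REFPEP_C": (30.0, 50.0),
    "REFPEP_D": (50.0, 100.0),
}

measured = [m for m, _ in reference.values()]
assigned = [a for _, a in reference.values()]
slope, intercept = fit_line(measured, assigned)

def to_normalized_rt(rt_minutes):
    """Map a measured RT from this run onto the normalized scale."""
    return slope * rt_minutes + intercept
```

    Once fitted, the same mapping can be applied to every peptide identified in the run, so that library coordinates from different runs become comparable.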

  2. Integration analysis of quantitative proteomics and transcriptomics data identifies potential targets of frizzled-8 protein-related antiproliferative factor in vivo.

    PubMed

    Yang, Wei; Kim, Yongsoo; Kim, Taek-Kyun; Keay, Susan K; Kim, Kwang Pyo; Steen, Hanno; Freeman, Michael R; Hwang, Daehee; Kim, Jayoung

    2012-12-01

    What's known on the subject? and What does the study add? Interstitial cystitis (IC) is a prevalent and debilitating pelvic disorder generally accompanied by chronic pain combined with chronic urinating problems. Over one million Americans are affected, especially middle-aged women. However, its aetiology or mechanism remains unclear. No efficient drug has been provided to patients. Several urinary biomarker candidates have been identified for IC; among the most promising is antiproliferative factor (APF), whose biological activity is detectable in urine specimens from >94% of patients with both ulcerative and non-ulcerative IC. The present study identified several important mediators of the effect of APF on bladder cell physiology, suggesting several candidate drug targets against IC. In an attempt to identify potential proteins and genes regulated by APF in vivo, and to possibly expand the APF-regulated network identified by stable isotope labelling by amino acids in cell culture (SILAC), we performed an integration analysis of our own SILAC data and the microarray data of Gamper et al. (2009) BMC Genomics 10: 199. Notably, two of the proteins (i.e. MAPKSP1 and GSPT1) that are down-regulated by APF are involved in the activation of mTORC1, suggesting that the mammalian target of rapamycin (mTOR) pathway is potentially a critical pathway regulated by APF in vivo. Several components of the mTOR pathway are currently being studied as potential therapeutic targets in other diseases. Our analysis suggests that this pathway might also be relevant in the design of diagnostic tools and medications targeting IC.
    • To enhance our understanding of the interstitial cystitis urine biomarker antiproliferative factor (APF), as well as interstitial cystitis biology more generally at the systems level, we reanalyzed recently published large-scale quantitative proteomics and in vivo transcriptomics data sets using an integration analysis tool that we have developed.
    • To identify more differentially expressed genes with a lower false discovery rate from a previously published microarray data set, an integrative hypothesis-testing statistical approach was applied.
    • For validation experiments, expression and phosphorylation levels of select proteins were evaluated by western blotting.
    • Integration analysis of this transcriptomics data set with our own quantitative proteomics data set identified 10 genes that are potentially regulated by APF in vivo from 4140 differentially expressed genes identified with a false discovery rate of 1%.
    • Of these, five (i.e. JUP, MAPKSP1, GSPT1, PTGS2/COX-2 and XPOT) were found to be prominent after network modelling of the common genes identified in the proteomics and microarray studies.
    • This molecular signature reflects the biological processes of cell adhesion, cell proliferation and inflammation, which is consistent with the known physiological effects of APF.
    • Lastly, we found the mammalian target of rapamycin pathway was down-regulated in response to APF.
    • This unbiased integration analysis of in vitro quantitative proteomics data with in vivo quantitative transcriptomics data led to the identification of potential downstream mediators of the APF signal transduction pathway.
    © 2012 THE AUTHORS. BJU INTERNATIONAL © 2012 BJU INTERNATIONAL.

  3. SpirPro: A Spirulina proteome database and web-based tools for the analysis of protein-protein interactions at the metabolic level in Spirulina (Arthrospira) platensis C1.

    PubMed

    Senachak, Jittisak; Cheevadhanarak, Supapon; Hongsthong, Apiradee

    2015-07-29

    Spirulina (Arthrospira) platensis is the only cyanobacterium that, in addition to being studied at the molecular level and subjected to gene manipulation, can also be mass cultivated in outdoor ponds for commercial use as a food supplement. Thus, encountering environmental changes, including temperature stresses, is common during the mass production of Spirulina. The use of cyanobacteria as an experimental platform, especially for photosynthetic gene manipulation in plants and bacteria, is becoming increasingly important. Understanding the mechanisms and protein-protein interaction networks that underlie low- and high-temperature responses is relevant to Spirulina mass production. To accomplish this goal, high-throughput techniques such as OMICs analyses are used. Thus, large datasets must be collected, managed and subjected to information extraction. Therefore, databases including (i) proteomic analysis and protein-protein interaction (PPI) data and (ii) domain/motif visualization tools are required for potential use in temperature response models for plant chloroplasts and photosynthetic bacteria. A web-based repository was developed including an embedded database, SpirPro, and tools for network visualization. Proteome data were analyzed and integrated with protein-protein interactions and/or metabolic pathways from KEGG. The repository provides various information, ranging from raw data (2D-gel images) to associated results, such as data from interaction and/or pathway analyses. This integration allows in silico analyses of protein-protein interactions affected at the metabolic level and, particularly, analyses of interactions between and within the affected metabolic pathways under temperature stresses for comparative proteomic analysis. The developed tool, which is coded in HTML with CSS/JavaScript and depicted in Scalable Vector Graphics (SVG), is designed for interactive analysis and exploration of the constructed network. SpirPro is publicly available on the web at http://spirpro.sbi.kmutt.ac.th. SpirPro is an analysis platform containing an integrated proteome and PPI database that provides the most comprehensive data on this cyanobacterium at the systematic level. As an integrated database, SpirPro can be applied in various analyses, such as temperature stress response networking analysis in cyanobacterial models and interacting domain-domain analysis between proteins of interest.

  4. A Computational Tool to Detect and Avoid Redundancy in Selected Reaction Monitoring

    PubMed Central

    Röst, Hannes; Malmström, Lars; Aebersold, Ruedi

    2012-01-01

    Selected reaction monitoring (SRM), also called multiple reaction monitoring, has become an invaluable tool for targeted quantitative proteomic analyses, but its application can be compromised by nonoptimal selection of transitions. In particular, complex backgrounds may cause ambiguities in SRM measurement results because peptides with interfering transitions similar to those of the target peptide may be present in the sample. Here, we developed a computer program, the SRMCollider, that calculates nonredundant theoretical SRM assays, also known as unique ion signatures (UIS), for a given proteomic background. We show theoretically that UIS of three transitions suffice to conclusively identify 90% of all yeast peptides and 85% of all human peptides. Using predicted retention times, the SRMCollider also simulates time-scheduled SRM acquisition, which reduces the number of interferences to consider and leads to fewer transitions necessary to construct an assay. By integrating experimental fragment ion intensities from large scale proteome synthesis efforts (SRMAtlas) with the information content-based UIS, we combine two orthogonal approaches to create high quality SRM assays ready to be deployed. We provide a user friendly, open source implementation of an algorithm to calculate UIS of any order that can be accessed online at http://www.srmcollider.org to find interfering transitions. Finally, our tool can also simulate the specificity of novel data-independent MS acquisition methods in Q1–Q3 space. This allows us to predict parameters for these methods that deliver a specificity comparable with that of SRM. Using SRM interference information in addition to other sources of information can increase the confidence in an SRM measurement. We expect that the consideration of information content will become a standard step in SRM assay design and analysis, facilitated by the SRMCollider. PMID:22535207
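
    The idea of a unique ion signature (UIS) described above can be sketched in a few lines. This is an illustrative toy, not the SRMCollider implementation: it assumes a set of target transitions is a UIS when no single background peptide inside the Q1 window interferes with every transition in the set. The function name, tolerances and m/z values are hypothetical.

```python
# Sketch: enumerate unique ion signatures (UIS) of a given order.
# Assumptions (not from the source): fixed absolute m/z tolerances, and a
# peptide represented as (precursor_mz, [fragment_mz, ...]).
from itertools import combinations

def unique_ion_signatures(target, background, order, q1_tol=0.7, q3_tol=0.7):
    target_q1, target_q3s = target
    # For each background peptide in the Q1 window, record which target
    # transitions (by index) it could be mistaken for.
    shared = []
    for bg_q1, bg_q3s in background:
        if abs(bg_q1 - target_q1) > q1_tol:
            continue  # outside the precursor isolation window, cannot interfere
        hits = {i for i, q3 in enumerate(target_q3s)
                if any(abs(q3 - b) <= q3_tol for b in bg_q3s)}
        shared.append(hits)
    # A combination is a UIS unless one background peptide covers all of it.
    signatures = []
    for combo in combinations(range(len(target_q3s)), order):
        if not any(set(combo) <= hits for hits in shared):
            signatures.append(tuple(target_q3s[i] for i in combo))
    return signatures

target = (500.0, [300.0, 400.0, 600.0])
background = [
    (500.3, [300.2, 400.1, 750.0]),  # interferes with the 300 and 400 fragments
    (500.5, [600.3, 820.0]),         # interferes with the 600 fragment
    (620.0, [300.0, 400.0, 600.0]),  # identical fragments, but outside Q1 window
]
uis_order1 = unique_ion_signatures(target, background, order=1)
uis_order2 = unique_ion_signatures(target, background, order=2)
```

    In this toy background no single transition is unique, but two of the three pairs are, which mirrors the abstract's point that small sets of transitions can identify a peptide even when individual transitions cannot.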

  5. New Funding Opportunity Announcements (FOAs): Reissuance of Clinical Proteomic Tumor Analysis Consortium (CPTAC) | Office of Cancer Clinical Proteomics Research

    Cancer.gov

    The National Cancer Institute is soliciting applications for the reissuance of its Clinical Proteomic Tumor Analysis Consortium (CPTAC) program. CPTAC will support broad efforts focused on several cancer types to explore further the complexities of cancer proteomes and their connections to abnormalities in cancer genomes.

  6. An effective protein extraction method for two-dimensional electrophoresis in the anticancer herb Andrographis paniculata Nees.

    PubMed

    Talei, Daryush; Valdiani, Alireza; Puad, Mohd Abdullah

    2013-01-01

    Proteomic analysis of plants relies on high yields of pure protein. In plants, protein extraction and purification present a great challenge due to accumulation of a large amount of interfering substances, including polysaccharides, polyphenols, and secondary metabolites. Therefore, it is necessary to modify the extraction protocols. A study was conducted to compare four protein extraction and precipitation methods for proteomic analysis. The results showed significant differences in protein content among the four methods. The chloroform-trichloroacetic acid-acetone method using 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES) buffer provided the best results in terms of protein content, pellets, spot resolution, and intensity of unique spots detected. Overall, 83 qualitatively or quantitatively significant differential spots were found among the four methods. Based on the 2-DE gel map, the method is expected to benefit the development of high-level proteomic and biochemical studies of Andrographis paniculata, and may also be applied to other recalcitrant medicinal plant tissues. © 2013 International Union of Biochemistry and Molecular Biology, Inc.

  7. Proteomic Analysis of Laser Microdissected Melanoma Cells from Skin Organ Cultures

    PubMed Central

    Hood, Brian L.; Grahovac, Jelena; Flint, Melanie S.; Sun, Mai; Charro, Nuno; Becker, Dorothea; Wells, Alan; Conrads, Thomas P

    2010-01-01

    Gaining insights into the molecular events that govern the progression from melanoma in situ to advanced melanoma, and understanding how the local microenvironment at the melanoma site influences this progression, are two clinically pivotal aspects that to date are largely unexplored. In an effort to identify key regulators of the crosstalk between melanoma cells and the melanoma-skin microenvironment, primary and metastatic human melanoma cells were seeded into skin organ cultures (SOCs), and grown for two weeks. Melanoma cells were recovered from SOCs by laser microdissection and whole-cell tryptic digests analyzed by nanoflow liquid chromatography-tandem mass spectrometry with an LTQ-Orbitrap. The differential protein abundances were calculated by spectral counting, the results of which provide evidence that cell-matrix and cell-adhesion molecules that are upregulated in the presence of these melanoma cells recapitulate proteomic data obtained from comparative analysis of human biopsies of invasive melanoma and a tissue sample of adjacent, non-involved skin. This concordance demonstrates the value of SOCs for conducting proteomic investigations of the melanoma microenvironment. PMID:20459140
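
    Spectral counting, as mentioned above, estimates relative protein abundance from the number of MS/MS spectra assigned to each protein. The sketch below shows one common, simple variant: counts are normalized to the sample total and compared as log2 ratios, with a pseudocount to handle proteins missing from one sample. The abstract does not describe the authors' exact calculation, so the function names, the pseudocount and the example counts are assumptions.

```python
# Sketch: relative abundance from spectral counts (illustrative only).
import math

def normalized_counts(counts):
    """Scale spectral counts so they sum to 1 within a sample."""
    total = sum(counts.values())
    return {protein: c / total for protein, c in counts.items()}

def log2_ratios(sample_a, sample_b, pseudocount=0.5):
    """log2(A/B) of normalized spectral counts for every protein seen in either sample."""
    proteins = set(sample_a) | set(sample_b)
    a = normalized_counts({p: sample_a.get(p, 0) + pseudocount for p in proteins})
    b = normalized_counts({p: sample_b.get(p, 0) + pseudocount for p in proteins})
    return {p: math.log2(a[p] / b[p]) for p in proteins}

# Hypothetical counts for two samples; PROT_3 was not observed in the second one.
counts_a = {"PROT_1": 10, "PROT_2": 30, "PROT_3": 4}
counts_b = {"PROT_1": 20, "PROT_2": 20}
ratios = log2_ratios(counts_a, counts_b)
```

    A positive ratio indicates more spectra, and by proxy higher abundance, in the first sample. Published analyses typically add a statistical test on top of such ratios.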

  8. The Perseus computational platform for comprehensive analysis of (prote)omics data.

    PubMed

    Tyanova, Stefka; Temu, Tikira; Sinitcyn, Pavel; Carlson, Arthur; Hein, Marco Y; Geiger, Tamar; Mann, Matthias; Cox, Jürgen

    2016-09-01

    A main bottleneck in proteomics is the downstream biological analysis of highly multivariate quantitative protein abundance data generated using mass-spectrometry-based analysis. We developed the Perseus software platform (http://www.perseus-framework.org) to support biological and biomedical researchers in interpreting protein quantification, interaction and post-translational modification data. Perseus contains a comprehensive portfolio of statistical tools for high-dimensional omics data analysis covering normalization, pattern recognition, time-series analysis, cross-omics comparisons and multiple-hypothesis testing. A machine learning module supports the classification and validation of patient groups for diagnosis and prognosis, and it also detects predictive protein signatures. Central to Perseus is a user-friendly, interactive workflow environment that provides complete documentation of computational methods used in a publication. All activities in Perseus are realized as plugins, and users can extend the software by programming their own, which can be shared through a plugin store. We anticipate that Perseus's arsenal of algorithms and its intuitive usability will empower interdisciplinary analysis of complex large data sets.
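
    As a concrete instance of the multiple-hypothesis testing mentioned above, the following sketch computes Benjamini-Hochberg adjusted p-values. It is a generic textbook procedure written for illustration, not code taken from Perseus, and the example p-values are made up.

```python
# Sketch: Benjamini-Hochberg adjustment of p-values (generic, not Perseus code).

def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-protein p-values from some differential test.
example_p = [0.01, 0.04, 0.03, 0.20]
example_q = benjamini_hochberg(example_p)
```

    Proteins whose adjusted value falls below a chosen threshold (for example 0.05) would be reported as significant at that false discovery rate.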

  9. Computational clustering for viral reference proteomes

    PubMed Central

    Chen, Chuming; Huang, Hongzhan; Mazumder, Raja; Natale, Darren A.; McGarvey, Peter B.; Zhang, Jian; Polson, Shawn W.; Wang, Yuqi; Wu, Cathy H.

    2016-01-01

    Motivation: The enormous number of redundant sequenced genomes has hindered efforts to analyze and functionally annotate proteins. As the taxonomy of viruses is not uniformly defined, viral proteomes pose special challenges in this regard. Grouping viruses based on the similarity of their proteins at proteome scale can normalize against potential taxonomic nomenclature anomalies. Results: We present Viral Reference Proteomes (Viral RPs), which are computed from complete virus proteomes within UniProtKB. Viral RPs based on 95, 75, 55, 35 and 15% co-membership in proteome similarity based clusters are provided. Comparison of our computational Viral RPs with UniProt’s curator-selected Reference Proteomes indicates that the two sets are consistent and complementary. Furthermore, each Viral RP represents a cluster of virus proteomes that was consistent with virus or host taxonomy. We provide BLASTP search and FTP download of Viral RP protein sequences, and a browser to facilitate the visualization of Viral RPs. Availability and implementation: http://proteininformationresource.org/rps/viruses/ Contact: chenc@udel.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153712

  10. Proteomic Workflows for Biomarker Identification Using Mass Spectrometry — Technical and Statistical Considerations during Initial Discovery

    PubMed Central

    Orton, Dennis J.; Doucette, Alan A.

    2013-01-01

    Identification of biomarkers capable of differentiating between pathophysiological states of an individual is a laudable goal in the field of proteomics. Protein biomarker discovery generally employs high throughput sample characterization by mass spectrometry (MS), being capable of identifying and quantifying thousands of proteins per sample. While MS-based technologies have rapidly matured, the identification of truly informative biomarkers remains elusive, with only a handful of clinically applicable tests stemming from proteomic workflows. This underlying lack of progress is attributed in large part to erroneous experimental design, biased sample handling, as well as improper statistical analysis of the resulting data. This review will discuss in detail the importance of experimental design and provide some insight into the overall workflow required for biomarker identification experiments. Proper balance between the degree of biological vs. technical replication is required for confident biomarker identification. PMID:28250400

  11. Proteomic analysis of the response to cell cycle arrests in human myeloid leukemia cells.

    PubMed

    Ly, Tony; Endo, Aki; Lamond, Angus I

    2015-01-02

    Previously, we analyzed protein abundance changes across a 'minimally perturbed' cell cycle by using centrifugal elutriation to differentially enrich distinct cell cycle phases in human NB4 cells (Ly et al., 2014). In this study, we compare data from elutriated cells with NB4 cells arrested at comparable phases using serum starvation, hydroxyurea, or RO-3306. While elutriated and arrested cells have similar patterns of DNA content and cyclin expression, a large fraction of the proteome changes detected in arrested cells are found to reflect arrest-specific responses (i.e., starvation, DNA damage, CDK1 inhibition), rather than physiological cell cycle regulation. For example, we show most cells arrested in G2 by CDK1 inhibition express abnormally high levels of replication and origin licensing factors and are likely poised for genome re-replication. The protein data are available in the Encyclopedia of Proteome Dynamics (

  12. Comparative Proteome Analysis of the Tuberous Roots of Six Cassava (Manihot esculenta) Varieties Reveals Proteins Related to Phenotypic Traits.

    PubMed

    Schmitz, Gabriela Justamante Händel; de Magalhães Andrade, Jonathan; Valle, Teresa Losada; Labate, Carlos Alberto; do Nascimento, João Roberto Oliveira

    2016-04-27

    Cassava (Manihot esculenta Crantz) is a staple food and an important source of starch, and the attributes of its tuberous root largely depend on the variety. The proteome of cassava has been investigated; however, to date, no study has focused on varieties that reveal the molecular basis of phenotypical characteristics. Therefore, we aimed to compare the proteome of the tuberous roots of six cassava varieties that differed in carbohydrates, carotenoids, and resistance to diseases, among other attributes. Two-dimensional gels showed 146 differential spots between the varieties, and the functional roles of some differential proteins were correlated to phenotypic characteristics of the varieties, such as the amount of carbohydrates or carotenoids and the resistance to biotic or abiotic stresses. The results obtained here highlight elements that might help to direct the improvement of new cultivars of cassava, which is an economically and socially relevant crop worldwide.

  13. A Combination of Histological, Physiological, and Proteomic Approaches Shed Light on Seed Desiccation Tolerance of the Basal Angiosperm Amborella trichopoda.

    PubMed

    Villegente, Matthieu; Marmey, Philippe; Job, Claudette; Galland, Marc; Cueff, Gwendal; Godin, Béatrice; Rajjou, Loïc; Balliau, Thierry; Zivy, Michel; Fogliani, Bruno; Sarramegna-Burtet, Valérie; Job, Dominique

    2017-07-28

    Desiccation tolerance allows plant seeds to remain viable in a dry state for years and even centuries. To reveal potential evolutionary processes of this trait, we have conducted a shotgun proteomic analysis of isolated embryo and endosperm from mature seeds of Amborella trichopoda, an understory shrub endemic to New Caledonia that is considered to be the basal extant angiosperm. The present analysis led to the characterization of 415 and 69 proteins from the isolated embryo and endosperm tissues, respectively. The role of these proteins is discussed in terms of protein evolution and physiological properties of the rudimentary, underdeveloped, Amborella embryos, notably considering that the acquisition of desiccation tolerance corresponds to the final developmental stage of mature seeds possessing large embryos.

  14. A Combination of Histological, Physiological, and Proteomic Approaches Shed Light on Seed Desiccation Tolerance of the Basal Angiosperm Amborella trichopoda

    PubMed Central

    Villegente, Matthieu; Marmey, Philippe; Job, Claudette; Galland, Marc; Cueff, Gwendal; Godin, Béatrice; Rajjou, Loïc; Balliau, Thierry; Zivy, Michel; Sarramegna-Burtet, Valérie; Job, Dominique

    2017-01-01

    Desiccation tolerance allows plant seeds to remain viable in a dry state for years and even centuries. To reveal potential evolutionary processes of this trait, we have conducted a shotgun proteomic analysis of isolated embryo and endosperm from mature seeds of Amborella trichopoda, an understory shrub endemic to New Caledonia that is considered to be the basal extant angiosperm. The present analysis led to the characterization of 415 and 69 proteins from the isolated embryo and endosperm tissues, respectively. The role of these proteins is discussed in terms of protein evolution and physiological properties of the rudimentary, underdeveloped, Amborella embryos, notably considering that the acquisition of desiccation tolerance corresponds to the final developmental stage of mature seeds possessing large embryos. PMID:28788068

  15. Mining Missing Membrane Proteins by High-pH Reverse-Phase StageTip Fractionation and Multiple Reaction Monitoring Mass Spectrometry.

    PubMed

    Kitata, Reta Birhanu; Dimayacyac-Esleta, Baby Rorielyn T; Choong, Wai-Kok; Tsai, Chia-Feng; Lin, Tai-Du; Tsou, Chih-Chiang; Weng, Shao-Hsing; Chen, Yi-Ju; Yang, Pan-Chyr; Arco, Susan D; Nesvizhskii, Alexey I; Sung, Ting-Yi; Chen, Yu-Ju

    2015-09-04

    Despite significant efforts in the past decade toward complete mapping of the human proteome, 3564 proteins (neXtProt, 09-2014) are still "missing proteins". Over one-third of these missing proteins are annotated as membrane proteins, owing to their relatively challenging accessibility with standard shotgun proteomics. Using non-small cell lung cancer (NSCLC) as a model, we aim to mine missing proteins from the disease-associated membrane proteome, which may still be largely under-represented. To increase identification coverage, we employed Hp-RP StageTip prefractionation of membrane-enriched samples from 11 NSCLC cell lines. Analysis of membrane samples from 20 pairs of tumor and adjacent normal lung tissue was incorporated to include physiologically expressed membrane proteins. Using multiple search engines (X!Tandem, Comet, and Mascot) and stringent evaluation of FDR (MAYU and PeptideShaker), we identified 7702 proteins (66% membrane proteins) and 178 missing proteins (74 membrane proteins) with PSM-, peptide-, and protein-level FDR of 1%. Through multiple reaction monitoring using synthetic peptides, we provided additional evidence of eight missing proteins including seven with transmembrane helix domains. This study demonstrates that mining missing proteins with a focus on the cancer membrane subproteome can greatly contribute to mapping the whole human proteome. All data were deposited into ProteomeXchange with the identifier PXD002224.

  16. The Urine Proteome as a Biomarker of Radiation Injury: Submitted to Proteomics- Clinical Applications Special Issue: "Renal and Urinary Proteomics (Thongboonkerd)"

    PubMed

    Sharma, Mukut; Halligan, Brian D; Wakim, Bassam T; Savin, Virginia J; Cohen, Eric P; Moulder, John E

    2008-06-18

    Terrorist attacks or nuclear accidents could expose large numbers of people to ionizing radiation, and early biomarkers of radiation injury would be critical for triage, treatment and follow-up of such individuals. However, no such biomarkers have yet been proven to exist. We tested the potential of high throughput proteomics to identify protein biomarkers of radiation injury after total body X-ray irradiation (TBI) in a rat model. Subtle functional changes in the kidney are suggested by an increased glomerular permeability for macromolecules measured within 24 hours after TBI. Ultrastructural changes in glomerular podocytes include partial loss of the interdigitating organization of foot processes. Analysis of urine by LC-MS/MS and 2D-GE showed significant changes in the urine proteome within 24 hours after TBI. Tissue kallikrein 1-related peptidase, cysteine proteinase inhibitor cystatin C and oxidized histidine were found to be increased while a number of proteinase inhibitors including kallikrein-binding protein and albumin were found to be decreased post-irradiation. Thus, TBI causes immediately detectable changes in renal structure and function and in the urinary protein profile. This suggests that both systemic and renal changes are induced by radiation and it may be possible to identify a set of biomarkers unique to radiation injury.

  17. Proteome Analysis Unravels Mechanism Underling the Embryogenesis of the Honeybee Drone and Its Divergence with the Worker (Apis mellifera lingustica).

    PubMed

    Fang, Yu; Feng, Mao; Han, Bin; Qi, Yuping; Hu, Han; Fan, Pei; Huo, Xinmei; Meng, Lifeng; Li, Jianke

    2015-09-04

    The worker and drone bees have diploid and haploid genetic makeups, respectively. Mechanisms regulating the embryogenesis of the drone and its mechanistic difference with the worker are still poorly understood. The proteomes of the two embryos at three time-points throughout development were analyzed by applying mass spectrometry-based proteomics. We identified 2788 and 2840 proteins in the worker and drone embryos, respectively. The age-dependent proteome driving the drone embryogenesis generally follows the worker's. The two embryos, however, evolve a distinct proteome setting to prime their respective embryogenesis. Proteins and pathways related to transcriptional-translational machinery and morphogenesis are strongly expressed in the 24 h drone embryo relative to the worker, illustrating the earlier occurrence of morphogenesis in the drone than in the worker. These morphogenesis differences remain through to the middle-late stage in the two embryos. The two embryos employ distinct antioxidant mechanisms coinciding with the temporal-difference organogenesis. The drone embryo's strongly expressed cytoskeletal proteins signify key roles to match its large body size. The RNAi induced knockdown of the ribosomal protein offers evidence for the functional investigation of gene regulating of honeybee embryogenesis. The data significantly expand novel regulatory mechanisms governing the embryogenesis, which is potentially important for honeybee and other insects.

  18. A Description of the Clinical Proteomic Tumor Analysis Consortium (CPTAC) Common Data Analysis Pipeline

    PubMed Central

    Rudnick, Paul A.; Markey, Sanford P.; Roth, Jeri; Mirokhin, Yuri; Yan, Xinjian; Tchekhovskoi, Dmitrii V.; Edwards, Nathan J.; Thangudu, Ratna R.; Ketchum, Karen A.; Kinsinger, Christopher R.; Mesri, Mehdi; Rodriguez, Henry; Stein, Stephen E.

    2016-01-01

    The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has produced large proteomics datasets from the mass spectrometric interrogation of tumor samples previously analyzed by The Cancer Genome Atlas (TCGA) program. The availability of the genomic and proteomic data is enabling proteogenomic study for both reference (i.e., contained in major sequence databases) and non-reference markers of cancer. The CPTAC labs have focused on colon, breast, and ovarian tissues in the first round of analyses; spectra from these datasets were produced from 2D LC-MS/MS analyses and represent deep coverage. To reduce the variability introduced by disparate data analysis platforms (e.g., software packages, versions, parameters, sequence databases, etc.), the CPTAC Common Data Analysis Platform (CDAP) was created. The CDAP produces both peptide-spectrum-match (PSM) reports and gene-level reports. The pipeline processes raw mass spectrometry data according to the following: (1) Peak-picking and quantitative data extraction, (2) database searching, (3) gene-based protein parsimony, and (4) false discovery rate (FDR)-based filtering. The pipeline also produces localization scores for the phosphopeptide enrichment studies using the PhosphoRS program. Quantitative information for each of the datasets is specific to the sample processing, with PSM and protein reports containing the spectrum-level or gene-level (“rolled-up”) precursor peak areas and spectral counts for label-free or reporter ion log-ratios for 4plex iTRAQ™. The reports are available in simple tab-delimited formats and, for the PSM-reports, in mzIdentML. The goal of the CDAP is to provide standard, uniform reports for all of the CPTAC data, enabling comparisons between different samples and cancer types as well as across the major ‘omics fields. PMID:26860878
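
    Step (4) of the pipeline described above, FDR-based filtering, can be illustrated with a minimal target-decoy sketch. This is not the CDAP implementation: the `Psm` record, the assumption that higher scores are better, and the simple decoys/targets estimate are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Psm:
    """Hypothetical peptide-spectrum match record (field names are assumptions)."""
    peptide: str
    score: float    # assumed: a higher score means a better match
    is_decoy: bool  # True if the match is to a decoy (e.g. reversed) sequence


def filter_by_fdr(psms: List[Psm], max_fdr: float = 0.01) -> List[Psm]:
    """Keep target PSMs down to the lowest-ranked position whose estimated FDR <= max_fdr.

    The FDR at each rank is estimated as (#decoys) / (#targets) among all
    PSMs ranked at or above that position. Score ties are not treated specially.
    """
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = 0
    decoys = 0
    keep = 0  # number of top-ranked PSMs to retain
    for rank, psm in enumerate(ranked, start=1):
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            keep = rank
    # Decoys inside the accepted range are dropped from the report.
    return [p for p in ranked[:keep] if not p.is_decoy]
```

    The same idea applies at the peptide or protein level by ranking the corresponding aggregated records instead of PSMs.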

  19. A Description of the Clinical Proteomic Tumor Analysis Consortium (CPTAC) Common Data Analysis Pipeline.

    PubMed

    Rudnick, Paul A; Markey, Sanford P; Roth, Jeri; Mirokhin, Yuri; Yan, Xinjian; Tchekhovskoi, Dmitrii V; Edwards, Nathan J; Thangudu, Ratna R; Ketchum, Karen A; Kinsinger, Christopher R; Mesri, Mehdi; Rodriguez, Henry; Stein, Stephen E

    2016-03-04

    The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has produced large proteomics data sets from the mass spectrometric interrogation of tumor samples previously analyzed by The Cancer Genome Atlas (TCGA) program. The availability of the genomic and proteomic data is enabling proteogenomic study for both reference (i.e., contained in major sequence databases) and nonreference markers of cancer. The CPTAC laboratories have focused on colon, breast, and ovarian tissues in the first round of analyses; spectra from these data sets were produced from 2D liquid chromatography-tandem mass spectrometry analyses and represent deep coverage. To reduce the variability introduced by disparate data analysis platforms (e.g., software packages, versions, parameters, sequence databases, etc.), the CPTAC Common Data Analysis Platform (CDAP) was created. The CDAP produces both peptide-spectrum-match (PSM) reports and gene-level reports. The pipeline processes raw mass spectrometry data according to the following: (1) peak-picking and quantitative data extraction, (2) database searching, (3) gene-based protein parsimony, and (4) false-discovery rate-based filtering. The pipeline also produces localization scores for the phosphopeptide enrichment studies using the PhosphoRS program. Quantitative information for each of the data sets is specific to the sample processing, with PSM and protein reports containing the spectrum-level or gene-level ("rolled-up") precursor peak areas and spectral counts for label-free or reporter ion log-ratios for 4plex iTRAQ. The reports are available in simple tab-delimited formats and, for the PSM-reports, in mzIdentML. The goal of the CDAP is to provide standard, uniform reports for all of the CPTAC data to enable comparisons between different samples and cancer types as well as across the major omics fields.
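
    The gene-level ("rolled-up") spectral counts and precursor peak areas mentioned above can be sketched as a simple aggregation. The input layout (one gene per PSM row) and the choice to sum peak areas are assumptions for illustration; the abstract does not specify how CDAP handles shared peptides or combines areas.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def roll_up_to_genes(
    psms: Iterable[Tuple[str, float]],
) -> Dict[str, Dict[str, float]]:
    """Aggregate (gene, precursor_peak_area) PSM rows to gene level.

    Returns, per gene, the spectral count (number of PSMs) and the summed
    precursor peak area. Parsimony logic for peptides shared between genes
    is not modeled here.
    """
    summary: Dict[str, Dict[str, float]] = defaultdict(
        lambda: {"spectral_count": 0, "peak_area": 0.0}
    )
    for gene, area in psms:
        summary[gene]["spectral_count"] += 1
        summary[gene]["peak_area"] += area
    return dict(summary)
```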

  20. pyQms enables universal and accurate quantification of mass spectrometry data.

    PubMed

    Leufken, Johannes; Niehues, Anna; Sarin, L Peter; Wessel, Florian; Hippler, Michael; Leidel, Sebastian A; Fufezan, Christian

    2017-10-01

    Quantitative mass spectrometry (MS) is a key technique in many research areas, including proteomics, metabolomics, glycomics, and lipidomics. Because all of the corresponding molecules can be described by chemical formulas, universal quantification tools are highly desirable. Here, we present pyQms, an open-source software for accurate quantification of all types of molecules measurable by MS. pyQms uses isotope pattern matching that offers an accurate quality assessment of all quantifications and the ability to directly incorporate mass spectrometer accuracy. pyQms is, due to its universal design, applicable to every research field, labeling strategy, and acquisition technique. This opens ultimate flexibility for researchers to design experiments employing innovative and hitherto unexplored labeling strategies. Importantly, pyQms performs very well to accurately quantify partially labeled proteomes in large scale and high throughput, the most challenging task for a quantification algorithm. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
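
    The general idea of isotope pattern matching from a chemical formula can be sketched as follows. This does not use or reproduce the pyQms API or its scoring: the rounded isotope abundances, the nominal-mass binning, and the cosine score are assumptions chosen for a compact illustration.

```python
import math
from typing import Dict, List

# Approximate natural isotope abundances, indexed by nominal mass offset
# from the lightest isotope (rounded values, for illustration only).
ISOTOPES: Dict[str, List[float]] = {
    "C": [0.9893, 0.0107],
    "H": [0.999885, 0.000115],
    "N": [0.99636, 0.00364],
    "O": [0.99757, 0.00038, 0.00205],
}


def convolve(a: List[float], b: List[float]) -> List[float]:
    """Discrete convolution of two abundance distributions."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def isotope_pattern(formula: Dict[str, int], n_peaks: int = 5) -> List[float]:
    """Relative abundances of the first n_peaks nominal-mass isotopologue peaks."""
    pattern = [1.0]
    for element, count in formula.items():
        for _ in range(count):
            # Truncating after each step leaves the first n_peaks values exact.
            pattern = convolve(pattern, ISOTOPES[element])[:n_peaks]
    return pattern + [0.0] * (n_peaks - len(pattern))


def cosine_similarity(theoretical: List[float], observed: List[float]) -> float:
    """Score how well observed intensities match the theoretical pattern."""
    dot = sum(t * o for t, o in zip(theoretical, observed))
    norm = math.sqrt(sum(t * t for t in theoretical)) * math.sqrt(
        sum(o * o for o in observed)
    )
    return dot / norm if norm else 0.0
```

    A labeled variant of a molecule could be modeled in this sketch by swapping in a different abundance list for the labeled element, which is one way to see why a formula-based approach generalizes across labeling strategies.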

  1. Serial isoelectric focusing as an effective and economic way to obtain maximal resolution and high-throughput in 2D-based comparative proteomics of scarce samples: proof-of-principle.

    PubMed

    Farhoud, Murtada H; Wessels, Hans J C T; Wevers, Ron A; van Engelen, Baziel G; van den Heuvel, Lambert P; Smeitink, Jan A

    2005-01-01

    In 2D-based comparative proteomics of scarce samples, such as limited patient material, established methods for prefractionation and subsequent use of different narrow range IPG strips to increase overall resolution are difficult to apply. Also, a high number of samples, a prerequisite for drawing meaningful conclusions when pathological and control samples are considered, will increase the associated amount of work almost exponentially. Here, we introduce a novel, effective, and economic method designed to obtain maximum 2D resolution while maintaining the high throughput necessary to perform large-scale comparative proteomics studies. The method is based on connecting different IPG strips serially head-to-tail so that a complete line of different IPG strips with sequential pH regions can be focused in the same experiment. We show that when 3 IPG strips (covering together the pH range of 3-11) are connected head-to-tail an optimal resolution is achieved along the whole pH range. Sample consumption, time required, and associated costs are reduced by almost 70%, and the workload is reduced significantly.

  2. Proteomic analysis of the renal effects of simulated occupational jet fuel exposure.

    PubMed

    Witzmann, F A; Bauer, M D; Fieno, A M; Grant, R A; Keough, T W; Lacey, M P; Sun, Y; Witten, M L; Young, R S

    2000-03-01

    We analyzed protein expression in the cytosolic fraction prepared from whole kidneys in male Swiss-Webster mice exposed 1 h/day for five days to aerosolized JP-8 jet fuel at a concentration of 1000 mg/m3, simulating military occupational exposure. Kidney cytosol samples were solubilized and separated via large-scale, high-resolution two-dimensional electrophoresis (2-DE) and gel patterns scanned, digitized and processed for statistical analysis. Significant changes in soluble kidney proteins resulted from jet fuel exposure. Several of the altered proteins were identified by peptide mass finger-printing and related to ultrastructural abnormalities, altered protein processing, metabolic effects, and paradoxical stress protein/detoxification system responses. These results demonstrate a significant but comparatively moderate JP-8 effect on protein expression in the kidney and provide novel molecular evidence of JP-8 nephrotoxicity. Human risk is suggested by these data but conclusive assessment awaits a noninvasive search for biomarkers in JP-8 exposed humans.

  3. Multiplexed and Microparticle-based Analyses: Quantitative Tools for the Large-Scale Analysis of Biological Systems

    PubMed Central

    Nolan, John P.; Mandy, Francis

    2008-01-01

    While the term flow cytometry refers to the measurement of cells, the approach of making sensitive multiparameter optical measurements in a flowing sample stream is a very general analytical approach. The past few years have seen an explosion in the application of flow cytometry technology for molecular analysis and measurements using micro-particles as solid supports. While microsphere-based molecular analyses using flow cytometry date back three decades, the need for highly parallel quantitative molecular measurements that has arisen from various genomic and proteomic advances has driven the development in particle encoding technology to enable highly multiplexed assays. Multiplexed particle-based immunoassays are now commonplace, as are new assays to study genes, protein function, and molecular assembly. Numerous efforts are underway to extend the multiplexing capabilities of microparticle-based assays through new approaches to particle encoding and analyte reporting. The impact of these developments will be seen in the basic research and clinical laboratories, as well as in drug development. PMID:16604537

  4. Chemical composition and the potential for proteomic transformation in cancer, hypoxia, and hyperosmotic stress

    PubMed Central

    2017-01-01

    The changes of protein expression that are monitored in proteomic experiments are a type of biological transformation that also involves changes in chemical composition. Accompanying the myriad molecular-level interactions that underlie any proteomic transformation, there is an overall thermodynamic potential that is sensitive to microenvironmental conditions, including local oxidation and hydration potential. Here, up- and down-expressed proteins identified in 71 comparative proteomics studies were analyzed using the average oxidation state of carbon ($Z_\mathrm{C}$) and water demand per residue ($\overline{n}_{\mathrm{H_2O}}$), calculated using elemental abundances and stoichiometric reactions to form proteins from basis species. Experimental lowering of oxygen availability (hypoxia) or water activity (hyperosmotic stress) generally results in decreased $Z_\mathrm{C}$ or $\overline{n}_{\mathrm{H_2O}}$ of up-expressed compared to down-expressed proteins. This correspondence of chemical composition with experimental conditions provides evidence for attraction of the proteomes to a low-energy state. An opposite compositional change, toward higher average oxidation or hydration state, is found for proteomic transformations in colorectal and pancreatic cancer, and in two experiments for adipose-derived stem cells. 
Calculations of chemical affinity were used to estimate the thermodynamic potentials for proteomic transformations as a function of fugacity of O2 and activity of H2O, which serve as scales of oxidation and hydration potential. Diagrams summarizing the relative potential for formation of up- and down-expressed proteins have predicted equipotential lines that cluster around particular values of oxygen fugacity and water activity for similar datasets. The changes in chemical composition of proteomes are likely linked with reactions among other cellular molecules. A redox balance calculation indicates that an increase in the lipid to protein ratio in cancer cells by 20% over hypoxic cells would generate a large enough electron sink for oxidation of the cancer proteomes. The datasets and computer code used here are made available in a new R package, canprot. PMID:28603672
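
    The average oxidation state of carbon used above can be computed directly from an elemental formula. The sketch below assumes the usual definition for a species with c carbon, h hydrogen, n nitrogen, o oxygen and s sulfur atoms and net charge z, namely (-h + 3n + 2o + 2s + z) / c; it is an illustration only and is not taken from the canprot package, whose interface is not described in the abstract.

```python
from typing import Dict


def average_carbon_oxidation_state(formula: Dict[str, int], charge: int = 0) -> float:
    """Average oxidation state of carbon for a C/H/N/O/S formula.

    Assumes H is +1, N is -3, O is -2 and S is -2, so that the carbon atoms
    carry whatever oxidation state balances the net charge.
    """
    c = formula.get("C", 0)
    if c == 0:
        raise ValueError("formula must contain at least one carbon atom")
    h = formula.get("H", 0)
    n = formula.get("N", 0)
    o = formula.get("O", 0)
    s = formula.get("S", 0)
    return (-h + 3 * n + 2 * o + 2 * s + charge) / c
```

    For a whole protein, the same function can be applied to the summed elemental formula of its residues; methane and carbon dioxide give the two extremes of -4 and +4.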

  5. Imaging and Molecular Markers for Patients with Lung Cancer: Approaches with Molecular Targets, Complementary/Innovative Treatment, and Therapeutic Modalities

    DTIC Science & Technology

    2011-02-01

    Thrombocytopenia, Grade 3 in 1 patient • Hypomagnesemia, Grade 3 in 1 patient • Hypokalemia, Grade 3 in 2 patient • Pneumonia , Grade 3 in 7 patients...urgently needed. While the molecular events involved in lung cancer pathogenesis are being unraveled by ongoing large scale genomics, proteomics, and...tumor initiation, progression and metastasis are an important first step leading to the development of new prognostic markers and targets for therapy

  6. Cloud Computing for Protein-Ligand Binding Site Comparison

    PubMed Central

    2013-01-01

    The proteome-wide analysis of protein-ligand binding sites and their interactions with ligands is important in structure-based drug design and in understanding ligand cross reactivity and toxicity. The well-known and commonly used software, SMAP, has been designed for 3D ligand binding site comparison and similarity searching of a structural proteome. SMAP can also predict drug side effects and reassign existing drugs to new indications. However, the computing scale of SMAP is limited. We have developed a high availability, high performance system that expands the comparison scale of SMAP. This cloud computing service, called Cloud-PLBS, combines the SMAP and Hadoop frameworks and is deployed on a virtual cloud computing platform. To handle the vast amount of experimental data on protein-ligand binding site pairs, Cloud-PLBS exploits the MapReduce paradigm as a management and parallelizing tool. Cloud-PLBS provides a web portal and scalability through which biologists can address a wide range of computer-intensive questions in biology and drug discovery. PMID:23762824

  7. Cloud computing for protein-ligand binding site comparison.

    PubMed

    Hung, Che-Lun; Hua, Guan-Jie

    2013-01-01

    The proteome-wide analysis of protein-ligand binding sites and their interactions with ligands is important in structure-based drug design and in understanding ligand cross reactivity and toxicity. The well-known and commonly used software, SMAP, has been designed for 3D ligand binding site comparison and similarity searching of a structural proteome. SMAP can also predict drug side effects and reassign existing drugs to new indications. However, the computing scale of SMAP is limited. We have developed a high availability, high performance system that expands the comparison scale of SMAP. This cloud computing service, called Cloud-PLBS, combines the SMAP and Hadoop frameworks and is deployed on a virtual cloud computing platform. To handle the vast amount of experimental data on protein-ligand binding site pairs, Cloud-PLBS exploits the MapReduce paradigm as a management and parallelizing tool. Cloud-PLBS provides a web portal and scalability through which biologists can address a wide range of computer-intensive questions in biology and drug discovery.
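
    The MapReduce pattern that Cloud-PLBS is said to exploit for pairwise binding-site comparison can be sketched in plain Python. Nothing here is the Hadoop or SMAP API: the `similarity` function is a placeholder Jaccard score over hypothetical feature sets, and the map, shuffle and reduce phases run in a single process purely to show the data flow.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, FrozenSet, Iterator, List, Tuple

Site = Tuple[str, FrozenSet[str]]  # (site id, hypothetical feature set)


def similarity(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Placeholder Jaccard score standing in for a real binding-site comparison."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def map_pair(pair: Tuple[Site, Site]) -> Iterator[Tuple[str, Tuple[str, float]]]:
    """Map: compare one pair and emit the score keyed by each site id."""
    (id_a, feat_a), (id_b, feat_b) = pair
    score = similarity(feat_a, feat_b)
    yield id_a, (id_b, score)
    yield id_b, (id_a, score)


def reduce_best(values: List[Tuple[str, float]]) -> Tuple[str, float]:
    """Reduce: keep the best-scoring partner for one site."""
    return max(values, key=lambda v: v[1])


def best_matches(sites: List[Site]) -> Dict[str, Tuple[str, float]]:
    grouped: Dict[str, List[Tuple[str, float]]] = defaultdict(list)
    for pair in combinations(sites, 2):    # map phase (independent per pair)
        for key, value in map_pair(pair):
            grouped[key].append(value)     # shuffle: group values by key
    return {key: reduce_best(values) for key, values in grouped.items()}
```

    Because each pair is compared independently in the map phase, the work can be spread across nodes, which is the property that makes the paradigm attractive for expanding the comparison scale.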

  8. Proteomic analysis of broccoli (Brassica oleracea) under high temperature and waterlogging stresses.

    PubMed

    Lin, Hsin-Hung; Lin, Kuan-Hung; Chen, Su-Ching; Shen, Yu-Hsing; Lo, Hsiao-Feng

    2015-12-01

    The production of broccoli (Brassica oleracea) is largely reduced by waterlogging and high temperature stresses. Heat-tolerant and heat-susceptible broccoli cultivars TSS-AVRDC-2 and B-75, respectively, were used for physiological and proteomic analyses. The objective of this study was to identify TSS-AVRDC-2 and B-75 proteins differentially regulated at different time periods in response to waterlogging at 40 °C for three days. TSS-AVRDC-2 exhibited significantly higher chlorophyll content, lower stomatal conductance, and better H2O2 scavenging under stress in comparison to B-75. Two-dimensional liquid phase fractionation analyses revealed that Rubisco proteins in both varieties were regulated under stressing treatments, and that TSS-AVRDC-2 had higher levels of both Rubisco large and small subunit transcripts than B-75 when subjected to high temperature and/or waterlogging. This report utilizes physiological and proteomic approaches to discover changes in the protein expression profiles of broccoli in response to heat and waterlogging stresses. Higher levels of Rubisco proteins in TSS-AVRDC-2 could lead to increased carbon fixation efficiency to provide sufficient energy to enable stress tolerance under waterlogging at 40 °C.

  9. Large-scale identification of c-MYC-associated proteins using a combined TAP/MudPIT approach.

    PubMed

    Koch, Heike B; Zhang, Ru; Verdoodt, Berlinda; Bailey, Aaron; Zhang, Chang-Dong; Yates, John R; Menssen, Antje; Hermeking, Heiko

    2007-01-15

    The c-MYC oncogene encodes a transcription factor, which is sufficient and necessary for the induction of cellular proliferation. However, the c-MYC protein is a relatively weak transactivator suggesting that it may have other functions. To identify protein interactors which may reveal new functions or represent regulators of c-MYC we systematically identified proteins associated with c-MYC in vivo using a proteomic approach. We combined tandem affinity purification (TAP) with the mass spectral multidimensional protein identification technology (MudPIT). Thereby, 221 c-MYC-associated proteins were identified. Among them were 17 previously known c-MYC-interactors. Selected new c-MYC-associated proteins (DBC-1, FBX29, KU70, MCM7, Mi2-beta/CHD4, RNA Pol II, RFC2, RFC3, SV40 Large T Antigen, TCP1alpha, U5-116kD, ZNF281) were confirmed independently. For association with MCM7, SV40 Large T Antigen and DBC-1 the functionally important MYC-box II region was required, whereas FBX29 and Mi2-beta interacted via MYC-box II and the BR-HLH-LZ motif. In addition, regulators of c-MYC activity were identified: ectopic expression of FBX29, an E3 ubiquitin ligase, decreased c-MYC protein levels and inhibited c-MYC transactivation, whereas knock-down of FBX29 elevated the concentration of c-MYC. Furthermore, sucrose gradient analysis demonstrated that c-MYC is present in numerous complexes with varying size and composition, which may accommodate the large number of new c-MYC-associated proteins identified here and mediate the diverse functions of c-MYC. Our results suggest that c-MYC, besides acting as a mitogenic transcription factor, regulates cellular proliferation by direct association with protein complexes involved in multiple synthetic processes required for cell division, as for example DNA-replication/repair and RNA-processing. 
Furthermore, this first comprehensive description of the c-MYC-associated sub-proteome will facilitate further studies aimed to elucidate the biology of c-MYC.

  10. Proteomics profiling of interactome dynamics by colocalisation analysis (COLA). Electronic supplementary information (ESI) available; see DOI: 10.1039/c6mb00701e

    PubMed Central

    Sailem, Heba Z.; Kümper, Sandra; Tape, Christopher J.; McCully, Ryan R.; Paul, Angela; Anjomani-Virmouni, Sara; Jørgensen, Claus; Poulogiannis, George; Marshall, Christopher J.

    2017-01-01

    Localisation and protein function are intimately linked in eukaryotes, as proteins are localised to specific compartments where they come into proximity of other functionally relevant proteins. Significant co-localisation of two proteins can therefore be indicative of their functional association. We here present COLA, a proteomics based strategy coupled with a bioinformatics framework to detect protein–protein co-localisations on a global scale. COLA reveals functional interactions by matching proteins with significant similarity in their subcellular localisation signatures. The rapid nature of COLA allows mapping of interactome dynamics across different conditions or treatments with high precision. PMID:27824369
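
    Matching proteins by the similarity of their subcellular localisation signatures can be sketched as a pairwise correlation over fractionation profiles. The use of Pearson correlation, the fixed threshold, and the profile layout are assumptions for illustration; the abstract does not state which similarity measure or significance test COLA uses.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def colocalised_pairs(
    profiles: Dict[str, List[float]], min_r: float = 0.9
) -> List[Tuple[str, str, float]]:
    """Return protein pairs whose localisation profiles correlate at or above min_r.

    `profiles` maps a protein id to its abundance across subcellular
    fractions (hypothetical input layout); min_r is an arbitrary cutoff.
    """
    hits = []
    for (id_a, prof_a), (id_b, prof_b) in combinations(sorted(profiles.items()), 2):
        r = pearson(prof_a, prof_b)
        if r >= min_r:
            hits.append((id_a, id_b, r))
    return hits
```

    Running the same comparison on profiles from different conditions or treatments and comparing the resulting pair lists gives a rough picture of how such a strategy could track interactome changes.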

  11. Thermosensitivity of growth is determined by chaperone-mediated proteome reallocation

    PubMed Central

    Chen, Ke; Gao, Ye; Mih, Nathan; O’Brien, Edward J.; Yang, Laurence; Palsson, Bernhard O.

    2017-01-01

    Maintenance of a properly folded proteome is critical for bacterial survival at notably different growth temperatures. Understanding the molecular basis of thermoadaptation has progressed in two main directions, the sequence and structural basis of protein thermostability and the mechanistic principles of protein quality control assisted by chaperones. Yet we do not fully understand how structural integrity of the entire proteome is maintained under stress and how it affects cellular fitness. To address this challenge, we reconstruct a genome-scale protein-folding network for Escherichia coli and formulate a computational model, FoldME, that provides statistical descriptions of multiscale cellular response consistent with many datasets. FoldME simulations show (i) that the chaperones act as a system when they respond to unfolding stress rather than achieving efficient folding of any single component of the proteome, (ii) how the proteome is globally balanced between chaperones for folding and the complex machinery synthesizing the proteins in response to perturbation, (iii) how this balancing determines growth rate dependence on temperature and is achieved through nonspecific regulation, and (iv) how thermal instability of the individual protein affects the overall functional state of the proteome. Overall, these results expand our view of cellular regulation, from targeted specific control mechanisms to global regulation through a web of nonspecific competing interactions that modulate the optimal reallocation of cellular resources. The methodology developed in this study enables genome-scale integration of environment-dependent protein properties and a proteome-wide study of cellular stress responses. PMID:29073085

  12. Current Progress in Tonoplast Proteomics Reveals Insights into the Function of the Large Central Vacuole

    PubMed Central

    Trentmann, Oliver; Haferkamp, Ilka

    2013-01-01

    Vacuoles of plants fulfill various biologically important functions, like turgor generation and maintenance, detoxification, solute sequestration, or protein storage. Different types of plant vacuoles (lytic versus protein storage) are characterized by different functional properties apparently caused by a different composition/abundance and regulation of transport proteins in the surrounding membrane, the tonoplast. Proteome analyses allow the identification of vacuolar proteins and provide an informative basis for assigning observed transport processes to specific carriers or channels. This review summarizes techniques required for vacuolar proteome analyses, such as isolation of the large central vacuole or tonoplast membrane purification. Moreover, an overview of diverse published vacuolar proteome studies is provided. It becomes evident that qualitative proteomes from different plant species represent just the tip of the iceberg. During the past few years, mass spectrometry achieved immense improvement concerning its accuracy, sensitivity, and application. As a consequence, modern tonoplast proteome approaches are suited for detecting alterations in membrane protein abundance in response to changing environmental/physiological conditions and help to clarify the regulation of tonoplast transport processes. PMID:23459586

  13. Human body fluid proteome analysis

    PubMed Central

    Hu, Shen; Loo, Joseph A.; Wong, David T.

    2010-01-01

    The focus of this article is to review the recent advances in proteome analysis of human body fluids, including plasma/serum, urine, cerebrospinal fluid, saliva, bronchoalveolar lavage fluid, synovial fluid, nipple aspirate fluid, tear fluid, and amniotic fluid, as well as its applications to human disease biomarker discovery. We aim to summarize the proteomics technologies currently used for global identification and quantification of body fluid proteins, and elaborate the putative biomarkers discovered for a variety of human diseases through human body fluid proteome (HBFP) analysis. Some critical concerns and perspectives in this emerging field are also discussed. With the advances made in proteomics technologies, the impact of HBFP analysis in the search for clinically relevant disease biomarkers would be realized in the future. PMID:17083142

  14. Human body fluid proteome analysis.

    PubMed

    Hu, Shen; Loo, Joseph A; Wong, David T

    2006-12-01

    The focus of this article is to review the recent advances in proteome analysis of human body fluids, including plasma/serum, urine, cerebrospinal fluid, saliva, bronchoalveolar lavage fluid, synovial fluid, nipple aspirate fluid, tear fluid, and amniotic fluid, as well as its applications to human disease biomarker discovery. We aim to summarize the proteomics technologies currently used for global identification and quantification of body fluid proteins, and elaborate the putative biomarkers discovered for a variety of human diseases through human body fluid proteome (HBFP) analysis. Some critical concerns and perspectives in this emerging field are also discussed. With the advances made in proteomics technologies, the impact of HBFP analysis in the search for clinically relevant disease biomarkers would be realized in the future.

  15. Micro-proteomics with iterative data analysis: Proteome analysis in C. elegans at the single worm level.

    PubMed

    Bensaddek, Dalila; Narayan, Vikram; Nicolas, Armel; Murillo, Alejandro Brenes; Gartner, Anton; Kenyon, Cynthia J; Lamond, Angus I

    2016-02-01

    Proteomics studies typically analyze proteins at a population level, using extracts prepared from tens of thousands to millions of cells. The resulting measurements correspond to average values across the cell population and can mask considerable variation in protein expression and function between individual cells or organisms. Here, we report the development of micro-proteomics for the analysis of Caenorhabditis elegans, a eukaryote composed of 959 somatic cells and ∼1500 germ cells, measuring the worm proteome at a single organism level to a depth of ∼3000 proteins. This includes detection of proteins across a wide dynamic range of expression levels (>6 orders of magnitude), including many chromatin-associated factors involved in chromosome structure and gene regulation. We apply the micro-proteomics workflow to measure the global proteome response to heat-shock in individual nematodes. This shows variation between individual animals in the magnitude of proteome response following heat-shock, including variable induction of heat-shock proteins. The micro-proteomics pipeline thus facilitates the investigation of stochastic variation in protein expression between individuals within an isogenic population of C. elegans. All data described in this study are available online via the Encyclopedia of Proteome Dynamics (http://www.peptracker.com/epd), an open access, searchable database resource. © 2015 The Authors. PROTEOMICS Published by Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  16. SwissPalm: Protein Palmitoylation database.

    PubMed

    Blanc, Mathieu; David, Fabrice; Abrami, Laurence; Migliozzi, Daniel; Armand, Florence; Bürgi, Jérôme; van der Goot, Françoise Gisou

    2015-01-01

    Protein S-palmitoylation is a reversible post-translational modification that regulates many key biological processes, although the full extent and functions of protein S-palmitoylation remain largely unexplored. Recent developments of new chemical methods have allowed the establishment of palmitoyl-proteomes of a variety of cell lines and tissues from different species. As the amount of information generated by these high-throughput studies is increasing, the field requires centralization and comparison of this information. Here we present SwissPalm (http://swisspalm.epfl.ch), our open, comprehensive, manually curated resource to study protein S-palmitoylation. It currently encompasses more than 5000 S-palmitoylated protein hits from seven species, and contains more than 500 specific sites of S-palmitoylation. SwissPalm also provides curated information and filters that increase the confidence in true positive hits, and integrates predictions of S-palmitoylated cysteine scores, orthologs and isoform multiple alignments. Systems analysis of the palmitoyl-proteome screens indicates that 10% or more of the human proteome is susceptible to S-palmitoylation. Moreover, ontology and pathway analyses of the human palmitoyl-proteome reveal that key biological functions involve this reversible lipid modification. Comparative analysis finally shows a strong crosstalk between S-palmitoylation and other post-translational modifications. Through the compilation of data and continuous updates, SwissPalm will provide a powerful tool to unravel the global importance of protein S-palmitoylation.

  17. SwissPalm: Protein Palmitoylation database

    PubMed Central

    Abrami, Laurence; Migliozzi, Daniel; Armand, Florence; Bürgi, Jérôme; van der Goot, Françoise Gisou

    2015-01-01

    Protein S-palmitoylation is a reversible post-translational modification that regulates many key biological processes, although the full extent and functions of protein S-palmitoylation remain largely unexplored. Recent developments of new chemical methods have allowed the establishment of palmitoyl-proteomes of a variety of cell lines and tissues from different species. As the amount of information generated by these high-throughput studies is increasing, the field requires centralization and comparison of this information. Here we present SwissPalm (http://swisspalm.epfl.ch), our open, comprehensive, manually curated resource to study protein S-palmitoylation. It currently encompasses more than 5000 S-palmitoylated protein hits from seven species, and contains more than 500 specific sites of S-palmitoylation. SwissPalm also provides curated information and filters that increase the confidence in true positive hits, and integrates predictions of S-palmitoylated cysteine scores, orthologs and isoform multiple alignments. Systems analysis of the palmitoyl-proteome screens indicates that 10% or more of the human proteome is susceptible to S-palmitoylation. Moreover, ontology and pathway analyses of the human palmitoyl-proteome reveal that key biological functions involve this reversible lipid modification. Comparative analysis finally shows a strong crosstalk between S-palmitoylation and other post-translational modifications. Through the compilation of data and continuous updates, SwissPalm will provide a powerful tool to unravel the global importance of protein S-palmitoylation. PMID:26339475

  18. Difference gel electrophoresis (DiGE) identifies differentially expressed proteins in endoscopically-collected pancreatic fluid

    PubMed Central

    Paulo, Joao A.; Lee, Linda S.; Banks, Peter A.; Steen, Hanno; Conwell, Darwin L.

    2012-01-01

    Alterations in the pancreatic fluid proteome of individuals with chronic pancreatitis may offer insights into the development and progression of the disease. The endoscopic pancreas function test (ePFT) can safely collect large volumes of pancreatic fluid that are potentially amenable to proteomic analyses using difference gel electrophoresis (DiGE) coupled with liquid chromatography-tandem mass spectrometry (LC-MS/MS). Pancreatic fluid was collected endoscopically using the ePFT method following secretin stimulation from three individuals with severe chronic pancreatitis and three chronic abdominal pain controls. The fluid was processed to minimize protein degradation, and the protein profiles of each cohort, as determined by DiGE and LC-MS/MS, were compared. This DiGE-LC-MS/MS analysis reveals proteins that are differentially expressed in chronic pancreatitis compared to chronic abdominal pain controls. Proteins with higher abundance in pancreatic fluid from chronic pancreatitis individuals include actin, desmoplakin, alpha-1-antitrypsin, SNC73, and serotransferrin. Those of relatively lower abundance include carboxypeptidase B, lipase, alpha-1-antichymotrypsin, alpha-2-macroglobulin, Arp2/3 subunit 4, glyceraldehyde-3-phosphate dehydrogenase, and protein disulfide isomerase. Endoscopic collection (ePFT) in tandem with DiGE-LC-MS/MS is a suitable approach for pancreatic fluid proteome analysis; however, further optimization of our protocol, as outlined herein, may improve proteome coverage in future analyses. PMID:21792986

  19. MASPECTRAS: a platform for management and analysis of proteomics LC-MS/MS data

    PubMed Central

    Hartler, Jürgen; Thallinger, Gerhard G; Stocker, Gernot; Sturn, Alexander; Burkard, Thomas R; Körner, Erik; Rader, Robert; Schmidt, Andreas; Mechtler, Karl; Trajanoski, Zlatko

    2007-01-01

    Background: The advancements of proteomics technologies have led to a rapid increase in the number, size and rate at which datasets are generated. Managing and extracting valuable information from such datasets requires the use of data management platforms and computational approaches. Results: We have developed the MAss SPECTRometry Analysis System (MASPECTRAS), a platform for management and analysis of proteomics LC-MS/MS data. MASPECTRAS is based on the Proteome Experimental Data Repository (PEDRo) relational database schema and follows the guidelines of the Proteomics Standards Initiative (PSI). Analysis modules include: 1) import and parsing of the results from the search engines SEQUEST, Mascot, Spectrum Mill, X! Tandem, and OMSSA; 2) peptide validation; 3) clustering of proteins based on Markov Clustering and multiple alignments; and 4) quantification using the Automated Statistical Analysis of Protein Abundance Ratios algorithm (ASAPRatio). The system provides customizable data retrieval and visualization tools, as well as export to the PRoteomics IDEntifications public repository (PRIDE). MASPECTRAS is freely available. Conclusion: Given the unique features and the flexibility due to the use of standard software technology, our platform represents a significant advance and could be of great interest to the proteomics community. PMID:17567892

  20. Integrated Proteomic Approaches for Understanding Toxicity of Environmental Chemicals

    EPA Science Inventory

    To apply quantitative proteomic analysis to the evaluation of toxicity of environmental chemicals, we have developed an integrated proteomic technology platform. This platform has been applied to the analysis of the toxic effects and pathways of many important environmental chemi...
