DOE Office of Scientific and Technical Information (OSTI.GOV)
Gritsenko, Marina A.; Xu, Zhe; Liu, Tao
Comprehensive, quantitative information on abundances of proteins and their post-translational modifications (PTMs) can potentially provide novel biological insights into disease pathogenesis and therapeutic intervention. Herein, we introduce a quantitative strategy utilizing isobaric stable isotope-labelling techniques combined with two-dimensional liquid chromatography-tandem mass spectrometry (2D-LC-MS/MS) for large-scale, deep quantitative proteome profiling of biological samples or clinical specimens such as tumor tissues. The workflow includes isobaric labeling of tryptic peptides for multiplexed and accurate quantitative analysis, basic reversed-phase LC fractionation and concatenation for reduced sample complexity, and nano-LC coupled to high resolution and high mass accuracy MS analysis for high confidence identification and quantification of proteins. This proteomic analysis strategy has been successfully applied for in-depth quantitative proteomic analysis of tumor samples, and can also be used for integrated proteome and PTM characterization, as well as comprehensive quantitative proteomic analysis across samples from large clinical cohorts.
Gritsenko, Marina A; Xu, Zhe; Liu, Tao; Smith, Richard D
2016-01-01
Comprehensive, quantitative information on abundances of proteins and their posttranslational modifications (PTMs) can potentially provide novel biological insights into disease pathogenesis and therapeutic intervention. Herein, we introduce a quantitative strategy utilizing isobaric stable isotope-labeling techniques combined with two-dimensional liquid chromatography-tandem mass spectrometry (2D-LC-MS/MS) for large-scale, deep quantitative proteome profiling of biological samples or clinical specimens such as tumor tissues. The workflow includes isobaric labeling of tryptic peptides for multiplexed and accurate quantitative analysis, basic reversed-phase LC fractionation and concatenation for reduced sample complexity, and nano-LC coupled to high resolution and high mass accuracy MS analysis for high confidence identification and quantification of proteins. This proteomic analysis strategy has been successfully applied for in-depth quantitative proteomic analysis of tumor samples and can also be used for integrated proteome and PTM characterization, as well as comprehensive quantitative proteomic analysis across samples from large clinical cohorts.
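The fractionation-and-concatenation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used pooling scheme in which fractions collected at equal intervals across the gradient are combined, so each pool mixes early-, middle-, and late-eluting peptides. The function name and the fraction and pool counts are hypothetical; the abstract does not state the numbers used.

```python
def concatenate_fractions(n_fractions, n_pools):
    """Assign fractions 1..n_fractions to n_pools pools at equal intervals.

    Assumed scheme (not taken from the abstract): fraction f goes to
    pool (f - 1) % n_pools, so pool 1 of 24 receives fractions 1, 25, 49, ...
    """
    pools = [[] for _ in range(n_pools)]
    for fraction in range(1, n_fractions + 1):
        pools[(fraction - 1) % n_pools].append(fraction)
    return pools


if __name__ == "__main__":
    # Hypothetical example: 96 collected fractions combined into 24 pools.
    for index, pool in enumerate(concatenate_fractions(96, 24), start=1):
        print(f"pool {index}: fractions {pool}")
```

Pooling non-adjacent fractions in this way reduces the number of LC-MS/MS runs while keeping each pool spread across the first-dimension separation.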
Processing Shotgun Proteomics Data on the Amazon Cloud with the Trans-Proteomic Pipeline*
Slagel, Joseph; Mendoza, Luis; Shteynberg, David; Deutsch, Eric W.; Moritz, Robert L.
2015-01-01
Cloud computing, where scalable, on-demand compute cycles and storage are available as a service, has the potential to accelerate mass spectrometry-based proteomics research by providing simple, expandable, and affordable large-scale computing to all laboratories regardless of location or information technology expertise. We present new cloud computing functionality for the Trans-Proteomic Pipeline, a free and open-source suite of tools for the processing and analysis of tandem mass spectrometry datasets. Enabled with Amazon Web Services cloud computing, the Trans-Proteomic Pipeline now accesses large scale computing resources, limited only by the available Amazon Web Services infrastructure, for all users. The Trans-Proteomic Pipeline runs in an environment fully hosted on Amazon Web Services, where all software and data reside on cloud resources to tackle large search studies. In addition, it can also be run on a local computer with computationally intensive tasks launched onto the Amazon Elastic Compute Cloud service to greatly decrease analysis times. We describe the new Trans-Proteomic Pipeline cloud service components, compare the relative performance and costs of various Elastic Compute Cloud service instance types, and present on-line tutorials that enable users to learn how to deploy cloud computing technology rapidly with the Trans-Proteomic Pipeline. We provide tools for estimating the necessary computing resources and costs given the scale of a job and demonstrate the use of cloud-enabled Trans-Proteomic Pipeline by processing over 1100 tandem mass spectrometry files through four proteomic search engines in 9 h and at a very low cost. PMID:25418363
Processing shotgun proteomics data on the Amazon cloud with the trans-proteomic pipeline.
Slagel, Joseph; Mendoza, Luis; Shteynberg, David; Deutsch, Eric W; Moritz, Robert L
2015-02-01
Cloud computing, where scalable, on-demand compute cycles and storage are available as a service, has the potential to accelerate mass spectrometry-based proteomics research by providing simple, expandable, and affordable large-scale computing to all laboratories regardless of location or information technology expertise. We present new cloud computing functionality for the Trans-Proteomic Pipeline, a free and open-source suite of tools for the processing and analysis of tandem mass spectrometry datasets. Enabled with Amazon Web Services cloud computing, the Trans-Proteomic Pipeline now accesses large scale computing resources, limited only by the available Amazon Web Services infrastructure, for all users. The Trans-Proteomic Pipeline runs in an environment fully hosted on Amazon Web Services, where all software and data reside on cloud resources to tackle large search studies. In addition, it can also be run on a local computer with computationally intensive tasks launched onto the Amazon Elastic Compute Cloud service to greatly decrease analysis times. We describe the new Trans-Proteomic Pipeline cloud service components, compare the relative performance and costs of various Elastic Compute Cloud service instance types, and present on-line tutorials that enable users to learn how to deploy cloud computing technology rapidly with the Trans-Proteomic Pipeline. We provide tools for estimating the necessary computing resources and costs given the scale of a job and demonstrate the use of cloud-enabled Trans-Proteomic Pipeline by processing over 1100 tandem mass spectrometry files through four proteomic search engines in 9 h and at a very low cost. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
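The idea of estimating computing resources and cost from the scale of a job can be sketched with simple arithmetic. This is not the Trans-Proteomic Pipeline's estimator; the function, its parameters, the per-started-hour billing model, and all numbers in the example are assumptions made for illustration.

```python
import math


def estimate_cloud_job(n_files, n_engines, minutes_per_search,
                       n_instances, hourly_rate_usd):
    """Return (wall_clock_hours, cost_usd) for a batch of database searches.

    Assumptions (hypothetical, not from the abstract): every search takes
    the same time, searches are spread evenly over identical instances,
    and each instance is billed per started hour.
    """
    total_searches = n_files * n_engines
    # Each instance works through its share of the searches sequentially.
    searches_per_instance = math.ceil(total_searches / n_instances)
    wall_clock_hours = searches_per_instance * minutes_per_search / 60.0
    billed_instance_hours = math.ceil(wall_clock_hours) * n_instances
    return wall_clock_hours, billed_instance_hours * hourly_rate_usd


if __name__ == "__main__":
    # Hypothetical inputs: 200 files, 2 engines, 5 min per search,
    # 10 instances at an assumed 0.10 USD per instance-hour.
    hours, cost = estimate_cloud_job(200, 2, 5.0, 10, 0.10)
    print(f"about {hours:.1f} h wall-clock, about {cost:.2f} USD")
```

Varying `n_instances` in such a model shows the usual trade-off: more instances shorten the wall-clock time, while rounding up to whole billed hours can raise the total cost.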
CPTAC | Office of Cancer Clinical Proteomics Research
The National Cancer Institute’s Clinical Proteomic Tumor Analysis Consortium (CPTAC) is a national effort to accelerate the understanding of the molecular basis of cancer through the application of large-scale proteome and genome analysis, or proteogenomics.
Applications of Proteomic Technologies to Toxicology
Proteomics is the large-scale study of gene expression at the protein level. This cutting edge technology has been extensively applied to toxicology research recently. The up-to-date development of proteomics has presented the toxicology community with an unprecedented opportunit...
HiQuant: Rapid Postquantification Analysis of Large-Scale MS-Generated Proteomics Data.
Bryan, Kenneth; Jarboui, Mohamed-Ali; Raso, Cinzia; Bernal-Llinares, Manuel; McCann, Brendan; Rauch, Jens; Boldt, Karsten; Lynn, David J
2016-06-03
Recent advances in mass-spectrometry-based proteomics are now facilitating ambitious large-scale investigations of the spatial and temporal dynamics of the proteome; however, the increasing size and complexity of these data sets is overwhelming current downstream computational methods, specifically those that support the postquantification analysis pipeline. Here we present HiQuant, a novel application that enables the design and execution of a postquantification workflow, including common data-processing steps, such as assay normalization and grouping, and experimental replicate quality control and statistical analysis. HiQuant also enables the interpretation of results generated from large-scale data sets by supporting interactive heatmap analysis and also the direct export to Cytoscape and Gephi, two leading network analysis platforms. HiQuant may be run via a user-friendly graphical interface and also supports complete one-touch automation via a command-line mode. We evaluate HiQuant's performance by analyzing a large-scale, complex interactome mapping data set and demonstrate a 200-fold improvement in the execution time over current methods. We also demonstrate HiQuant's general utility by analyzing proteome-wide quantification data generated from both a large-scale public tyrosine kinase siRNA knock-down study and an in-house investigation into the temporal dynamics of the KSR1 and KSR2 interactomes. Download HiQuant, sample data sets, and supporting documentation at http://hiquant.primesdb.eu .
Deutsch, Eric W.; Mendoza, Luis; Shteynberg, David; Slagel, Joseph; Sun, Zhi; Moritz, Robert L.
2015-01-01
Democratization of genomics technologies has enabled the rapid determination of genotypes. More recently the democratization of comprehensive proteomics technologies is enabling the determination of the cellular phenotype and the molecular events that define its dynamic state. Core proteomic technologies include mass spectrometry to define protein sequence, protein:protein interactions, and protein post-translational modifications. Key enabling technologies for proteomics are bioinformatic pipelines to identify, quantitate, and summarize these events. The Trans-Proteomics Pipeline (TPP) is a robust open-source standardized data processing pipeline for large-scale reproducible quantitative mass spectrometry proteomics. It supports all major operating systems and instrument vendors via open data formats. Here we provide a review of the overall proteomics workflow supported by the TPP, its major tools, and how it can be used in its various modes from desktop to cloud computing. We describe new features for the TPP, including data visualization functionality. We conclude by describing some common perils that affect the analysis of tandem mass spectrometry datasets, as well as some major upcoming features. PMID:25631240
Deutsch, Eric W; Mendoza, Luis; Shteynberg, David; Slagel, Joseph; Sun, Zhi; Moritz, Robert L
2015-08-01
Democratization of genomics technologies has enabled the rapid determination of genotypes. More recently the democratization of comprehensive proteomics technologies is enabling the determination of the cellular phenotype and the molecular events that define its dynamic state. Core proteomic technologies include MS to define protein sequence, protein:protein interactions, and protein PTMs. Key enabling technologies for proteomics are bioinformatic pipelines to identify, quantitate, and summarize these events. The Trans-Proteomics Pipeline (TPP) is a robust open-source standardized data processing pipeline for large-scale reproducible quantitative MS proteomics. It supports all major operating systems and instrument vendors via open data formats. Here, we provide a review of the overall proteomics workflow supported by the TPP, its major tools, and how it can be used in its various modes from desktop to cloud computing. We describe new features for the TPP, including data visualization functionality. We conclude by describing some common perils that affect the analysis of MS/MS datasets, as well as some major upcoming features. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
The National Cancer Institute's (NCI) Clinical Proteomic Technologies for Cancer (CPTC) initiative at the National Institutes of Health has entered into a memorandum of understanding (MOU) with the Korea Institute of Science and Technology (KIST). This MOU promotes proteomic technology optimization and standards implementation in large-scale international programs.
Analyzing large-scale proteomics projects with latent semantic indexing.
Klie, Sebastian; Martens, Lennart; Vizcaíno, Juan Antonio; Côté, Richard; Jones, Phil; Apweiler, Rolf; Hinneburg, Alexander; Hermjakob, Henning
2008-01-01
Since the advent of public data repositories for proteomics data, readily accessible results from high-throughput experiments have been accumulating steadily. Several large-scale projects in particular have contributed substantially to the amount of identifications available to the community. Despite the considerable body of information amassed, very few successful analyses have been performed and published on this data, leveling off the ultimate value of these projects far below their potential. A prominent reason published proteomics data is seldom reanalyzed lies in the heterogeneous nature of the original sample collection and the subsequent data recording and processing. To illustrate that at least part of this heterogeneity can be compensated for, we here apply a latent semantic analysis to the data contributed by the Human Proteome Organization's Plasma Proteome Project (HUPO PPP). Interestingly, despite the broad spectrum of instruments and methodologies applied in the HUPO PPP, our analysis reveals several obvious patterns that can be used to formulate concrete recommendations for optimizing proteomics project planning as well as the choice of technologies used in future experiments. It is clear from these results that the analysis of large bodies of publicly available proteomics data by noise-tolerant algorithms such as the latent semantic analysis holds great promise and is currently underexploited.
Durbin, Kenneth R.; Tran, John C.; Zamdborg, Leonid; Sweet, Steve M. M.; Catherman, Adam D.; Lee, Ji Eun; Li, Mingxi; Kellie, John F.; Kelleher, Neil L.
2011-01-01
Applying high-throughput Top-Down MS to an entire proteome requires a yet-to-be-established model for data processing. Since Top-Down is becoming possible on a large scale, we report our latest software pipeline dedicated to capturing the full value of intact protein data in automated fashion. For intact mass detection, we combine algorithms for processing MS1 data from both isotopically resolved (FT) and charge-state resolved (ion trap) LC-MS data, which are then linked to their fragment ions for database searching using ProSight. Automated determination of human keratin and tubulin isoforms is one result. Optimized for the intricacies of whole proteins, new software modules visualize proteome-scale data based on the LC retention time and intensity of intact masses and enable selective detection of PTMs to automatically screen for acetylation, phosphorylation, and methylation. Software functionality was demonstrated using comparative LC-MS data from yeast strains in addition to human cells undergoing chemical stress. We further these advances as a key aspect of realizing Top-Down MS on a proteomic scale. PMID:20848673
Proteomics wants cRacker: automated standardized data analysis of LC-MS derived proteomic data.
Zauber, Henrik; Schulze, Waltraud X
2012-11-02
The large-scale analysis of thousands of proteins under various experimental conditions or in mutant lines has gained more and more importance in hypothesis-driven scientific research and systems biology in the past years. Quantitative analysis by large scale proteomics using modern mass spectrometry usually results in long lists of peptide ion intensities. The main interest for most researchers, however, is to draw conclusions on the protein level. Postprocessing and combining peptide intensities of a proteomic data set requires expert knowledge, and the often repetitive and standardized manual calculations can be time-consuming. The analysis of complex samples can result in very large data sets (lists with several 1000s to 100,000 entries of different peptides) that cannot easily be analyzed using standard spreadsheet programs. To improve speed and consistency of the data analysis of LC-MS derived proteomic data, we developed cRacker. cRacker is an R-based program for automated downstream proteomic data analysis including data normalization strategies for metabolic labeling and label free quantitation. In addition, cRacker includes basic statistical analysis, such as clustering of data, or ANOVA and t tests for comparison between treatments. Results are presented in editable graphic formats and in list files.
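The step from peptide ion intensities to protein-level values can be sketched as follows. cRacker itself is an R-based program; this Python sketch is not its implementation. It assumes one simple strategy, scaling each peptide intensity to its fraction of the sample's total intensity and then averaging the peptides of each protein, and the function name and input layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def protein_abundances(peptide_rows):
    """Roll peptide intensities up to protein level.

    peptide_rows: iterable of (protein, peptide, sample, intensity) tuples.
    Returns {(protein, sample): mean normalized intensity}.
    Assumed normalization: intensity divided by the sample's total intensity.
    """
    rows = list(peptide_rows)

    # Total ion intensity per sample, used as the normalization factor.
    totals = defaultdict(float)
    for _protein, _peptide, sample, intensity in rows:
        totals[sample] += intensity

    # Collect normalized peptide values per protein and sample.
    grouped = defaultdict(list)
    for protein, _peptide, sample, intensity in rows:
        grouped[(protein, sample)].append(intensity / totals[sample])

    return {key: mean(values) for key, values in grouped.items()}


if __name__ == "__main__":
    # Hypothetical peptide list for a single sample.
    example = [
        ("P1", "AAK", "s1", 30.0),
        ("P1", "CCR", "s1", 10.0),
        ("P2", "DDK", "s1", 60.0),
    ]
    for key, value in sorted(protein_abundances(example).items()):
        print(key, round(value, 3))
```

Automating this roll-up is what makes lists of many thousands of peptide entries tractable where spreadsheet-based manual calculation is not.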
Reverse-phase protein arrays (RPPA) represent a powerful functional proteomic approach to elucidate cancer-related molecular mechanisms and to develop novel cancer therapies. To facilitate community-based investigation of the large-scale protein expression data generated by this platform, we have developed a user-friendly, open-access bioinformatic resource, The Cancer Proteome Atlas (TCPA, http://tcpaportal.org), which contains two separate web applications.
Cehofski, Lasse Jørgensen; Honoré, Bent; Vorum, Henrik
2017-04-28
Retinal artery occlusion (RAO), retinal vein occlusion (RVO), diabetic retinopathy (DR) and age-related macular degeneration (AMD) are frequent ocular diseases with potentially sight-threatening outcomes. In the present review we discuss major findings of proteomic studies of RAO, RVO, DR and AMD, including an overview of ocular proteome changes associated with anti-vascular endothelial growth factor (VEGF) treatments. Despite the severe outcomes of RAO, the proteome of the disease remains largely unstudied. There is also limited knowledge about the proteome of RVO, but proteomic studies suggest that RVO is associated with remodeling of the extracellular matrix and adhesion processes. Proteomic studies of DR have resulted in the identification of potential therapeutic targets such as carbonic anhydrase-I. Proliferative diabetic retinopathy is the most intensively studied stage of DR. Proteomic studies have established VEGF, pigment epithelium-derived factor (PEDF) and complement components as key factors associated with AMD. The aim of this review is to highlight the major milestones in proteomics in RAO, RVO, DR and AMD. Through large-scale protein analyses, proteomics is bringing new important insights into these complex pathological conditions.
Arntzen, Magnus Ø; Thiede, Bernd
2012-02-01
Apoptosis is the most commonly described form of programmed cell death, and its dysfunction is implicated in a large number of human diseases. Many quantitative proteome analyses of apoptosis have been performed to gain insight into the proteins involved in the process. This resulted in large and complex data sets that are difficult to evaluate. Therefore, we developed the ApoptoProteomics database for storage, browsing, and analysis of the outcome of large scale proteome analyses of apoptosis derived from human, mouse, and rat. The proteomics data of 52 publications were integrated and unified with protein annotations from UniProt-KB, the caspase substrate database homepage (CASBAH), and gene ontology. Currently, more than 2300 records of more than 1500 unique proteins were included, covering a large proportion of the core signaling pathways of apoptosis. Analysis of the data set revealed a high level of agreement between the direction of changes reported in proteomics studies and the expected apoptosis-related function and may disclose proteins without a currently recognized involvement in apoptosis based on gene ontology. Comparison between induction of apoptosis by the intrinsic and the extrinsic apoptotic signaling pathway revealed slight differences. Furthermore, proteomics has significantly contributed to the field of apoptosis in identifying hundreds of caspase substrates. The database is available at http://apoptoproteomics.uio.no.
Arntzen, Magnus Ø.; Thiede, Bernd
2012-01-01
Apoptosis is the most commonly described form of programmed cell death, and its dysfunction is implicated in a large number of human diseases. Many quantitative proteome analyses of apoptosis have been performed to gain insight into the proteins involved in the process. This resulted in large and complex data sets that are difficult to evaluate. Therefore, we developed the ApoptoProteomics database for storage, browsing, and analysis of the outcome of large scale proteome analyses of apoptosis derived from human, mouse, and rat. The proteomics data of 52 publications were integrated and unified with protein annotations from UniProt-KB, the caspase substrate database homepage (CASBAH), and gene ontology. Currently, more than 2300 records of more than 1500 unique proteins were included, covering a large proportion of the core signaling pathways of apoptosis. Analysis of the data set revealed a high level of agreement between the direction of changes reported in proteomics studies and the expected apoptosis-related function and may disclose proteins without a currently recognized involvement in apoptosis based on gene ontology. Comparison between induction of apoptosis by the intrinsic and the extrinsic apoptotic signaling pathway revealed slight differences. Furthermore, proteomics has significantly contributed to the field of apoptosis in identifying hundreds of caspase substrates. The database is available at http://apoptoproteomics.uio.no. PMID:22067098
Systems Proteomics for Translational Network Medicine
Arrell, D. Kent; Terzic, Andre
2012-01-01
Universal principles underlying network science, and their ever-increasing applications in biomedicine, underscore the unprecedented capacity of systems biology based strategies to synthesize and resolve massive high throughput generated datasets. Enabling previously unattainable comprehension of biological complexity, systems approaches have accelerated progress in elucidating disease prediction, progression, and outcome. Applied to the spectrum of states spanning health and disease, network proteomics establishes a collation, integration, and prioritization algorithm to guide mapping and decoding of proteome landscapes from large-scale raw data. Providing unparalleled deconvolution of protein lists into global interactomes, integrative systems proteomics enables objective, multi-modal interpretation at molecular, pathway, and network scales, merging individual molecular components, their plurality of interactions, and functional contributions for systems comprehension. As such, network systems approaches are increasingly exploited for objective interpretation of cardiovascular proteomics studies. Here, we highlight network systems proteomic analysis pipelines for integration and biological interpretation through protein cartography, ontological categorization, pathway and functional enrichment and complex network analysis. PMID:22896016
Affordable proteomics: the two-hybrid systems.
Gillespie, Marc
2003-06-01
Numerous proteomic methodologies exist, but most require a heavy investment in expertise and technology. This puts these approaches out of reach for many laboratories and small companies, rarely allowing proteomics to be used as a pilot approach for biomarker or target identification. Two proteomic approaches, 2D gel electrophoresis and the two-hybrid systems, are currently available to most researchers. The two-hybrid systems, though accommodating to large-scale experiments, were originally designed as practical screens, that by comparison to current proteomics tools were small-scale, affordable and technically feasible. The screens rapidly generated data, identifying protein interactions that were previously uncharacterized. The foundation for a two-hybrid proteomic investigation can be purchased as separate kits from a number of companies. The true power of the technique lies not in its affordability, but rather in its portability. The two-hybrid system puts proteomics back into laboratories where the output of the screens can be evaluated by researchers with experience in the particular fields of basic research, cancer biology, toxicology or drug development.
Halligan, Brian D.; Geiger, Joey F.; Vallejos, Andrew K.; Greene, Andrew S.; Twigger, Simon N.
2009-01-01
One of the major difficulties for many laboratories setting up proteomics programs has been obtaining and maintaining the computational infrastructure required for the analysis of the large flow of proteomics data. We describe a system that combines distributed cloud computing and open source software to allow laboratories to set up scalable virtual proteomics analysis clusters without the investment in computational hardware or software licensing fees. Additionally, the pricing structure of distributed computing providers, such as Amazon Web Services, allows laboratories or even individuals to have large-scale computational resources at their disposal at a very low cost per run. We provide detailed step by step instructions on how to implement the virtual proteomics analysis clusters as well as a list of current available preconfigured Amazon machine images containing the OMSSA and X!Tandem search algorithms and sequence databases on the Medical College of Wisconsin Proteomics Center website (http://proteomics.mcw.edu/vipdac). PMID:19358578
Halligan, Brian D; Geiger, Joey F; Vallejos, Andrew K; Greene, Andrew S; Twigger, Simon N
2009-06-01
One of the major difficulties for many laboratories setting up proteomics programs has been obtaining and maintaining the computational infrastructure required for the analysis of the large flow of proteomics data. We describe a system that combines distributed cloud computing and open source software to allow laboratories to set up scalable virtual proteomics analysis clusters without the investment in computational hardware or software licensing fees. Additionally, the pricing structure of distributed computing providers, such as Amazon Web Services, allows laboratories or even individuals to have large-scale computational resources at their disposal at a very low cost per run. We provide detailed step-by-step instructions on how to implement the virtual proteomics analysis clusters as well as a list of current available preconfigured Amazon machine images containing the OMSSA and X!Tandem search algorithms and sequence databases on the Medical College of Wisconsin Proteomics Center Web site ( http://proteomics.mcw.edu/vipdac ).
Liu, Ming-Qi; Zeng, Wen-Feng; Fang, Pan; Cao, Wei-Qian; Liu, Chao; Yan, Guo-Quan; Zhang, Yang; Peng, Chao; Wu, Jian-Qiang; Zhang, Xiao-Jin; Tu, Hui-Jun; Chi, Hao; Sun, Rui-Xiang; Cao, Yong; Dong, Meng-Qiu; Jiang, Bi-Yun; Huang, Jiang-Ming; Shen, Hua-Li; Wong, Catherine C L; He, Si-Min; Yang, Peng-Yuan
2017-09-05
The precise and large-scale identification of intact glycopeptides is a critical step in glycoproteomics. Owing to the complexity of glycosylation, the current overall throughput, data quality and accessibility of intact glycopeptide identification lag behind those in routine proteomic analyses. Here, we propose a workflow for the precise high-throughput identification of intact N-glycopeptides at the proteome scale using stepped-energy fragmentation and a dedicated search engine. pGlyco 2.0 conducts comprehensive quality control including false discovery rate evaluation at all three levels of matches to glycans, peptides and glycopeptides, improving the current level of accuracy of intact glycopeptide identification. The N-glycoproteome of samples metabolically labeled with 15N/13C was analyzed quantitatively and utilized to validate the glycopeptide identification, which could be used as a novel benchmark pipeline to compare different search engines. Finally, we report a large-scale glycoproteome dataset consisting of 10,009 distinct site-specific N-glycans on 1988 glycosylation sites from 955 glycoproteins in five mouse tissues. Protein glycosylation is a heterogeneous post-translational modification that generates greater proteomic diversity that is difficult to analyze. Here the authors describe pGlyco 2.0, a workflow for the precise one-step identification of intact N-glycopeptides at the proteome scale.
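False discovery rate evaluation of the kind mentioned above is often illustrated with the generic target-decoy approach. The sketch below is that generic approach only, not pGlyco 2.0's method, which the abstract describes as operating at the glycan, peptide and glycopeptide levels. The function name, the input layout, and the decoys-over-targets estimate are assumptions.

```python
def fdr_score_cutoff(matches, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    matches: list of (score, is_decoy) pairs, higher score = better match.
    The FDR at a cutoff is estimated as (#decoys) / (#targets) among all
    matches scoring at or above it (a common, simplified estimator).
    Returns None if no cutoff satisfies the limit.
    """
    ranked = sorted(matches, key=lambda match: match[0], reverse=True)
    targets = 0
    decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep extending the accepted list while the estimate stays in bounds.
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


if __name__ == "__main__":
    # Hypothetical scored matches: (score, is_decoy).
    scored = [(10, False), (9, False), (8, False),
              (7, True), (6, False), (5, True)]
    print(fdr_score_cutoff(scored, max_fdr=0.25))
```

A multi-level scheme would apply a filter of this kind separately to each type of match, but how the levels are combined is specific to each search engine.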
Determination of burn patient outcome by large-scale quantitative discovery proteomics
Finnerty, Celeste C.; Jeschke, Marc G.; Qian, Wei-Jun; Kaushal, Amit; Xiao, Wenzhong; Liu, Tao; Gritsenko, Marina A.; Moore, Ronald J.; Camp, David G.; Moldawer, Lyle L.; Elson, Constance; Schoenfeld, David; Gamelli, Richard; Gibran, Nicole; Klein, Matthew; Arnoldo, Brett; Remick, Daniel; Smith, Richard D.; Davis, Ronald; Tompkins, Ronald G.; Herndon, David N.
2013-01-01
Objective: Emerging proteomics techniques can be used to establish proteomic outcome signatures and to identify candidate biomarkers for survival following traumatic injury. We applied high-resolution liquid chromatography-mass spectrometry (LC-MS) and multiplex cytokine analysis to profile the plasma proteome of survivors and non-survivors of massive burn injury to determine the proteomic survival signature following a major burn injury. Design: Proteomic discovery study. Setting: Five burn hospitals across the U.S. Patients: Thirty-two burn patients (16 non-survivors and 16 survivors), 19–89 years of age, were admitted within 96 h of injury to the participating hospitals with burns covering >20% of the total body surface area and required at least one surgical intervention. Interventions: None. Measurements and Main Results: We found differences in circulating levels of 43 proteins involved in the acute phase response, hepatic signaling, the complement cascade, inflammation, and insulin resistance. Thirty-two of the proteins identified were not previously known to play a role in the response to burn. IL-4, IL-8, GM-CSF, MCP-1, and β2-microglobulin correlated well with survival and may serve as clinical biomarkers. Conclusions: These results demonstrate the utility of these techniques for establishing proteomic survival signatures and for use as a discovery tool to identify candidate biomarkers for survival. This is the first clinical application of a high-throughput, large-scale LC-MS-based quantitative plasma proteomic approach for biomarker discovery for the prediction of patient outcome following burn, trauma or critical illness. PMID:23507713
Cehofski, Lasse Jørgensen; Honoré, Bent; Vorum, Henrik
2017-01-01
Retinal artery occlusion (RAO), retinal vein occlusion (RVO), diabetic retinopathy (DR) and age-related macular degeneration (AMD) are frequent ocular diseases with potentially sight-threatening outcomes. In the present review we discuss major findings of proteomic studies of RAO, RVO, DR and AMD, including an overview of ocular proteome changes associated with anti-vascular endothelial growth factor (VEGF) treatments. Despite the severe outcomes of RAO, the proteome of the disease remains largely unstudied. There is also limited knowledge about the proteome of RVO, but proteomic studies suggest that RVO is associated with remodeling of the extracellular matrix and adhesion processes. Proteomic studies of DR have resulted in the identification of potential therapeutic targets such as carbonic anhydrase-I. Proliferative diabetic retinopathy is the most intensively studied stage of DR. Proteomic studies have established VEGF, pigment epithelium-derived factor (PEDF) and complement components as key factors associated with AMD. The aim of this review is to highlight the major milestones in proteomics in RAO, RVO, DR and AMD. Through large-scale protein analyses, proteomics is bringing new important insights into these complex pathological conditions. PMID:28452939
Content Is King: Databases Preserve the Collective Information of Science.
Yates, John R
2018-04-01
Databases store sequence information experimentally gathered to create resources that further science. In the last 20 years databases have become critical components of fields like proteomics where they provide the basis for large-scale and high-throughput proteomic informatics. Amos Bairoch, winner of the Association of Biomolecular Resource Facilities Frederick Sanger Award, has created some of the important databases proteomic research depends upon for accurate interpretation of data.
A Community Standard Format for the Representation of Protein Affinity Reagents*
Gloriam, David E.; Orchard, Sandra; Bertinetti, Daniela; Björling, Erik; Bongcam-Rudloff, Erik; Borrebaeck, Carl A. K.; Bourbeillon, Julie; Bradbury, Andrew R. M.; de Daruvar, Antoine; Dübel, Stefan; Frank, Ronald; Gibson, Toby J.; Gold, Larry; Haslam, Niall; Herberg, Friedrich W.; Hiltke, Tara; Hoheisel, Jörg D.; Kerrien, Samuel; Koegl, Manfred; Konthur, Zoltán; Korn, Bernhard; Landegren, Ulf; Montecchi-Palazzi, Luisa; Palcy, Sandrine; Rodriguez, Henry; Schweinsberg, Sonja; Sievert, Volker; Stoevesandt, Oda; Taussig, Michael J.; Ueffing, Marius; Uhlén, Mathias; van der Maarel, Silvère; Wingren, Christer; Woollard, Peter; Sherman, David J.; Hermjakob, Henning
2010-01-01
Protein affinity reagents (PARs), most commonly antibodies, are essential reagents for protein characterization in basic research, biotechnology, and diagnostics as well as the fastest growing class of therapeutics. Large numbers of PARs are available commercially; however, their quality is often uncertain. In addition, currently available PARs cover only a fraction of the human proteome, and their cost is prohibitive for proteome scale applications. This situation has triggered several initiatives involving large scale generation and validation of antibodies, for example the Swedish Human Protein Atlas and the German Antibody Factory. Antibodies targeting specific subproteomes are being pursued by members of Human Proteome Organisation (plasma and liver proteome projects) and the United States National Cancer Institute (cancer-associated antigens). ProteomeBinders, a European consortium, aims to set up a resource of consistently quality-controlled protein-binding reagents for the whole human proteome. An ultimate PAR database resource would allow consumers to visit one on-line warehouse and find all available affinity reagents from different providers together with documentation that facilitates easy comparison of their cost and quality. However, in contrast to, for example, nucleotide databases among which data are synchronized between the major data providers, current PAR producers, quality control centers, and commercial companies all use incompatible formats, hindering data exchange. Here we propose Proteomics Standards Initiative (PSI)-PAR as a global community standard format for the representation and exchange of protein affinity reagent data. The PSI-PAR format is maintained by the Human Proteome Organisation PSI and was developed within the context of ProteomeBinders by building on a mature proteomics standard format, PSI-molecular interaction, which is a widely accepted and established community standard for molecular interaction data. 
Further information and documentation are available on the PSI-PAR web site. PMID:19674966
Investigators from the National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium (CPTAC) who comprehensively analyzed 95 human colorectal tumor samples, have determined how gene alterations identified in previous analyses of the same samples are expressed at the protein level. The integration of proteomic and genomic data, or proteogenomics, provides a more comprehensive view of the biological features that drive cancer than genomic analysis alone and may help identify the most important targets for cancer detection and intervention.
Martínez-Bartolomé, Salvador; Medina-Aunon, J Alberto; López-García, Miguel Ángel; González-Tejedo, Carmen; Prieto, Gorka; Navajas, Rosana; Salazar-Donate, Emilio; Fernández-Costa, Carolina; Yates, John R; Albar, Juan Pablo
2018-04-06
Mass-spectrometry-based proteomics has evolved into a high-throughput technology in which numerous large-scale data sets are generated from diverse analytical platforms. Furthermore, several scientific journals and funding agencies have emphasized the storage of proteomics data in public repositories to facilitate its evaluation, inspection, and reanalysis. (1) As a consequence, public proteomics data repositories are growing rapidly. However, tools are needed to integrate multiple proteomics data sets to compare different experimental features or to perform quality control analysis. Here, we present a new Java stand-alone tool, Proteomics Assay COMparator (PACOM), that is able to import, combine, and simultaneously compare numerous proteomics experiments to check the integrity of the proteomic data as well as verify data quality. With PACOM, the user can detect sources of error that may have been introduced in any step of a proteomics workflow and that influence the final results. Data sets can be easily compared and integrated, and data quality and reproducibility can be visually assessed through a rich set of graphical representations of proteomics data features as well as a wide variety of data filters. Its flexibility and easy-to-use interface make PACOM a unique tool for daily use in a proteomics laboratory. PACOM is available at https://github.com/smdb21/pacom .
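As a rough illustration of the kind of cross-experiment comparison this abstract describes, the overlap between the protein identifications of two runs could be summarized as in the Python sketch below. This is not PACOM's code or API (PACOM itself is a Java tool); the function name and the accession lists are hypothetical and chosen only for the example.

```python
def compare_identifications(run_a, run_b):
    """Summarize the overlap between two collections of protein accessions."""
    a, b = set(run_a), set(run_b)
    shared = a & b
    union = a | b
    return {
        "shared": sorted(shared),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        # Jaccard index: shared identifications divided by all identifications.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical accession lists from two replicate experiments.
replicate_1 = ["P12345", "Q67890", "P11111"]
replicate_2 = ["P12345", "P11111", "O22222"]

summary = compare_identifications(replicate_1, replicate_2)
print(summary["shared"])   # accessions seen in both replicates
print(summary["jaccard"])  # 2 shared out of 4 distinct accessions
```

A low overlap between replicates that should be equivalent is one simple signal that something went wrong upstream in the workflow.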
Background | Office of Cancer Clinical Proteomics Research
The term "proteomics" refers to a large-scale comprehensive study of a specific proteome resulting from its genome, including abundances of proteins, their variations and modifications, and interacting partners and networks in order to understand cellular processes involved. Similarly, “Cancer proteomics” refers to comprehensive analyses of proteins and their derivatives translated from a specific cancer genome using a human biospecimen or a preclinical model (e.g., cultured cell or animal model).
Assembling proteomics data as a prerequisite for the analysis of large scale experiments
Schmidt, Frank; Schmid, Monika; Thiede, Bernd; Pleißner, Klaus-Peter; Böhme, Martina; Jungblut, Peter R
2009-01-01
Background: Despite the complete determination of the genome sequence of a huge number of bacteria, their proteomes remain relatively poorly defined. Besides new methods to increase the number of identified proteins, new database applications are necessary to store and present results of large-scale proteomics experiments. Results: In the present study, a database concept has been developed to address these issues and to offer complete information via a web interface. In our concept, the Oracle based data repository system SQL-LIMS plays the central role in the proteomics workflow and was applied to the proteomes of Mycobacterium tuberculosis, Helicobacter pylori, Salmonella typhimurium and protein complexes such as 20S proteasome. Technical operations of our proteomics labs were used as the standard for SQL-LIMS template creation. By means of a Java based data parser, post-processed data of different approaches, such as LC/ESI-MS, MALDI-MS and 2-D gel electrophoresis (2-DE), were stored in SQL-LIMS. A minimum set of the proteomics data was transferred to our public 2D-PAGE database using a Java based interface (Data Transfer Tool) with the requirements of the PEDRo standardization. Furthermore, the stored proteomics data were extractable out of SQL-LIMS via XML. Conclusion: The Oracle based data repository system SQL-LIMS played the central role in the proteomics workflow concept. Technical operations of our proteomics labs were used as standards for SQL-LIMS templates. Using a Java based parser, post-processed data of different approaches such as LC/ESI-MS, MALDI-MS and 1-DE and 2-DE were stored in SQL-LIMS. Thus, unique data formats of different instruments were unified and stored in SQL-LIMS tables. Moreover, a unique submission identifier allowed fast access to all experimental data. This was the main advantage compared to multi software solutions, especially if personnel fluctuations are high. Moreover, large scale and high-throughput experiments must be managed in a comprehensive repository system such as SQL-LIMS, to query results in a systematic manner. On the other hand, these database systems are expensive and require at least one full time administrator and specialized lab manager. Moreover, the high technical dynamics in proteomics may cause problems in adapting to new data formats. To summarize, SQL-LIMS met the requirements of proteomics data handling especially in skilled processes such as gel-electrophoresis or mass spectrometry and fulfilled the PSI standardization criteria. The data transfer into a public domain via DTT facilitated validation of proteomics data. Additionally, evaluation of mass spectra by post-processing using MS-Screener improved the reliability of mass analysis and prevented storage of data junk. PMID:19166578
Paulovich, Amanda G.; Billheimer, Dean; Ham, Amy-Joan L.; Vega-Montoto, Lorenzo; Rudnick, Paul A.; Tabb, David L.; Wang, Pei; Blackman, Ronald K.; Bunk, David M.; Cardasis, Helene L.; Clauser, Karl R.; Kinsinger, Christopher R.; Schilling, Birgit; Tegeler, Tony J.; Variyath, Asokan Mulayath; Wang, Mu; Whiteaker, Jeffrey R.; Zimmerman, Lisa J.; Fenyo, David; Carr, Steven A.; Fisher, Susan J.; Gibson, Bradford W.; Mesri, Mehdi; Neubert, Thomas A.; Regnier, Fred E.; Rodriguez, Henry; Spiegelman, Cliff; Stein, Stephen E.; Tempst, Paul; Liebler, Daniel C.
2010-01-01
Optimal performance of LC-MS/MS platforms is critical to generating high quality proteomics data. Although individual laboratories have developed quality control samples, there is no widely available performance standard of biological complexity (and associated reference data sets) for benchmarking of platform performance for analysis of complex biological proteomes across different laboratories in the community. Individual preparations of the yeast Saccharomyces cerevisiae proteome have been used extensively by laboratories in the proteomics community to characterize LC-MS platform performance. The yeast proteome is uniquely attractive as a performance standard because it is the most extensively characterized complex biological proteome and the only one associated with several large scale studies estimating the abundance of all detectable proteins. In this study, we describe a standard operating protocol for large scale production of the yeast performance standard and offer aliquots to the community through the National Institute of Standards and Technology where the yeast proteome is under development as a certified reference material to meet the long term needs of the community. Using a series of metrics that characterize LC-MS performance, we provide a reference data set demonstrating typical performance of commonly used ion trap instrument platforms in expert laboratories; the results provide a basis for laboratories to benchmark their own performance, to improve upon current methods, and to evaluate new technologies. Additionally, we demonstrate how the yeast reference, spiked with human proteins, can be used to benchmark the power of proteomics platforms for detection of differentially expressed proteins at different levels of concentration in a complex matrix, thereby providing a metric to evaluate and minimize preanalytical and analytical variation in comparative proteomics experiments. PMID:19858499
Huang, Junfeng; Wang, Fangjun; Ye, Mingliang; Zou, Hanfa
2014-11-06
Comprehensive analysis of the post-translational modifications (PTMs) on proteins at the proteome level is crucial to elucidate the regulatory mechanisms of various biological processes. In the past decades, thanks to the development of specific PTM enrichment techniques and efficient multidimensional liquid chromatography (LC) separation strategies, the identification of protein PTMs has made tremendous progress. A huge number of modification sites for some major protein PTMs have been identified by proteomics analysis. In this review, we first introduce the recent progress in PTM enrichment methods for the analysis of several major PTMs including phosphorylation, glycosylation, ubiquitination, acetylation, methylation, and oxidation/reduction status. We then briefly summarize the challenges for PTM enrichment. Finally, we introduce the fractionation and separation techniques for efficient separation of PTM peptides in large-scale PTM analysis. Copyright © 2014 Elsevier B.V. All rights reserved.
Activity-based protein profiling for biochemical pathway discovery in cancer
Nomura, Daniel K.; Dix, Melissa M.; Cravatt, Benjamin F.
2011-01-01
Large-scale profiling methods have uncovered numerous gene and protein expression changes that correlate with tumorigenesis. However, determining the relevance of these expression changes and which biochemical pathways they affect has been hindered by our incomplete understanding of the proteome and its myriad functions and modes of regulation. Activity-based profiling platforms enable both the discovery of cancer-relevant enzymes and selective pharmacological probes to perturb and characterize these proteins in tumour cells. When integrated with other large-scale profiling methods, activity-based proteomics can provide insight into the metabolic and signalling pathways that support cancer pathogenesis and illuminate new strategies for disease diagnosis and treatment. PMID:20703252
Seo, Moon-Hyeong; Nim, Satra; Jeon, Jouhyun; Kim, Philip M
2017-01-01
Protein-protein interactions are essential to cellular functions and signaling pathways. We recently combined bioinformatics and custom oligonucleotide arrays to construct custom-made peptide-phage libraries for screening peptide-protein interactions, an approach we call proteomic peptide-phage display (ProP-PD). In this chapter, we describe protocols for phage display for the identification of natural peptide binders for a given protein. We finally describe deep sequencing for the analysis of the proteomic peptide-phage display.
The yeast protein extract (RM8323) developed by National Institute of Standards and Technology (NIST) under the auspices of NCI's CPTC initiative is currently available to the public at https://www-s.nist.gov/srmors/view_detail.cfm?srm=8323. The yeast proteome offers researchers a unique biological reference material. RM8323 is the most extensively characterized complex biological proteome and the only one associated with several large-scale studies to estimate protein abundance across a wide concentration range.
Science, marketing and wishful thinking in quantitative proteomics.
Hackett, Murray
2008-11-01
In a recent editorial (J. Proteome Res. 2007, 6, 1633) and elsewhere questions have been raised regarding the lack of attention paid to good analytical practice with respect to the reporting of quantitative results in proteomics. Using those comments as a starting point, several issues are discussed that relate to the challenges involved in achieving adequate sampling with MS-based methods in order to generate valid data for large-scale studies. The discussion touches on the relationships that connect sampling depth and the power to detect protein abundance change, conflict of interest, and strategies to overcome bureaucratic obstacles that impede the use of peer-to-peer technologies for transfer and storage of large data files generated in such experiments.
From the genome sequence to the protein inventory of Bacillus subtilis.
Becher, Dörte; Büttner, Knut; Moche, Martin; Hessling, Bernd; Hecker, Michael
2011-08-01
Owing to the low number of proteins necessary to render a bacterial cell viable, bacteria are extremely attractive model systems to understand how the genome sequence is translated into actual life processes. One of the most intensively investigated model organisms is Bacillus subtilis. It has attracted world-wide research interest, addressing cell differentiation and adaptation on a molecular scale as well as biotechnological production processes. Meanwhile, we are looking back on more than 25 years of B. subtilis proteomics. A wide range of methods have been developed during this period for the large-scale qualitative and quantitative proteome analysis. Currently, it is possible to identify and quantify more than 50% of the predicted proteome in different cellular subfractions. In this review, we summarize the development of B. subtilis proteomics during the past 25 years. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Proteomics and circadian rhythms: It’s all about signaling!
Mauvoisin, Daniel; Dayon, Loïc; Gachon, Frédéric; Kussmann, Martin
2014-01-01
Proteomic technologies using mass spectrometry (MS) offer new perspectives in circadian biology, in particular the possibility to study posttranslational modifications (PTMs). To date, only very few studies have been carried out to decipher the rhythmicity of protein expression in mammals with large-scale proteomics. Although signaling has been shown to be of high relevance, comprehensive characterization studies of PTMs are even more rare. This review aims at describing the current landscape of circadian proteomics and the opportunities and challenges appearing on the horizon. Emphasis was given to signaling processes for their role in metabolic health as regulated by circadian clocks and environmental factors. Those signaling processes are expected to be better and more deeply characterized in the coming years with proteomics. PMID:25103677
Zhang, Yaoyang; Xu, Tao; Shan, Bing; Hart, Jonathan; Aslanian, Aaron; Han, Xuemei; Zong, Nobel; Li, Haomin; Choi, Howard; Wang, Dong; Acharya, Lipi; Du, Lisa; Vogt, Peter K; Ping, Peipei; Yates, John R
2015-11-03
Shotgun proteomics generates valuable information from large-scale and target protein characterizations, including protein expression, protein quantification, protein post-translational modifications (PTMs), protein localization, and protein-protein interactions. Typically, peptides derived from proteolytic digestion, rather than intact proteins, are analyzed by mass spectrometers because peptides are more readily separated, ionized and fragmented. The amino acid sequences of peptides can be interpreted by matching the observed tandem mass spectra to theoretical spectra derived from a protein sequence database. Identified peptides serve as surrogates for their proteins and are often used to establish what proteins were present in the original mixture and to quantify protein abundance. Two major issues exist for assigning peptides to their originating protein. The first issue is maintaining a desired false discovery rate (FDR) when comparing or combining multiple large datasets generated by shotgun analysis and the second issue is properly assigning peptides to proteins when homologous proteins are present in the database. Herein we demonstrate a new computational tool, ProteinInferencer, which can be used for protein inference with both small- or large-scale data sets to produce a well-controlled protein FDR. In addition, ProteinInferencer introduces confidence scoring for individual proteins, which makes protein identifications evaluable. This article is part of a Special Issue entitled: Computational Proteomics. Copyright © 2015. Published by Elsevier B.V.
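To make the idea of controlling a protein-level FDR concrete, here is a minimal Python sketch that uses target-decoy counting, a common estimation strategy in shotgun proteomics. The abstract does not say how ProteinInferencer estimates FDR, so the target-decoy approach, the function name, and the example scores are all assumptions made for illustration; this is not the tool's algorithm.

```python
def score_cutoff_for_fdr(proteins, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR stays within max_fdr.

    proteins: iterable of (score, is_decoy) pairs; higher scores are better.
    The FDR at a cutoff is estimated as (#decoys / #targets) among all
    entries scoring at or above that cutoff. Returns None if no cutoff
    satisfies the limit. Tied scores are not treated specially.
    """
    ranked = sorted(proteins, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    best_cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            best_cutoff = score
    return best_cutoff


# Hypothetical protein scores; True marks a decoy (reversed/shuffled) entry.
hits = [(9.1, False), (8.7, False), (7.9, False),
        (6.5, True), (6.0, False), (5.2, True)]

cutoff = score_cutoff_for_fdr(hits, max_fdr=0.25)
print(cutoff)  # at score 6.0 there are 1 decoy and 4 targets
```

When several large data sets are merged, decoy hits tend to accumulate at the protein level, which is one reason the cutoff is usually recomputed on the combined list rather than inherited from the individual runs.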
Yu, Yadong; Li, Tao; Wu, Na; Ren, Lujing; Jiang, Ling; Ji, Xiaojun; Huang, He
2016-11-30
Arachidonic acid (ARA) is an important polyunsaturated fatty acid having various beneficial physiological effects on the human body. The aging of Mortierella alpina has long been known to significantly improve ARA yield, but the exact mechanism is still elusive. Herein, multiple approaches including large-scale label-free comparative proteomics were employed to systematically investigate the mechanism mentioned above. Upon ultrastructural observation, abnormal mitochondria were found to aggregate around shrunken lipid droplets. Proteomics analysis revealed a total of 171 proteins with significant alterations of expression during aging. Pathway analysis suggested that reactive oxygen species (ROS) were accumulated and stimulated the activation of the malate/pyruvate cycle and isocitrate dehydrogenase, which might provide additional NADPH for ARA synthesis. EC 4.2.1.17-hydratase might be a key player in ARA accumulation during aging. These findings provide a valuable resource for efforts to further improve the ARA content in the oil produced by aging M. alpina.
Liquid chromatography tandem-mass spectrometry (LC-MS/MS)-based methods such as isobaric tags for relative and absolute quantification (iTRAQ) and tandem mass tags (TMT) have been shown to provide overall better quantification accuracy and reproducibility over other LC-MS/MS techniques. However, large scale projects like the Clinical Proteomic Tumor Analysis Consortium (CPTAC) require comparisons across many genomically characterized clinical specimens in a single study and often exceed the capability of traditional iTRAQ-based quantification.
Tuncbag, Nurcan; Gursoy, Attila; Nussinov, Ruth; Keskin, Ozlem
2011-08-11
Prediction of protein-protein interactions at the structural level on the proteome scale is important because it allows prediction of protein function, helps drug discovery and takes steps toward genome-wide structural systems biology. We provide a protocol (termed PRISM, protein interactions by structural matching) for large-scale prediction of protein-protein interactions and assembly of protein complex structures. The method consists of two components: rigid-body structural comparisons of target proteins to known template protein-protein interfaces and flexible refinement using a docking energy function. The PRISM rationale follows our observation that globally different protein structures can interact via similar architectural motifs. PRISM predicts binding residues by using structural similarity and evolutionary conservation of putative binding residue 'hot spots'. Ultimately, PRISM could help to construct cellular pathways and functional, proteome-scale annotation. PRISM is implemented in Python and runs in a UNIX environment. The program accepts Protein Data Bank-formatted protein structures and is available at http://prism.ccbb.ku.edu.tr/prism_protocol/.
FunRich proteomics software analysis, let the fun begin!
Benito-Martin, Alberto; Peinado, Héctor
2015-08-01
Protein MS analysis is the preferred method for unbiased protein identification. It is normally applied to a large number of both small-scale and high-throughput studies. However, user-friendly computational tools for protein analysis are still needed. In this issue, Mathivanan and colleagues (Proteomics 2015, 15, 2597-2601) report the development of FunRich software, an open-access software that facilitates the analysis of proteomics data, providing tools for functional enrichment and interaction network analysis of genes and proteins. FunRich is a reinterpretation of proteomic software, a standalone tool combining ease of use with customizable databases, free access, and graphical representations. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
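Functional enrichment analysis of the kind mentioned above is commonly based on a hypergeometric test. The Python sketch below shows that calculation under stated assumptions: the abstract does not specify which statistic FunRich applies, and the function name, parameter names, and counts are hypothetical.

```python
from math import comb


def enrichment_p_value(background_size, annotated_in_background,
                       sample_size, annotated_in_sample):
    """One-sided hypergeometric p-value for over-representation.

    Probability of drawing at least `annotated_in_sample` annotated proteins
    when `sample_size` proteins are drawn without replacement from a
    background of `background_size` proteins, of which
    `annotated_in_background` carry the annotation.
    """
    total = comb(background_size, sample_size)
    upper = min(annotated_in_background, sample_size)
    tail = sum(
        comb(annotated_in_background, k)
        * comb(background_size - annotated_in_background, sample_size - k)
        for k in range(annotated_in_sample, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background proteins, 5 annotated with a term;
# a list of 4 proteins of interest contains 2 annotated ones.
p = enrichment_p_value(20, 5, 4, 2)
print(round(p, 4))
```

In practice such p-values are computed for many annotation terms at once, so a multiple-testing correction would normally follow.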
The Escherichia coli Proteome: Past, Present, and Future Prospects†
Han, Mee-Jung; Lee, Sang Yup
2006-01-01
Proteomics has emerged as an indispensable methodology for large-scale protein analysis in functional genomics. The Escherichia coli proteome has been extensively studied and is well defined in terms of biochemical, biological, and biotechnological data. Even before the entire E. coli proteome was fully elucidated, the largest available data set had been integrated to decipher regulatory circuits and metabolic pathways, providing valuable insights into global cellular physiology and the development of metabolic and cellular engineering strategies. With the recent advent of advanced proteomic technologies, the E. coli proteome has been used for the validation of new technologies and methodologies such as sample prefractionation, protein enrichment, two-dimensional gel electrophoresis, protein detection, mass spectrometry (MS), combinatorial assays with n-dimensional chromatographies and MS, and image analysis software. These important technologies will not only provide a great amount of additional information on the E. coli proteome but also synergistically contribute to other proteomic studies. Here, we review the past development and current status of E. coli proteome research in terms of its biological, biotechnological, and methodological significance and suggest future prospects. PMID:16760308
MAPU: Max-Planck Unified database of organellar, cellular, tissue and body fluid proteomes
Zhang, Yanling; Zhang, Yong; Adachi, Jun; Olsen, Jesper V.; Shi, Rong; de Souza, Gustavo; Pasini, Erica; Foster, Leonard J.; Macek, Boris; Zougman, Alexandre; Kumar, Chanchal; Wiśniewski, Jacek R.; Jun, Wang; Mann, Matthias
2007-01-01
Mass spectrometry (MS)-based proteomics has become a powerful technology to map the protein composition of organelles, cell types and tissues. In our department, a large-scale effort to map these proteomes is complemented by the Max-Planck Unified (MAPU) proteome database. MAPU contains several body fluid proteomes, including plasma, urine, and cerebrospinal fluid. Cell lines have been mapped to a depth of several thousand proteins and the red blood cell proteome has also been analyzed in depth. The liver proteome is represented with 3200 proteins. By employing high resolution MS and stringent validation criteria, false positive identification rates in MAPU are lower than 1:1000. Thus MAPU datasets can serve as reference proteomes in biomarker discovery. MAPU contains the peptides identifying each protein, measured masses, scores and intensities and is freely available at using a clickable interface of cell or body parts. Proteome data can be queried across proteomes by protein name, accession number, sequence similarity, peptide sequence and annotation information. More than 4500 mouse and 2500 human proteins have already been identified in at least one proteome. Basic annotation information and links to other public databases are provided in MAPU and we plan to add further analysis tools. PMID:17090601
News Release: May 25, 2016 — Building on data from The Cancer Genome Atlas (TCGA) project, a multi-institutional team of scientists has completed the first large-scale “proteogenomic” study of breast cancer, linking DNA mutations to protein signaling and helping pinpoint the genes that drive cancer.
Computational Omics Pre-Awardees | Office of Cancer Clinical Proteomics Research
The National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium (CPTAC) is pleased to announce the pre-awardees of the Computational Omics solicitation. Working with NVIDIA Foundation's Compute the Cure initiative and Leidos Biomedical Research Inc., the NCI, through this solicitation, seeks to leverage computational efforts to provide tools for the mining and interpretation of large-scale publicly available ‘omics’ datasets.
Comparative evaluation of saliva collection methods for proteome analysis.
Golatowski, Claas; Salazar, Manuela Gesell; Dhople, Vishnu Mukund; Hammer, Elke; Kocher, Thomas; Jehmlich, Nico; Völker, Uwe
2013-04-18
Saliva collection devices are widely used for large-scale screening approaches. This study was designed to compare the suitability of three different whole-saliva collection approaches for subsequent proteome analyses. From 9 young healthy volunteers (4 women and 5 men) saliva samples were collected either unstimulated by passive drooling or stimulated using a paraffin gum or Salivette® (cotton swab). Saliva volume, protein concentration and salivary protein patterns were analyzed comparatively. Samples collected using paraffin gum showed the highest saliva volume (4.1±1.5 ml) followed by Salivette® collection (1.8±0.4 ml) and drooling (1.0±0.4 ml). Saliva protein concentrations (average 1145 μg/ml) showed no significant differences between the three sampling schemes. Each collection approach facilitated the identification of about 160 proteins (≥2 distinct peptides) per subject, but collection-method dependent variations in protein composition were observed. Passive drooling, paraffin gum and Salivette® each allow similar coverage of the whole saliva proteome, but the specific proteins observed depended on the collection approach. Thus, only one type of collection device should be used for quantitative proteome analysis in one experiment, especially when performing large-scale cross-sectional or multi-centric studies. Copyright © 2013 Elsevier B.V. All rights reserved.
Alternative Splicing May Not Be the Key to Proteome Complexity.
Tress, Michael L; Abascal, Federico; Valencia, Alfonso
2017-02-01
Alternative splicing is commonly believed to be a major source of cellular protein diversity. However, although many thousands of alternatively spliced transcripts are routinely detected in RNA-seq studies, reliable large-scale mass spectrometry-based proteomics analyses identify only a small fraction of annotated alternative isoforms. The clearest finding from proteomics experiments is that most human genes have a single main protein isoform, while those alternative isoforms that are identified tend to be the most biologically plausible: those with the most cross-species conservation and those that do not compromise functional domains. Indeed, most alternative exons do not seem to be under selective pressure, suggesting that a large majority of predicted alternative transcripts may not even be translated into proteins. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
The HUPO proteomics standards initiative--overcoming the fragmentation of proteomics data.
Hermjakob, Henning
2006-09-01
Proteomics is a key field of modern biomolecular research, with many small and large scale efforts producing a wealth of proteomics data. However, the vast majority of this data is never exploited to its full potential. Even in publicly funded projects, often the raw data generated in a specific context is analysed, conclusions are drawn and published, but little attention is paid to systematic documentation, archiving, and public access to the data supporting the scientific results. It is often difficult to validate the results stated in a particular publication, and even simple global questions like "In which cellular contexts has my protein of interest been observed?" can currently not be answered with realistic effort, due to a lack of standardised reporting and collection of proteomics data. The Proteomics Standards Initiative (PSI), a work group of the Human Proteome Organisation (HUPO), defines community standards for data representation in proteomics to facilitate systematic data capture, comparison, exchange and verification. In this article we provide an overview of PSI organisational structure, activities, and current results, as well as ways to get involved in the broad-based, open PSI process.
Jorge, Inmaculada; Navarro, Pedro; Martínez-Acedo, Pablo; Núñez, Estefanía; Serrano, Horacio; Alfranca, Arántzazu; Redondo, Juan Miguel; Vázquez, Jesús
2009-01-01
Statistical models for the analysis of protein expression changes by stable isotope labeling are still poorly developed, particularly for data obtained by 16O/18O labeling. Moreover, large-scale test experiments to validate the null hypothesis are lacking. Although the study of mechanisms underlying biological actions promoted by vascular endothelial growth factor (VEGF) on endothelial cells is of considerable interest, quantitative proteomics studies on this subject are scarce and have been performed after exposing cells to the factor for long periods of time. In this work we present the largest quantitative proteomics study to date on the short term effects of VEGF on human umbilical vein endothelial cells by 18O/16O labeling. Current statistical models based on normality and variance homogeneity were found unsuitable to describe the null hypothesis in a large scale test experiment performed on these cells, producing false expression changes. A random effects model was developed including four different sources of variance at the spectrum-fitting, scan, peptide, and protein levels. With the new model the number of outliers at scan and peptide levels was negligible in three large scale experiments, and only one false protein expression change was observed in the test experiment among more than 1000 proteins. The new model allowed the detection of significant protein expression changes upon VEGF stimulation for 4 and 8 h. The consistency of the changes observed at 4 h was confirmed by a replica at a smaller scale and further validated by Western blot analysis of some proteins. Most of the observed changes have not been described previously and are consistent with a pattern of protein expression that dynamically changes over time following the evolution of the angiogenic response. With this statistical model the 18O labeling approach emerges as a very promising and robust alternative to perform quantitative proteomics studies at a depth of several thousand proteins. PMID:19181660
Large scale systematic proteomic quantification from non-metastatic to metastatic colorectal cancer
NASA Astrophysics Data System (ADS)
Yin, Xuefei; Zhang, Yang; Guo, Shaowen; Jin, Hong; Wang, Wenhai; Yang, Pengyuan
2015-07-01
A large-scale, systematic proteomic quantification of formalin-fixed, paraffin-embedded (FFPE) colorectal cancer tissues from stage I to stage IIIC was performed. By the label-free method, 1017 proteins were identified, of which 338 showed quantitative changes, while 341 of 6294 proteins showed significant expression changes by the iTRAQ method. According to gene ontology (GO) annotation and ingenuity pathway analysis (IPA), the expression of migration-related proteins increased, and that of binding- and adhesion-related proteins decreased, during colorectal cancer development. We focused on integrin alpha 5 (ITA5) of the integrin family, which was consistent with the metastasis-related pathway. The expression level of ITA5 decreased in metastatic tissues, and this result was further verified by Western blotting. Two other cell migration-related proteins, vitronectin (VTN) and actin-related protein (ARP3), were also shown to be up-regulated by both mass spectrometry (MS)-based quantification and Western blotting. To date, our results represent one of the largest datasets in colorectal cancer proteomics research. Our strategy reveals a disease-driven omics pattern for metastatic colorectal cancer.
Takemori, Nobuaki; Takemori, Ayako; Tanaka, Yuki; Endo, Yaeta; Hurst, Jane L.; Gómez-Baena, Guadalupe; Harman, Victoria M.; Beynon, Robert J.
2017-01-01
A major challenge in proteomics is the absolute accurate quantification of large numbers of proteins. QconCATs, artificial proteins that are concatenations of multiple standard peptides, are well established as an efficient means to generate standards for proteome quantification. Previously, QconCATs have been expressed in bacteria, but we now describe QconCAT expression in a robust, cell-free system. The new expression approach rescues QconCATs that previously were unable to be expressed in bacteria and can reduce the incidence of proteolytic damage to QconCATs. Moreover, it is possible to cosynthesize QconCATs in a highly-multiplexed translation reaction, coexpressing tens or hundreds of QconCATs simultaneously. By obviating bacterial culture and through the gain of high level multiplexing, it is now possible to generate tens of thousands of standard peptides in a matter of weeks, rendering absolute quantification of a complex proteome highly achievable in a reproducible, broadly deployable system. PMID:29055021
Schwämmle, Veit; León, Ileana Rodríguez; Jensen, Ole Nørregaard
2013-09-06
Large-scale quantitative analyses of biological systems are often performed with few replicate experiments, leading to multiple nonidentical data sets due to missing values. For example, mass spectrometry driven proteomics experiments are frequently performed with few biological or technical replicates due to sample-scarcity or due to duty-cycle or sensitivity constraints, or limited capacity of the available instrumentation, leading to incomplete results where detection of significant feature changes becomes a challenge. This problem is further exacerbated for the detection of significant changes on the peptide level, for example, in phospho-proteomics experiments. In order to assess the extent of this problem and the implications for large-scale proteome analysis, we investigated and optimized the performance of three statistical approaches by using simulated and experimental data sets with varying numbers of missing values. We applied three tools, including standard t test, moderated t test, also known as limma, and rank products for the detection of significantly changing features in simulated and experimental proteomics data sets with missing values. The rank product method was improved to work with data sets containing missing values. Extensive analysis of simulated and experimental data sets revealed that the performance of the statistical analysis tools depended on simple properties of the data sets. High-confidence results were obtained by using the limma and rank products methods for analyses of triplicate data sets that exhibited more than 1000 features and more than 50% missing values. The maximum number of differentially represented features was identified by using limma and rank products methods in a complementary manner. We therefore recommend combined usage of these methods as a novel and optimal way to detect significantly changing features in these data sets. 
This approach is suitable for large quantitative data sets from stable isotope labeling and mass spectrometry experiments and should be applicable to large data sets of any type. An R script that implements the improved rank products algorithm and the combined analysis is available.
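As a rough illustration of how a rank product statistic can tolerate missing values, the following minimal Python sketch ranks features within each replicate using only the observed values, scales each rank by the number of observations in that replicate, and takes the geometric mean over whatever replicates are available. This is one plausible adaptation, not the authors' R implementation; the function name `rank_products` and the rank-scaling rule are assumptions made for the example, and no significance estimation is included.

```python
import math


def rank_products(log_ratios):
    """Rank product per feature, tolerating missing values (None).

    log_ratios: list of features, each a list of per-replicate log ratios.
    Assumption of this sketch: ranks are divided by the number of observed
    values in the replicate so replicates with different coverage are comparable.
    Returns one value per feature (None if the feature was never observed).
    """
    n_features = len(log_ratios)
    n_reps = len(log_ratios[0]) if log_ratios else 0
    rel_ranks = [[None] * n_reps for _ in range(n_features)]

    for r in range(n_reps):
        # Collect only the features observed in this replicate.
        observed = [(log_ratios[f][r], f) for f in range(n_features)
                    if log_ratios[f][r] is not None]
        # Largest log ratio gets rank 1 (up-regulation); ties keep input order.
        observed.sort(key=lambda t: t[0], reverse=True)
        for rank, (_, f) in enumerate(observed, start=1):
            rel_ranks[f][r] = rank / len(observed)

    result = []
    for f in range(n_features):
        ranks = [x for x in rel_ranks[f] if x is not None]
        if not ranks:
            result.append(None)
        else:
            # Geometric mean over the replicates where the feature was seen.
            result.append(math.exp(sum(math.log(x) for x in ranks) / len(ranks)))
    return result


# Hypothetical triplicate data with missing values.
example = [
    [2.0, 1.8, None],
    [0.1, None, 0.2],
    [-1.0, -0.5, -0.8],
    [None, None, None],
]
rp = rank_products(example)
```

Smaller values indicate features that rank consistently near the top; in practice a permutation or analytical null distribution would be needed to turn these into p-values.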
MAPU: Max-Planck Unified database of organellar, cellular, tissue and body fluid proteomes.
Zhang, Yanling; Zhang, Yong; Adachi, Jun; Olsen, Jesper V; Shi, Rong; de Souza, Gustavo; Pasini, Erica; Foster, Leonard J; Macek, Boris; Zougman, Alexandre; Kumar, Chanchal; Wisniewski, Jacek R; Jun, Wang; Mann, Matthias
2007-01-01
Mass spectrometry (MS)-based proteomics has become a powerful technology to map the protein composition of organelles, cell types and tissues. In our department, a large-scale effort to map these proteomes is complemented by the Max-Planck Unified (MAPU) proteome database. MAPU contains several body fluid proteomes, including plasma, urine, and cerebrospinal fluid. Cell lines have been mapped to a depth of several thousand proteins and the red blood cell proteome has also been analyzed in depth. The liver proteome is represented with 3200 proteins. By employing high resolution MS and stringent validation criteria, false positive identification rates in MAPU are lower than 1:1000. Thus MAPU datasets can serve as reference proteomes in biomarker discovery. MAPU contains the peptides identifying each protein, measured masses, scores and intensities and is freely available at http://www.mapuproteome.com using a clickable interface of cell or body parts. Proteome data can be queried across proteomes by protein name, accession number, sequence similarity, peptide sequence and annotation information. More than 4500 mouse and 2500 human proteins have already been identified in at least one proteome. Basic annotation information and links to other public databases are provided in MAPU and we plan to add further analysis tools.
Proteome Characterization of Leaves in Common Bean
Robison, Faith M.; Heuberger, Adam L.; Brick, Mark A.; Prenni, Jessica E.
2015-01-01
Dry edible bean (Phaseolus vulgaris L.) is a globally relevant food crop. The bean genome was recently sequenced and annotated allowing for proteomics investigations aimed at characterization of leaf phenotypes important to agriculture. The objective of this study was to utilize a shotgun proteomics approach to characterize the leaf proteome and to identify protein abundance differences between two bean lines with known variation in their physiological resistance to biotic stresses. Overall, 640 proteins were confidently identified. Among these are proteins known to be involved in a variety of molecular functions including oxidoreductase activity, binding peroxidase activity, and hydrolase activity. Twenty nine proteins were found to significantly vary in abundance (p-value < 0.05) between the two bean lines, including proteins associated with biotic stress. To our knowledge, this work represents the first large scale shotgun proteomic analysis of beans and our results lay the groundwork for future studies designed to investigate the molecular mechanisms involved in pathogen resistance. PMID:28248269
Automation, parallelism, and robotics for proteomics.
Alterovitz, Gil; Liu, Jonathan; Chow, Jijun; Ramoni, Marco F
2006-07-01
The speed of the human genome project (Lander, E. S., Linton, L. M., Birren, B., Nusbaum, C. et al., Nature 2001, 409, 860-921) was made possible, in part, by developments in automation of sequencing technologies. Before these technologies, sequencing was a laborious, expensive, and personnel-intensive task. Similarly, automation and robotics are changing the field of proteomics today. Proteomics is defined as the effort to understand and characterize proteins in the categories of structure, function and interaction (Englbrecht, C. C., Facius, A., Comb. Chem. High Throughput Screen. 2005, 8, 705-715). As such, this field nicely lends itself to automation technologies since these methods often require large economies of scale in order to achieve cost and time-saving benefits. This article describes some of the technologies and methods being applied in proteomics in order to facilitate automation within the field as well as in linking proteomics-based information with other related research areas.
Malmström, Erik; Kilsgård, Ola; Hauri, Simon; Smeds, Emanuel; Herwald, Heiko; Malmström, Lars; Malmström, Johan
2016-01-01
The plasma proteome is highly dynamic and variable, composed of proteins derived from surrounding tissues and cells. To investigate the complex processes that control the composition of the plasma proteome, we developed a mass spectrometry-based proteomics strategy to infer the origin of proteins detected in murine plasma. The strategy relies on the construction of a comprehensive protein tissue atlas from cells and highly vascularized organs using shotgun mass spectrometry. The protein tissue atlas was transformed to a spectral library for highly reproducible quantification of tissue-specific proteins directly in plasma using SWATH-like data-independent mass spectrometry analysis. We show that the method can determine drastic changes of tissue-specific protein profiles in blood plasma from mouse animal models with sepsis. The strategy can be extended to several other species advancing our understanding of the complex processes that contribute to the plasma proteome dynamics. PMID:26732734
A Method for Label-Free, Differential Top-Down Proteomics.
Ntai, Ioanna; Toby, Timothy K; LeDuc, Richard D; Kelleher, Neil L
2016-01-01
Biomarker discovery in translational research has relied heavily on labeled and label-free quantitative bottom-up proteomics. Here, we describe a new approach to biomarker studies that utilizes high-throughput top-down proteomics and is the first to offer whole protein characterization and relative quantitation within the same experiment. Using yeast as a model, we report procedures for a label-free approach to quantify the relative abundance of intact proteins ranging from 0 to 30 kDa in two different states. In this chapter, we describe the integrated methodology for the large-scale profiling and quantitation of the intact proteome by liquid chromatography-mass spectrometry (LC-MS) without the need for metabolic or chemical labeling. This recent advance for quantitative top-down proteomics is best implemented with a robust and highly controlled sample preparation workflow before data acquisition on a high-resolution mass spectrometer, and the application of a hierarchical linear statistical model to account for the multiple levels of variance contained in quantitative proteomic comparisons of samples for basic and clinical research.
Recent advances in stable isotope labeling based techniques for proteome relative quantification.
Zhou, Yuan; Shan, Yichu; Zhang, Lihua; Zhang, Yukui
2014-10-24
The large-scale relative quantification of all proteins expressed in biological samples under different states is of great importance for discovering proteins with important biological functions, as well as for screening disease-related biomarkers and drug targets. Therefore, the accurate quantification of proteins at the proteome level has become one of the key issues in protein science. Herein, recent advances in stable isotope labeling based techniques for proteome relative quantification are reviewed, covering metabolic labeling, chemical labeling and enzyme-catalyzed labeling. Furthermore, future research directions in this field are discussed. Copyright © 2014 Elsevier B.V. All rights reserved.
High throughput profile-profile based fold recognition for the entire human proteome.
McGuffin, Liam J; Smith, Richard T; Bryson, Kevin; Sørensen, Søren-Aksel; Jones, David T
2006-06-07
In order to maintain the most comprehensive structural annotation databases we must carry out regular updates for each proteome using the latest profile-profile fold recognition methods. The ability to carry out these updates on demand is necessary to keep pace with the regular updates of sequence and structure databases. Providing the highest quality structural models requires the most intensive profile-profile fold recognition methods running with the very latest available sequence databases and fold libraries. However, running these methods on such a regular basis for every sequenced proteome requires large amounts of processing power. In this paper we describe and benchmark the JYDE (Job Yield Distribution Environment) system, which is a meta-scheduler designed to work above cluster schedulers, such as Sun Grid Engine (SGE) or Condor. We demonstrate the ability of JYDE to distribute the load of genomic-scale fold recognition across multiple independent Grid domains. We use the most recent profile-profile version of our mGenTHREADER software in order to annotate the latest version of the Human proteome against the latest sequence and structure databases in as short a time as possible. We show that our JYDE system is able to scale to large numbers of intensive fold recognition jobs running across several independent computer clusters. Using our JYDE system we have been able to annotate 99.9% of the protein sequences within the Human proteome in less than 24 hours, by harnessing over 500 CPUs from 3 independent Grid domains. This study clearly demonstrates the feasibility of carrying out on demand high quality structural annotations for the proteomes of major eukaryotic organisms. Specifically, we have shown that it is now possible to provide complete regular updates of profile-profile based fold recognition models for entire eukaryotic proteomes, through the use of Grid middleware such as JYDE.
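The idea of a meta-scheduler that spreads jobs over several independent clusters can be illustrated with a small Python sketch. This is not JYDE's algorithm, which the abstract does not describe; it is a generic greedy heuristic in which each job goes to the cluster with the lowest current load per CPU. The function name `dispatch`, the job cost estimates, and the cluster names are all hypothetical.

```python
import heapq


def dispatch(job_costs, cluster_cpus):
    """Greedy load balancing across clusters (illustrative only).

    job_costs: list of estimated CPU-hours per job (hypothetical estimates).
    cluster_cpus: dict mapping cluster name -> number of CPUs.
    Returns a dict mapping cluster name -> list of job indices.
    """
    # Heap entries are (accumulated load per CPU, cluster name).
    heap = [(0.0, name) for name in sorted(cluster_cpus)]
    heapq.heapify(heap)
    assignment = {name: [] for name in cluster_cpus}

    # Place the most expensive jobs first; equal costs keep input order.
    order = sorted(range(len(job_costs)), key=lambda i: -job_costs[i])
    for i in order:
        load, name = heapq.heappop(heap)
        assignment[name].append(i)
        # A cluster with more CPUs absorbs the same job with less added load.
        heapq.heappush(heap, (load + job_costs[i] / cluster_cpus[name], name))
    return assignment


# Hypothetical example: four equal jobs, one small and one larger cluster.
plan = dispatch([4, 4, 4, 4], {"a": 1, "b": 3})
```

A real meta-scheduler would also have to handle queue state, failures, and data staging on each underlying scheduler, none of which is modeled here.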
Spermatogenesis in mammals: proteomic insights.
Chocu, Sophie; Calvel, Pierre; Rolland, Antoine D; Pineau, Charles
2012-08-01
Spermatogenesis is a highly sophisticated process involved in the transmission of genetic heritage. It includes halving ploidy, repackaging of the chromatin for transport, and the equipment of developing spermatids and eventually spermatozoa with the advanced apparatus (e.g., tightly packed mitochondrial sheath in the mid piece, elongation of the tail, reduction of cytoplasmic volume) to elicit motility once they reach the epididymis. Mammalian spermatogenesis is divided into three phases. In the first the primitive germ cells or spermatogonia undergo a series of mitotic divisions. In the second the spermatocytes undergo two consecutive divisions in meiosis to produce haploid spermatids. In the third the spermatids differentiate into spermatozoa in a process called spermiogenesis. Paracrine, autocrine, juxtacrine, and endocrine pathways all contribute to the regulation of the process. The array of structural elements and chemical factors modulating somatic and germ cell activity is such that the network linking the various cellular activities during spermatogenesis is unimaginably complex. Over the past two decades, advances in genomics have greatly improved our knowledge of spermatogenesis, by identifying numerous genes essential for the development of functional male gametes. Large-scale analyses of testicular function have deepened our insight into normal and pathological spermatogenesis. Progress in genome sequencing and microarray technology has been exploited for genome-wide expression studies, leading to the identification of hundreds of genes differentially expressed within the testis. However, although proteomics has now come of age, the proteomics-based investigation of spermatogenesis remains in its infancy. Here, we review the state-of-the-art of large-scale proteomic analyses of spermatogenesis, from germ cell development during sex determination to spermatogenesis in the adult. 
Indeed, a few laboratories have undertaken differential protein profiling expression studies and/or systematic analyses of testicular proteomes in entire organs or isolated cells from various species. We consider the pros and cons of proteomics for studying the testicular germ cell gene expression program. Finally, we address the use of protein datasets, through integrative genomics (i.e., combining genomics, transcriptomics, and proteomics), bioinformatics, and modelling.
Next-Generation Proteomics and Its Application to Clinical Breast Cancer Research.
Mardamshina, Mariya; Geiger, Tamar
2017-10-01
Proteomics technology aims to map the protein landscapes of biological samples, and it can be applied to a variety of samples, including cells, tissues, and body fluids. Because the proteins are the main functional molecules in the cells, their levels reflect much more accurately the cellular phenotype and the regulatory processes within them than gene levels, mutations, and even mRNA levels. With the advancement in the technology, it is possible now to obtain comprehensive views of the biological systems and to study large patient cohorts in a streamlined manner. In this review we discuss the technological advancements in mass spectrometry-based proteomics, which allow analysis of breast cancer tissue samples, leading to the first large-scale breast cancer proteomics studies. Furthermore, we discuss the technological developments in blood-based biomarker discovery, which provide the basis for future development of assays for routine clinical use. Although these are only the first steps in implementation of proteomics into the clinic, extensive collaborative work between these worlds will undoubtedly lead to major discoveries and advances in clinical practice. Copyright © 2017 American Society for Investigative Pathology. Published by Elsevier Inc. All rights reserved.
The online Tabloid Proteome: an annotated database of protein associations
Turan, Demet; Tavernier, Jan
2018-01-01
A complete knowledge of the proteome can only be attained by determining the associations between proteins, along with the nature of these associations (e.g. physical contact in protein–protein interactions, participation in complex formation or different roles in the same pathway). Despite extensive efforts in elucidating direct protein interactions, our knowledge on the complete spectrum of protein associations remains limited. We therefore developed a new approach that detects protein associations from identifications obtained after re-processing of large-scale, public mass spectrometry-based proteomics data. Our approach infers protein association based on the co-occurrence of proteins across many different proteomics experiments, and provides information that is almost completely complementary to traditional direct protein interaction studies. We here present a web interface to query and explore the associations derived from this method, called the online Tabloid Proteome. The online Tabloid Proteome also integrates biological knowledge from several existing resources to annotate our derived protein associations. The online Tabloid Proteome is freely available through a user-friendly web interface, which provides intuitive navigation and data exploration options for the user at http://iomics.ugent.be/tabloidproteome. PMID:29040688
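Inferring associations from co-occurrence across experiments can be sketched in a few lines of Python. The abstract does not state which co-occurrence measure is used, so the Jaccard index below, the `min_shared` cutoff, and the function name `cooccurrence_scores` are assumptions chosen only to make the idea concrete.

```python
from itertools import combinations


def cooccurrence_scores(experiments, min_shared=2):
    """Score protein pairs by how often they are identified together.

    experiments: iterable of sets of protein identifiers, one set per experiment.
    min_shared: minimum number of shared experiments for a pair to be reported
                (an assumed cutoff for this sketch).
    Returns {(protein_a, protein_b): Jaccard index of their experiment sets}.
    """
    # Map each protein to the set of experiment indices in which it was seen.
    seen_in = {}
    for idx, proteins in enumerate(experiments):
        for p in proteins:
            seen_in.setdefault(p, set()).add(idx)

    scores = {}
    for a, b in combinations(sorted(seen_in), 2):
        shared = seen_in[a] & seen_in[b]
        if len(shared) >= min_shared:
            scores[(a, b)] = len(shared) / len(seen_in[a] | seen_in[b])
    return scores


# Hypothetical identifications from four experiments.
runs = [{"P1", "P2", "P3"}, {"P1", "P2"}, {"P2", "P3"}, {"P1", "P2", "P4"}]
pairs = cooccurrence_scores(runs)
```

The all-pairs loop is quadratic in the number of proteins, so a repository-scale analysis would need an inverted index or sparse matrix arithmetic instead.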
Offermann, Sascha; Friso, Giulia; Doroshenk, Kelly A; Sun, Qi; Sharpe, Richard M; Okita, Thomas W; Wimmer, Diana; Edwards, Gerald E; van Wijk, Klaas J
2015-05-01
Kranz C4 species strictly depend on separation of primary and secondary carbon fixation reactions in different cell types. In contrast, the single-cell C4 (SCC4) species Bienertia sinuspersici utilizes intracellular compartmentation including two physiologically and biochemically different chloroplast types; however, information on identity, localization, and induction of proteins required for this SCC4 system is currently very limited. In this study, we determined the distribution of photosynthesis-related proteins and the induction of the C4 system during development by label-free proteomics of subcellular fractions and leaves of different developmental stages. This was enabled by inferring a protein sequence database from 454 sequencing of Bienertia cDNAs. Large-scale proteome rearrangements were observed as C4 photosynthesis developed during leaf maturation. The proteomes of the two chloroplasts are different with differential accumulation of linear and cyclic electron transport components, primary and secondary carbon fixation reactions, and a triose-phosphate shuttle that is shared between the two chloroplast types. This differential protein distribution pattern suggests the presence of a mRNA or protein-sorting mechanism for nuclear-encoded, chloroplast-targeted proteins in SCC4 species. The combined information was used to provide a comprehensive model for NAD-ME type carbon fixation in SCC4 species.
Comparative bioinformatics analyses and profiling of lysosome-related organelle proteomes
NASA Astrophysics Data System (ADS)
Hu, Zhang-Zhi; Valencia, Julio C.; Huang, Hongzhan; Chi, An; Shabanowitz, Jeffrey; Hearing, Vincent J.; Appella, Ettore; Wu, Cathy
2007-01-01
Complete and accurate profiling of cellular organelle proteomes, while challenging, is important for the understanding of detailed cellular processes at the organelle level. Mass spectrometry technologies coupled with bioinformatics analysis provide an effective approach for protein identification and functional interpretation of organelle proteomes. In this study, we have compiled human organelle reference datasets from large-scale proteomic studies and protein databases for seven lysosome-related organelles (LROs), as well as the endoplasmic reticulum and mitochondria, for comparative organelle proteome analysis. Heterogeneous sources of human organelle proteins and rodent homologs are mapped to human UniProtKB protein entries based on ID and/or peptide mappings, followed by functional annotation and categorization using the iProXpress proteomic expression analysis system. Cataloging organelle proteomes allows close examination of both shared and unique proteins among various LROs and reveals their functional relevance. The proteomic comparisons show that LROs are a closely related family of organelles. The shared proteins indicate the dynamic and hybrid nature of LROs, while the unique transmembrane proteins may represent additional candidate marker proteins for LROs. This comparative analysis, therefore, provides a basis for hypothesis formulation and experimental validation of organelle proteins and their functional roles.
Frequently Asked Questions about Genetic and Genomic Science
... of the new genetic and genomic techniques and technologies? Proteomics The suffix "-ome" comes from the Greek ... pharmacogenomics is one of the large-scale "omic" technologies, it can examine the entirety of the genome, ...
Cell-free protein synthesis: applications in proteomics and biotechnology.
He, Mingyue
2008-01-01
Protein production is one of the key steps in biotechnology and functional proteomics. Expression of proteins in heterologous hosts (such as in E. coli) is generally lengthy and costly. Cell-free protein synthesis is thus emerging as an attractive alternative. In addition to the simplicity and speed for protein production, cell-free expression allows generation of functional proteins that are difficult to produce by in vivo systems. Recent exploitation of cell-free systems enables novel development of technologies for rapid discovery of proteins with desirable properties from very large libraries. This article reviews the recent development in cell-free systems and their application in the large scale protein analysis.
Computer aided manual validation of mass spectrometry-based proteomic data.
Curran, Timothy G; Bryson, Bryan D; Reigelhaupt, Michael; Johnson, Hannah; White, Forest M
2013-06-15
Advances in mass spectrometry-based proteomic technologies have increased the speed of analysis and the depth provided by a single analysis. Computational tools to evaluate the accuracy of peptide identifications from these high-throughput analyses have not kept pace with technological advances; currently the most common quality evaluation methods are based on statistical analysis of the likelihood of false positive identifications in large-scale data sets. While helpful, these calculations do not consider the accuracy of each identification, thus creating a precarious situation for biologists relying on the data to inform experimental design. Manual validation is the gold standard approach to confirm accuracy of database identifications, but is extremely time-intensive. To palliate the increasing time required to manually validate large proteomic datasets, we provide computer aided manual validation software (CAMV) to expedite the process. Relevant spectra are collected, catalogued, and pre-labeled, allowing users to efficiently judge the quality of each identification and summarize applicable quantitative information. CAMV significantly reduces the burden associated with manual validation and will hopefully encourage broader adoption of manual validation in mass spectrometry-based proteomics. Copyright © 2013 Elsevier Inc. All rights reserved.
Ubiquitinated Proteome: Ready for Global?*
Shi, Yi; Xu, Ping; Qin, Jun
2011-01-01
Ubiquitin (Ub) is a small and highly conserved protein that can covalently modify protein substrates. Ubiquitination is one of the major post-translational modifications that regulate a broad spectrum of cellular functions. The advancement of mass spectrometers as well as the development of new affinity purification tools has greatly expedited proteome-wide analysis of several post-translational modifications (e.g. phosphorylation, glycosylation, and acetylation). In contrast, large-scale profiling of lysine ubiquitination remains a challenge. Most recently, new Ub affinity reagents such as Ub remnant antibody and tandem Ub binding domains have been developed, allowing for relatively large-scale detection of several hundreds of lysine ubiquitination events in human cells. Here we review different strategies for the identification of ubiquitination site and discuss several issues associated with data analysis. We suggest that careful interpretation and orthogonal confirmation of MS spectra is necessary to minimize false positive assignments by automatic searching algorithms. PMID:21339389
Tools for phospho- and glycoproteomics of plasma membranes.
Wiśniewski, Jacek R
2011-07-01
Analysis of plasma membrane proteins and their posttranslational modifications is considered important for the identification of disease markers and targets for drug treatment. Because of their insolubility in water, studying plasma membrane proteins by mass spectrometry was long difficult. Recent technological developments in sample preparation together with important improvements in mass spectrometric analysis have facilitated analysis of these proteins and their posttranslational modifications. Now, large scale proteomic analyses allow identification of thousands of membrane proteins from minute amounts of sample. Optimized protocols for affinity enrichment of phosphorylated and glycosylated peptides have set new dimensions in the depth of characterization of these posttranslational modifications of plasma membrane proteins. Here, I summarize recent advances in proteomic technology for the characterization of cell surface proteins and their modifications. The focus is on approaches allowing large-scale mapping rather than analytical methods suitable for studying individual proteins or non-complex mixtures.
A Scalable Approach for Protein False Discovery Rate Estimation in Large Proteomic Data Sets.
Savitski, Mikhail M; Wilhelm, Mathias; Hahne, Hannes; Kuster, Bernhard; Bantscheff, Marcus
2015-09-01
Calculating the number of confidently identified proteins and estimating false discovery rate (FDR) is a challenge when analyzing very large proteomic data sets such as entire human proteomes. Biological and technical heterogeneity in proteomic experiments further add to the challenge and there are strong differences in opinion regarding the conceptual validity of a protein FDR and no consensus regarding the methodology for protein FDR determination. There are also limitations inherent to the widely used classic target-decoy strategy that particularly show when analyzing very large data sets and that lead to a strong over-representation of decoy identifications. In this study, we investigated the merits of the classic, as well as a novel target-decoy-based protein FDR estimation approach, taking advantage of a heterogeneous data collection comprised of ∼19,000 LC-MS/MS runs deposited in ProteomicsDB (https://www.proteomicsdb.org). The "picked" protein FDR approach treats target and decoy sequences of the same protein as a pair rather than as individual entities and chooses either the target or the decoy sequence depending on which receives the highest score. We investigated the performance of this approach in combination with q-value based peptide scoring to normalize sample-, instrument-, and search engine-specific differences. The "picked" target-decoy strategy performed best when protein scoring was based on the best peptide q-value for each protein yielding a stable number of true positive protein identifications over a wide range of q-value thresholds. We show that this simple and unbiased strategy eliminates a conceptual issue in the commonly used "classic" protein FDR approach that causes overprediction of false-positive protein identification in large data sets. 
The approach scales from small to very large data sets without losing performance, consistently increases the number of true-positive protein identifications and is readily implemented in proteomics analysis software. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
A Scalable Approach for Protein False Discovery Rate Estimation in Large Proteomic Data Sets
Savitski, Mikhail M.; Wilhelm, Mathias; Hahne, Hannes; Kuster, Bernhard; Bantscheff, Marcus
2015-01-01
Calculating the number of confidently identified proteins and estimating false discovery rate (FDR) is a challenge when analyzing very large proteomic data sets such as entire human proteomes. Biological and technical heterogeneity in proteomic experiments further add to the challenge and there are strong differences in opinion regarding the conceptual validity of a protein FDR and no consensus regarding the methodology for protein FDR determination. There are also limitations inherent to the widely used classic target–decoy strategy that particularly show when analyzing very large data sets and that lead to a strong over-representation of decoy identifications. In this study, we investigated the merits of the classic, as well as a novel target–decoy-based protein FDR estimation approach, taking advantage of a heterogeneous data collection comprised of ∼19,000 LC-MS/MS runs deposited in ProteomicsDB (https://www.proteomicsdb.org). The “picked” protein FDR approach treats target and decoy sequences of the same protein as a pair rather than as individual entities and chooses either the target or the decoy sequence depending on which receives the highest score. We investigated the performance of this approach in combination with q-value based peptide scoring to normalize sample-, instrument-, and search engine-specific differences. The “picked” target–decoy strategy performed best when protein scoring was based on the best peptide q-value for each protein yielding a stable number of true positive protein identifications over a wide range of q-value thresholds. We show that this simple and unbiased strategy eliminates a conceptual issue in the commonly used “classic” protein FDR approach that causes overprediction of false-positive protein identification in large data sets. 
The approach scales from small to very large data sets without losing performance, consistently increases the number of true-positive protein identifications and is readily implemented in proteomics analysis software. PMID:25987413
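The pairing rule described in this abstract can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `picked_protein_fdr`, the "higher score is better" convention, and the tie-breaking rules (a target wins a tie with its own decoy; decoys sort first at equal score) are all hypothetical choices made for the example.

```python
from typing import Dict, List, Tuple


def picked_protein_fdr(
    target_scores: Dict[str, float],
    decoy_scores: Dict[str, float],
) -> List[Tuple[str, float, float]]:
    """Return (protein, score, estimated FDR) for picked targets, best first.

    Assumptions: higher scores are better (e.g. -log10 of the best peptide
    q-value), and each decoy is keyed by the ID of its target protein.
    """
    picked = []  # (score, is_decoy, protein)
    for protein in set(target_scores) | set(decoy_scores):
        t = target_scores.get(protein)
        d = decoy_scores.get(protein)
        # Keep only the better-scoring member of each target/decoy pair.
        if d is None or (t is not None and t >= d):
            picked.append((t, False, protein))
        else:
            picked.append((d, True, protein))

    # Best score first; at equal score decoys come first (conservative).
    picked.sort(key=lambda item: (-item[0], not item[1]))

    results = []
    decoys = targets = 0
    for score, is_decoy, protein in picked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
            # Raw FDR estimate at this score threshold: decoys / targets.
            results.append((protein, score, decoys / targets))
    return results
```

The sketch reports raw decoy/target ratios at each threshold; converting them to monotone q-values is a further step that is left out here.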
Current trends in quantitative proteomics - an update.
Li, H; Han, J; Pan, J; Liu, T; Parker, C E; Borchers, C H
2017-05-01
Proteins can provide insights into biological processes at the functional level, so they are very promising biomarker candidates. The quantification of proteins in biological samples has been routinely used for the diagnosis of diseases and monitoring the treatment. Although large-scale protein quantification in complex samples is still a challenging task, a great amount of effort has been made to advance the technologies that enable quantitative proteomics. Seven years ago, in 2009, we wrote an article about the current trends in quantitative proteomics. In writing this current paper, we realized that, today, we have an even wider selection of potential tools for quantitative proteomics. These tools include new derivatization reagents, novel sampling formats, new types of analyzers and scanning techniques, and recently developed software to assist in assay development and data analysis. In this review article, we will discuss these innovative methods, and their current and potential applications in proteomics. Copyright © 2017 John Wiley & Sons, Ltd.
Trevisan-Silva, Dilza; Bednaski, Aline V.; Fischer, Juliana S.G.; Veiga, Silvio S.; Bandeira, Nuno; Guthals, Adrian; Marchini, Fabricio K.; Leprevost, Felipe V.; Barbosa, Valmir C.; Senff-Ribeiro, Andrea; Carvalho, Paulo C.
2017-01-01
Venoms are a rich source for the discovery of molecules with biotechnological applications, but their analysis is challenging even for state-of-the-art proteomics. Here we report on a large-scale proteomic assessment of the venom of Loxosceles intermedia, the so-called brown spider. Venom was extracted from 200 spiders and fractioned into two aliquots relative to a 10 kDa cutoff mass. Each of these was further fractioned and digested with trypsin (4 h), trypsin (18 h), pepsin (18 h), and chymotrypsin (18 h), then analyzed by MudPIT on an LTQ-Orbitrap XL ETD mass spectrometer fragmenting precursors by CID, HCD, and ETD. Aliquots of undigested samples were also analyzed. Our experimental design allowed us to apply spectral networks, thus enabling us to obtain meta-contig assemblies, and consequently de novo sequencing of practically complete proteins, culminating in a deep proteome assessment of the venom. Data are available via ProteomeXchange, with identifier PXD005523. PMID:28696408
Comparing Simplification Strategies for the Skeletal Muscle Proteome
Geary, Bethany; Young, Iain S.; Cash, Phillip; Whitfield, Phillip D.; Doherty, Mary K.
2016-01-01
Skeletal muscle is a complex tissue that is dominated by the presence of a few abundant proteins. This wide dynamic range can mask the presence of lower abundance proteins, which can be a confounding factor in large-scale proteomic experiments. In this study, we have investigated a number of pre-fractionation methods, at both the protein and peptide level, for the characterization of the skeletal muscle proteome. The analyses revealed that the use of OFFGEL isoelectric focusing yielded the largest number of protein identifications (>750) compared to alternative gel-based and protein equalization strategies. Further, OFFGEL led to a substantial enrichment of a different sub-population of the proteome. Filter-aided sample preparation (FASP), coupled to peptide-level OFFGEL provided more confidence in the results due to a substantial increase in the number of peptides assigned to each protein. The findings presented here support the use of a multiplexed approach to proteome characterization of skeletal muscle, which has a recognized imbalance in the dynamic range of its protein complement. PMID:28248220
Proteomic Profiling of Mitochondrial Enzymes during Skeletal Muscle Aging.
Staunton, Lisa; O'Connell, Kathleen; Ohlendieck, Kay
2011-03-07
Mitochondria are of central importance for energy generation in skeletal muscles. Expression changes or functional alterations in mitochondrial enzymes play a key role during myogenesis, fibre maturation, and various neuromuscular pathologies, as well as natural fibre aging. Mass spectrometry-based proteomics suggests itself as a convenient large-scale and high-throughput approach to catalogue the mitochondrial protein complement and determine global changes during health and disease. This paper gives a brief overview of the relatively new field of mitochondrial proteomics and discusses the findings from recent proteomic surveys of mitochondrial elements in aged skeletal muscles. Changes in the abundance, biochemical activity, subcellular localization, and/or posttranslational modifications in key mitochondrial enzymes might be useful as novel biomarkers of aging. In the long term, this may advance diagnostic procedures, improve the monitoring of disease progression, help in the testing of side effects due to new drug regimes, and enhance our molecular understanding of age-related muscle degeneration.
Mass spectrometry-based biomarker discovery: toward a global proteome index of individuality.
Hawkridge, Adam M; Muddiman, David C
2009-01-01
Biomarker discovery and proteomics have become synonymous with mass spectrometry in recent years. Although this conflation is an injustice to the many essential biomolecular techniques widely used in biomarker-discovery platforms, it underscores the power and potential of contemporary mass spectrometry. Numerous novel and powerful technologies have been developed around mass spectrometry, proteomics, and biomarker discovery over the past 20 years to globally study complex proteomes (e.g., plasma). However, very few large-scale longitudinal studies have been carried out using these platforms to establish the analytical variability relative to true biological variability. The purpose of this review is not to cover exhaustively the applications of mass spectrometry to biomarker discovery, but rather to discuss the analytical methods and strategies that have been developed for mass spectrometry-based biomarker-discovery platforms and to place them in the context of the many challenges and opportunities yet to be addressed.
Pan, Lang; Zhang, Jian; Wang, Junzhi; Yu, Qin; Bai, Lianyang; Dong, Liyao
2017-05-08
American sloughgrass (Beckmannia syzigachne Steud.) is a weed widely distributed in wheat fields of China. In recent years, the evolution of herbicide (fenoxaprop-P-ethyl)-resistant populations has decreased the susceptibility of B. syzigachne. This study compared 4 B. syzigachne populations (3 resistant and 1 susceptible) using iTRAQ to characterize fenoxaprop-P-ethyl resistance in B. syzigachne at the proteomic level. Through searching the UniProt database, 3104 protein species were identified from 13,335 unique peptides. Approximately 2834 protein species were assigned to 23 functional classifications provided by the COG database. Among these, 2299 protein species were assigned to 125 predicted pathways. The resistant biotype contained 8 protein species that changed in abundance relative to the susceptible biotype; they were involved in photosynthesis, oxidative phosphorylation, and fatty acid biosynthesis pathways. In contrast to previous studies comparing only 1 resistant and 1 susceptible population, our use of 3 fenoxaprop-resistant B. syzigachne populations with different genetic backgrounds minimized irrelevant differential expression and eliminated false positives. Therefore, we could more confidently link the differentially expressed proteins to herbicide resistance. Proteomic analysis demonstrated that fenoxaprop-P-ethyl resistance is associated with photosynthetic capacity, a connection that might be related to the target-site mutations in resistant B. syzigachne. This is the first large-scale proteomics study examining herbicide stress responses in different B. syzigachne biotypes. This study has biological relevance because it is the first to employ proteomic analysis for understanding the mechanisms underlying Beckmannia syzigachne herbicide resistance. The plant is a major weed in China and negatively affects crop yield, but has developed considerable resistance to the most common herbicide, fenoxaprop-P-ethyl. 
Through comparisons of resistant and sensitive biotypes, our study identified multiple proteins (involved in photosynthesis, oxidative phosphorylation, and fatty acid biosynthesis) that are putatively linked to B. syzigachne herbicide response. This large-scale proteomics study, sorely lacking in weed science, contributes valuable data that can be applied to more fine-tuned analyses on the functions of specific proteins in herbicide resistance. Copyright © 2017 Elsevier B.V. All rights reserved.
Investigating the Role of Large-Scale Domain Dynamics in Protein-Protein Interactions.
Delaforge, Elise; Milles, Sigrid; Huang, Jie-Rong; Bouvier, Denis; Jensen, Malene Ringkjøbing; Sattler, Michael; Hart, Darren J; Blackledge, Martin
2016-01-01
Intrinsically disordered linkers provide multi-domain proteins with degrees of conformational freedom that are often essential for function. These highly dynamic assemblies represent a significant fraction of all proteomes, and deciphering the physical basis of their interactions represents a considerable challenge. Here we describe the difficulties associated with mapping the large-scale domain dynamics and describe two recent examples where solution state methods, in particular NMR spectroscopy, are used to investigate conformational exchange on very different timescales. PMID:27679800
Linking the proteins--elucidation of proteome-scale networks using mass spectrometry.
Pflieger, Delphine; Gonnet, Florence; de la Fuente van Bentem, Sergio; Hirt, Heribert; de la Fuente, Alberto
2011-01-01
Proteomes are intricate. Typically, thousands of proteins interact through physical association and post-translational modifications (PTMs) to give rise to the emergent functions of cells. Understanding these functions requires one to study proteomes as "systems" rather than collections of individual protein molecules. The abstraction of the interacting proteome to "protein networks" has recently gained much attention, as networks are effective representations, that lose specific molecular details, but provide the ability to see the proteome as a whole. Mostly two aspects of the proteome have been represented by network models: proteome-wide physical protein-protein-binding interactions organized into Protein Interaction Networks (PINs), and proteome-wide PTM relations organized into Protein Signaling Networks (PSNs). Mass spectrometry (MS) techniques have been shown to be essential to reveal both of these aspects on a proteome-wide scale. Techniques such as affinity purification followed by MS have been used to elucidate protein-protein interactions, and MS-based quantitative phosphoproteomics is critical to understand the structure and dynamics of signaling through the proteome. We here review the current state-of-the-art MS-based analytical pipelines for the purpose to characterize proteome-scale networks. Copyright © 2010 Wiley Periodicals, Inc.
Automated image alignment for 2D gel electrophoresis in a high-throughput proteomics pipeline.
Dowsey, Andrew W; Dunn, Michael J; Yang, Guang-Zhong
2008-04-01
The quest for high-throughput proteomics has revealed a number of challenges in recent years. Whilst substantial improvements in automated protein separation with liquid chromatography and mass spectrometry (LC/MS), aka 'shotgun' proteomics, have been achieved, large-scale open initiatives such as the Human Proteome Organization (HUPO) Brain Proteome Project have shown that maximal proteome coverage is only possible when LC/MS is complemented by 2D gel electrophoresis (2-DE) studies. Moreover, both separation methods require automated alignment and differential analysis to relieve the bioinformatics bottleneck and so make high-throughput protein biomarker discovery a reality. The purpose of this article is to describe a fully automatic image alignment framework for the integration of 2-DE into a high-throughput differential expression proteomics pipeline. The proposed method is based on robust automated image normalization (RAIN) to circumvent the drawbacks of traditional approaches. These use symbolic representation at the very early stages of the analysis, which introduces persistent errors due to inaccuracies in modelling and alignment. In RAIN, a third-order volume-invariant B-spline model is incorporated into a multi-resolution schema to correct for geometric and expression inhomogeneity at multiple scales. The normalized images can then be compared directly in the image domain for quantitative differential analysis. Through evaluation against an existing state-of-the-art method on real and synthetically warped 2D gels, the proposed analysis framework demonstrates substantial improvements in matching accuracy and differential sensitivity. High-throughput analysis is established through an accelerated GPGPU (general purpose computation on graphics cards) implementation. Supplementary material, software and images used in the validation are available at http://www.proteomegrid.org/rain/.
Lan, Jiayi; Núñez Galindo, Antonio; Doecke, James; Fowler, Christopher; Martins, Ralph N; Rainey-Smith, Stephanie R; Cominetti, Ornella; Dayon, Loïc
2018-04-06
Over the last two decades, EDTA-plasma has been used as the preferred sample matrix for human blood proteomic profiling. Serum has also been employed widely. Only a few studies have assessed the difference and relevance of the proteome profiles obtained from plasma samples, such as EDTA-plasma or lithium-heparin-plasma, and serum. A more complete evaluation of the use of EDTA-plasma, heparin-plasma, and serum would greatly expand the comprehensiveness of shotgun proteomics of blood samples. In this study, we evaluated the use of heparin-plasma with respect to EDTA-plasma and serum to profile blood proteomes using a scalable automated proteomic pipeline (ASAP²). The use of plasma and serum for mass-spectrometry-based shotgun proteomics was first tested with commercial pooled samples. The proteome coverage consistency and the quantitative performance were compared. Furthermore, protein measurements in EDTA-plasma and heparin-plasma samples were comparatively studied using matched sample pairs from 20 individuals from the Australian Imaging, Biomarkers and Lifestyle (AIBL) Study. We identified 442 proteins in common between EDTA-plasma and heparin-plasma samples. Overall agreement of the relative protein quantification between the sample pairs demonstrated that shotgun proteomics using workflows such as ASAP² is suitable for analyzing heparin-plasma and that this sample type may be considered in large-scale clinical research studies. Moreover, the partial proteome coverage overlaps (e.g., ∼70%) showed that measures from heparin-plasma could be complementary to those obtained from EDTA-plasma.
Prieto, Gorka; Fullaondo, Asier; Rodríguez, Jose A.
2016-01-01
Large-scale sequencing projects are uncovering a growing number of missense mutations in human tumors. Understanding the phenotypic consequences of these alterations represents a formidable challenge. In silico prediction of functionally relevant amino acid motifs disrupted by cancer mutations could provide insight into the potential impact of a mutation, and guide functional tests. We have previously described Wregex, a tool for the identification of potential functional motifs, such as nuclear export signals (NESs), in proteins. Here, we present an improved version that allows motif prediction to be combined with data from large repositories, such as the Catalogue of Somatic Mutations in Cancer (COSMIC), and to be applied to a whole proteome scale. As an example, we have searched the human proteome for candidate NES motifs that could be altered by cancer-related mutations included in the COSMIC database. A subset of the candidate NESs identified was experimentally tested using an in vivo nuclear export assay. A significant proportion of the selected motifs exhibited nuclear export activity, which was abrogated by the COSMIC mutations. In addition, our search identified a cancer mutation that inactivates the NES of the human deubiquitinase USP21, and leads to the aberrant accumulation of this protein in the nucleus. PMID:27174732
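The idea of combining motif prediction with mutation data can be sketched as a regular-expression scan followed by an overlap check. This is only an illustration: the pattern below is a commonly quoted leucine-rich NES-like consensus chosen for the example, not the pattern or scoring used by Wregex, and the function name and test sequence are hypothetical.

```python
import re
from typing import Iterable, List, Tuple

# Illustrative NES-like consensus (an assumption, not Wregex's own pattern).
# The lookahead lets candidate motifs that overlap each other all be reported.
NES_LIKE = re.compile(r"(?=(L.{2,3}[LIVFM].{2,3}L.[LI]))")


def motifs_hit_by_mutations(
    sequence: str, mutated_positions: Iterable[int]
) -> List[Tuple[int, int, str]]:
    """Return (start, end, motif) for candidate motifs that contain a mutation.

    Positions are 1-based and inclusive, as in typical protein annotation.
    """
    positions = set(mutated_positions)
    hits = []
    for match in NES_LIKE.finditer(sequence):
        start = match.start(1) + 1  # convert 0-based index to 1-based
        end = match.end(1)          # exclusive 0-based end == inclusive 1-based
        if any(start <= p <= end for p in positions):
            hits.append((start, end, match.group(1)))
    return hits
```

For example, scanning the made-up sequence `GSNELALKLAGLDINKTE` with a mutation at position 12 reports the single candidate spanning residues 5 to 14; candidates found this way would still need experimental testing, as the abstract describes.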
Large-scale label-free quantitative proteomics of the pea aphid-Buchnera symbiosis.
Poliakov, Anton; Russell, Calum W; Ponnala, Lalit; Hoops, Harold J; Sun, Qi; Douglas, Angela E; van Wijk, Klaas J
2011-06-01
Many insects are nutritionally dependent on symbiotic microorganisms that have tiny genomes and are housed in specialized host cells called bacteriocytes. The obligate symbiosis between the pea aphid Acyrthosiphon pisum and the γ-proteobacterium Buchnera aphidicola (only 584 predicted proteins) is particularly amenable for molecular analysis because the genomes of both partners have been sequenced. To better define the symbiotic relationship between this aphid and Buchnera, we used large-scale, high accuracy tandem mass spectrometry (nanoLC-LTQ-Orbitrap) to identify aphid and Buchnera proteins in the whole aphid body, purified bacteriocytes, isolated Buchnera cells and the residual bacteriocyte fraction. More than 1900 aphid and 400 Buchnera proteins were identified. All enzymes in amino acid metabolism annotated in the Buchnera genome were detected, reflecting the high (68%) coverage of the proteome and supporting the core function of Buchnera in the aphid symbiosis. Transporters mediating the transport of predicted metabolites were present in the bacteriocyte. Label-free spectral counting combined with hierarchical clustering allowed us to define the quantitative distribution of a subset of these proteins across both symbiotic partners, yielding no evidence for the selective transfer of protein among the partners in either direction. This is the first quantitative proteome analysis of bacteriocyte symbiosis, providing a wealth of information about molecular function of both the host cell and bacterial symbiont.
Hermjakob, Henning; Montecchi-Palazzi, Luisa; Bader, Gary; Wojcik, Jérôme; Salwinski, Lukasz; Ceol, Arnaud; Moore, Susan; Orchard, Sandra; Sarkans, Ugis; von Mering, Christian; Roechert, Bernd; Poux, Sylvain; Jung, Eva; Mersch, Henning; Kersey, Paul; Lappe, Michael; Li, Yixue; Zeng, Rong; Rana, Debashis; Nikolski, Macha; Husi, Holger; Brun, Christine; Shanker, K; Grant, Seth G N; Sander, Chris; Bork, Peer; Zhu, Weimin; Pandey, Akhilesh; Brazma, Alvis; Jacq, Bernard; Vidal, Marc; Sherman, David; Legrain, Pierre; Cesareni, Gianni; Xenarios, Ioannis; Eisenberg, David; Steipe, Boris; Hogue, Chris; Apweiler, Rolf
2004-02-01
A major goal of proteomics is the complete description of the protein interaction network underlying cell physiology. A large number of small scale and, more recently, large-scale experiments have contributed to expanding our understanding of the nature of the interaction network. However, the necessary data integration across experiments is currently hampered by the fragmentation of publicly available protein interaction data, which exists in different formats in databases, on authors' websites or sometimes only in print publications. Here, we propose a community standard data model for the representation and exchange of protein interaction data. This data model has been jointly developed by members of the Proteomics Standards Initiative (PSI), a work group of the Human Proteome Organization (HUPO), and is supported by major protein interaction data providers, in particular the Biomolecular Interaction Network Database (BIND), Cellzome (Heidelberg, Germany), the Database of Interacting Proteins (DIP), Dana Farber Cancer Institute (Boston, MA, USA), the Human Protein Reference Database (HPRD), Hybrigenics (Paris, France), the European Bioinformatics Institute's (EMBL-EBI, Hinxton, UK) IntAct, the Molecular Interactions (MINT, Rome, Italy) database, the Protein-Protein Interaction Database (PPID, Edinburgh, UK) and the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING, EMBL, Heidelberg, Germany).
Substrate-Mediated Laser Ablation under Ambient Conditions for Spatially-Resolved Tissue Proteomics
Fatou, Benoit; Wisztorski, Maxence; Focsa, Cristian; Salzet, Michel; Ziskind, Michael; Fournier, Isabelle
2015-01-01
Numerous applications of ambient Mass Spectrometry (MS) have been demonstrated over the past decade. They promoted the emergence of various micro-sampling techniques such as Laser Ablation/Droplet Capture (LADC). LADC consists in the ablation of analytes from a surface and their subsequent capture in a solvent droplet which can then be analyzed by MS. LADC is thus generally performed in the UV or IR range, using a wavelength at which analytes or the matrix absorb. In this work, we explore the potential of visible range LADC (532 nm) as a micro-sampling technology for large-scale proteomics analyses. We demonstrate that biomolecule analyses using 532 nm LADC are possible, despite the low absorbance of biomolecules at this wavelength. This is due to the preponderance of an indirect substrate-mediated ablation mechanism at low laser energy which contrasts with the conventional direct ablation driven by sample absorption. Using our custom LADC system and taking advantage of this substrate-mediated ablation mechanism, we were able to perform large-scale proteomic analyses of micro-sampled tissue sections and demonstrated the possible identification of proteins with relevant biological functions. Consequently, the 532 nm LADC technique offers a new tool for biological and clinical applications. PMID:26674367
Large Scale Proteomic Data and Network-Based Systems Biology Approaches to Explore the Plant World.
Di Silvestre, Dario; Bergamaschi, Andrea; Bellini, Edoardo; Mauri, PierLuigi
2018-06-03
The investigation of plant organisms by means of data-derived systems biology approaches based on network modeling is mainly characterized by genomic data, while the potential of proteomics is largely unexplored. This delay is mainly caused by the paucity of plant genomic/proteomic sequences and annotations which are fundamental to perform mass-spectrometry (MS) data interpretation. However, Next Generation Sequencing (NGS) techniques are contributing to filling this gap and an increasing number of studies are focusing on plant proteome profiling and protein-protein interactions (PPIs) identification. Interesting results were obtained by evaluating the topology of PPI networks in the context of organ-associated biological processes as well as plant-pathogen relationships. These examples foreshadow well the benefits that these approaches may provide to plant research. Thus, in addition to providing an overview of the main omics technologies recently used on plant organisms, we will focus on studies that rely on concepts of module, hub and shortest path, and how they can contribute to the plant discovery processes. In this scenario, we will also consider gene co-expression networks, and some examples of integration with metabolomic data and genome-wide association studies (GWAS) to select candidate genes will be mentioned.
Highly multiplexed targeted proteomics using precise control of peptide retention time.
Gallien, Sebastien; Peterman, Scott; Kiyonami, Reiko; Souady, Jamal; Duriez, Elodie; Schoen, Alan; Domon, Bruno
2012-04-01
Large-scale proteomics applications using SRM analysis on triple quadrupole mass spectrometers present new challenges to LC-MS/MS experimental design. Despite the automation of building large-scale LC-SRM methods, the increased numbers of targeted peptides can compromise the balance between sensitivity and selectivity. To facilitate large target numbers, time-scheduled SRM transition acquisition is performed. Previously published results have demonstrated that incorporation of a well-characterized set of synthetic peptides enabled chromatographic characterization of the elution profile for most endogenous peptides. We have extended this application of peptide trainer kits to not only build SRM methods but to facilitate real-time elution profile characterization that enables automated adjustment of the scheduled detection windows. Incorporation of dynamic retention time adjustments better facilitates targeted assays lasting several days without the need for constant supervision. This paper provides an overview of how the dynamic retention correction approach identifies and corrects for commonly observed LC variations. This adjustment dramatically improves robustness in targeted discovery experiments as well as routine quantification experiments. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
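One simple way to picture the retention time correction described above is a linear recalibration: reference peptides with known expected retention times are observed in the current run, a line is fitted through the (expected, observed) pairs, and every scheduled window is moved to its predicted position. The sketch below assumes such a linear model; the abstract does not state which model the authors use, and the function names and the window representation are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def fit_line(expected: Sequence[float], observed: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of observed = slope * expected + intercept."""
    n = len(expected)
    mean_x = sum(expected) / n
    mean_y = sum(observed) / n
    sxx = sum((x - mean_x) ** 2 for x in expected)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(expected, observed))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def reschedule(
    windows: Dict[str, Tuple[float, float]], slope: float, intercept: float
) -> Dict[str, Tuple[float, float]]:
    """Shift scheduled windows given as {peptide: (center, half_width)} in minutes.

    Returns {peptide: (window_start, window_end)} around the corrected center.
    """
    corrected = {}
    for peptide, (center, half_width) in windows.items():
        new_center = slope * center + intercept
        corrected[peptide] = (new_center - half_width, new_center + half_width)
    return corrected
```

If the reference peptides all elute one minute late, the fit gives a slope of 1 and an intercept of 1, and every window is shifted by one minute without widening it.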
Automated selected reaction monitoring software for accurate label-free protein quantification.
Teleman, Johan; Karlsson, Christofer; Waldemarson, Sofia; Hansson, Karin; James, Peter; Malmström, Johan; Levander, Fredrik
2012-07-06
Selected reaction monitoring (SRM) is a mass spectrometry method with documented ability to quantify proteins accurately and reproducibly using labeled reference peptides. However, the use of labeled reference peptides becomes impractical if large numbers of peptides are targeted and when high flexibility is desired when selecting peptides. We have developed a label-free quantitative SRM workflow that relies on a new automated algorithm, Anubis, for accurate peak detection. Anubis efficiently removes interfering signals from contaminating peptides to estimate the true signal of the targeted peptides. We evaluated the algorithm on a published multisite data set and achieved results in line with manual data analysis. In complex peptide mixtures from whole proteome digests of Streptococcus pyogenes we achieved a technical variability across the entire proteome abundance range of 6.5-19.2%, which was considerably below the total variation across biological samples. Our results show that the label-free SRM workflow with automated data analysis is feasible for large-scale biological studies, opening up new possibilities for quantitative proteomics and systems biology.
MaRaCluster: A Fragment Rarity Metric for Clustering Fragment Spectra in Shotgun Proteomics.
The, Matthew; Käll, Lukas
2016-03-04
Shotgun proteomics experiments generate large amounts of fragment spectra as primary data, normally with high redundancy between and within experiments. Here, we have devised a clustering technique to identify fragment spectra stemming from the same species of peptide. This is a powerful alternative method to traditional search engines for analyzing spectra, specifically useful for larger scale mass spectrometry studies. As an aid in this process, we propose a distance calculation relying on the rarity of experimental fragment peaks, following the intuition that peaks shared by only a few spectra offer more evidence than peaks shared by a large number of spectra. We used this distance calculation and a complete-linkage scheme to cluster data from a recent large-scale mass spectrometry-based study. The clusterings produced by our method have up to 40% more identified peptides for their consensus spectra compared to those produced by the previous state-of-the-art method. We see that our method would advance the construction of spectral libraries as well as serve as a tool for mining large sets of fragment spectra. The source code and Ubuntu binary packages are available at https://github.com/statisticalbiotechnology/maracluster (under an Apache 2.0 license).
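The intuition that rare shared peaks carry more evidence than common ones can be illustrated with a toy score. The sketch below is a simplified stand-in, not the published MaRaCluster metric: it bins peaks by m/z, weights each bin by the log of its inverse frequency across spectra, and sums the weights of the bins two spectra share. The bin width, the weighting, and all function names are assumptions made for the example.

```python
import math
from collections import Counter
from typing import Dict, Iterable, List, Set


def bin_peaks(mz_values: Iterable[float], bin_width: float = 1.0) -> Set[int]:
    """Map fragment m/z values to integer bin indices."""
    return {int(mz / bin_width) for mz in mz_values}


def rarity_weights(spectra_bins: List[Set[int]]) -> Dict[int, float]:
    """Weight each bin by log(N / n_bin): rare bins get large weights."""
    n_spectra = len(spectra_bins)
    counts = Counter(b for bins in spectra_bins for b in bins)
    return {b: math.log(n_spectra / c) for b, c in counts.items()}


def shared_peak_evidence(a: Set[int], b: Set[int], weights: Dict[int, float]) -> float:
    """Sum the rarity weights of the bins that both spectra contain."""
    return sum(weights[x] for x in a & b)
```

A bin present in every spectrum contributes nothing to the score, so two spectra that share only ubiquitous peaks receive no evidence of a common origin. Turning such scores into distances and clustering them with complete linkage, as the abstract describes, would be a separate step.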
ProteoSign: an end-user online differential proteomics statistical analysis platform.
Efstathiou, Georgios; Antonakis, Andreas N; Pavlopoulos, Georgios A; Theodosiou, Theodosios; Divanach, Peter; Trudgian, David C; Thomas, Benjamin; Papanikolaou, Nikolas; Aivaliotis, Michalis; Acuto, Oreste; Iliopoulos, Ioannis
2017-07-03
Profiling of proteome dynamics is crucial for understanding cellular behavior in response to intrinsic and extrinsic stimuli and maintenance of homeostasis. Over the last 20 years, mass spectrometry (MS) has emerged as the most powerful tool for large-scale identification and characterization of proteins. Bottom-up proteomics, the most common MS-based proteomics approach, has always been challenging in terms of data management, processing, analysis and visualization, with modern instruments capable of producing several gigabytes of data out of a single experiment. Here, we present ProteoSign, a freely available web application dedicated to allowing users to perform proteomics differential expression/abundance analysis in a user-friendly and self-explanatory way. Although several non-commercial standalone tools have been developed for post-quantification statistical analysis of proteomics data, most of them are not appealing to end users, as they often require very stringent installation of programming environments, third-party software packages and sometimes further scripting or computer programming. To avoid this bottleneck, we have developed a user-friendly software platform accessible via a web interface in order to enable proteomics laboratories and core facilities to statistically analyse quantitative proteomics data sets in a resource-efficient manner. ProteoSign is available at http://bioinformatics.med.uoc.gr/ProteoSign and the source code at https://github.com/yorgodillo/ProteoSign. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Lindsey, Merry L; Mayr, Manuel; Gomes, Aldrin V; Delles, Christian; Arrell, D Kent; Murphy, Anne M; Lange, Richard A; Costello, Catherine E; Jin, Yu-Fang; Laskowitz, Daniel T; Sam, Flora; Terzic, Andre; Van Eyk, Jennifer; Srinivas, Pothur R
2015-09-01
The year 2014 marked the 20th anniversary of the coining of the term proteomics. The purpose of this scientific statement is to summarize advances over this period that have catalyzed our capacity to address the experimental, translational, and clinical implications of proteomics as applied to cardiovascular health and disease and to evaluate the current status of the field. Key successes that have energized the field are delineated; opportunities for proteomics to drive basic science research, facilitate clinical translation, and establish diagnostic and therapeutic healthcare algorithms are discussed; and challenges that remain to be solved before proteomic technologies can be readily translated from scientific discoveries to meaningful advances in cardiovascular care are addressed. Proteomics is the result of disruptive technologies, namely, mass spectrometry and database searching, which drove protein analysis from 1 protein at a time to protein mixture analyses that enable large-scale analysis of proteins and facilitate paradigm shifts in biological concepts that address important clinical questions. Over the past 20 years, the field of proteomics has matured, yet it is still developing rapidly. The scope of this statement will extend beyond the reaches of a typical review article and offer guidance on the use of next-generation proteomics for future scientific discovery in the basic research laboratory and clinical settings. © 2015 American Heart Association, Inc.
Deng, Ning; Li, Zhenye; Pan, Chao; Duan, Huilong
2015-01-01
The study of complex proteomes places greater demands on mass spectrometry-based quantification methods. In this paper, we present a mass spectrometry label-free quantification tool for complex proteomes, called freeQuant, which effectively integrates quantification with functional analysis. freeQuant consists of two well-integrated modules: label-free quantification and functional analysis with biomedical knowledge. freeQuant supports label-free quantitative analysis which makes full use of tandem mass spectrometry (MS/MS) spectral count, protein sequence length, shared peptides, and ion intensity. It adopts spectral count for quantitative analysis and builds a new method for shared peptides to accurately evaluate the abundance of isoforms. For proteins with low abundance, MS/MS total ion count coupled with spectral count is included to ensure accurate protein quantification. Furthermore, freeQuant supports large-scale functional annotation of complex proteomes. Mitochondrial proteomes from the mouse heart, the mouse liver, and the human heart were used to evaluate the usability and performance of freeQuant. The evaluation showed that the quantitative algorithms implemented in freeQuant can improve the accuracy of quantification with better dynamic range.
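As a rough illustration of the quantities named in this abstract (spectral count, sequence length, shared peptides), the sketch below computes a length-normalized spectral abundance and splits the spectra of shared peptides among candidate proteins in proportion to their unique counts. It is a generic NSAF-style calculation on toy data; the function name, the sharing rule, and the inputs are assumptions for illustration and are not freeQuant's published algorithm.

```python
def normalized_abundance(unique_counts, shared_counts, lengths):
    """Length-normalized spectral abundance with proportional sharing.

    unique_counts: {protein: spectral count from peptides unique to it}
    shared_counts: list of (set_of_proteins, spectral_count) for shared peptides
    lengths:       {protein: sequence length in residues}
    All names and the sharing rule are illustrative assumptions.
    """
    totals = dict(unique_counts)
    for proteins, count in shared_counts:
        weight_sum = sum(unique_counts[p] for p in proteins)
        for p in proteins:
            # Split shared spectra in proportion to unique evidence;
            # fall back to an even split when no protein has unique counts.
            if weight_sum > 0:
                share = count * unique_counts[p] / weight_sum
            else:
                share = count / len(proteins)
            totals[p] += share
    # Divide by length, then rescale so the abundances sum to 1.
    per_length = {p: totals[p] / lengths[p] for p in totals}
    norm = sum(per_length.values())
    return {p: v / norm for p, v in per_length.items()}


# Toy data: two isoforms share a peptide that was observed in 8 spectra.
abundance = normalized_abundance(
    unique_counts={"isoA": 30, "isoB": 10, "other": 20},
    shared_counts=[({"isoA", "isoB"}, 8)],
    lengths={"isoA": 400, "isoB": 400, "other": 200},
)
```

With these toy inputs the shared spectra are split 6:2 between the two isoforms, so the isoforms keep the 3:1 ratio suggested by their unique evidence.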
DelVecchio, Vito G; Wagner, Mary Ann; Eschenbrenner, Michel; Horn, Troy A; Kraycer, Jo Ann; Estock, Frank; Elzer, Phil; Mujer, Cesar V
2002-12-20
The proteomes of selected Brucella spp. have been extensively analyzed by utilizing current proteomic technology involving 2-DE and MALDI-MS. In Brucella melitensis, more than 500 proteins were identified. The rapid and large-scale identification of proteins in this organism was accomplished by using the annotated B. melitensis genome, which is now available in GenBank. Coupled with new and powerful tools for data analysis, differentially expressed proteins were identified and categorized into several classes. A global overview of protein expression patterns emerged, thereby facilitating the simultaneous analysis of different metabolic pathways in B. melitensis. Such a global characterization would not have been possible by using time-consuming traditional biochemical approaches. The era of post-genomic technology offers new and exciting opportunities to understand the complete biology of different Brucella species.
Bladergroen, Marco R.; van der Burgt, Yuri E. M.
2015-01-01
For large-scale and standardized applications in mass spectrometry (MS)-based proteomics, automation of each step is essential. Here we present high-throughput sample preparation solutions for balancing the speed of current MS acquisitions and the time needed for analytical workup of body fluids. The discussed workflows reduce body fluid sample complexity and apply to both bottom-up proteomics experiments and top-down protein characterization approaches. Various sample preparation methods that involve solid-phase extraction (SPE), including affinity enrichment strategies, have been automated. Obtained peptide and protein fractions can be mass analyzed by direct infusion into an electrospray ionization (ESI) source or by means of matrix-assisted laser desorption ionization (MALDI) without further need of time-consuming liquid chromatography (LC) separations. PMID:25692071
Razban, Rostam M; Gilson, Amy I; Durfee, Niamh; Strobelt, Hendrik; Dinkla, Kasper; Choi, Jeong-Mo; Pfister, Hanspeter; Shakhnovich, Eugene I
2018-05-08
Protein evolution spans time scales and its effects span the length of an organism. A web app named ProteomeVis is developed to provide a comprehensive view of protein evolution in the S. cerevisiae and E. coli proteomes. ProteomeVis interactively creates protein chain graphs, where edges between nodes represent structure and sequence similarities within user-defined ranges, to study the long time scale effects of protein structure evolution. The short time scale effects of protein sequence evolution are studied by sequence evolutionary rate (ER) correlation analyses with protein properties that span from the molecular to the organismal level. We demonstrate the utility and versatility of ProteomeVis by investigating the distribution of edges per node in organismal protein chain universe graphs (oPCUGs) and putative ER determinants. S. cerevisiae and E. coli oPCUGs are scale-free with scaling constants of 1.79 and 1.56, respectively. Both scaling constants can be explained by a previously reported theoretical model describing protein structure evolution (Dokholyan et al., 2002). Protein abundance most strongly correlates with ER among properties in ProteomeVis, with Spearman correlations of -0.49 (p-value < 10^-10) and -0.46 (p-value < 10^-10) for S. cerevisiae and E. coli, respectively. This result is consistent with previous reports that found protein expression to be the most important ER determinant (Zhang and Yang, 2015). ProteomeVis is freely accessible at http://proteomevis.chem.harvard.edu. Supplementary data are available at Bioinformatics. Contact: shakhnovich@chemistry.harvard.edu.
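The Spearman correlations quoted above are rank correlations: each variable is replaced by its ranks and the Pearson correlation of the ranks is computed. A minimal pure-Python sketch follows, using made-up toy values rather than any ProteomeVis data; the function names are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Toy values only: abundance rises while evolutionary rate mostly falls.
abundance = [10, 200, 3500, 40000, 900000]
evolutionary_rate = [0.90, 0.70, 0.75, 0.30, 0.10]
rho = spearman(abundance, evolutionary_rate)
```

A negative value of `rho`, as in the abstract, means that more abundant proteins tend to have lower evolutionary rates.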
pyGeno: A Python package for precision medicine and proteogenomics.
Daouda, Tariq; Perreault, Claude; Lemieux, Sébastien
2016-01-01
pyGeno is a Python package mainly intended for precision medicine applications that revolve around genomics and proteomics. It integrates reference sequences and annotations from Ensembl, genomic polymorphisms from the dbSNP database and data from next-gen sequencing into an easy to use, memory-efficient and fast framework, therefore allowing the user to easily explore subject-specific genomes and proteomes. Compared to a standalone program, pyGeno gives the user access to the complete expressivity of Python, a general programming language. Its range of application therefore encompasses both short scripts and large scale genome-wide studies.
pyGeno: A Python package for precision medicine and proteogenomics
Daouda, Tariq; Perreault, Claude; Lemieux, Sébastien
2016-01-01
pyGeno is a Python package mainly intended for precision medicine applications that revolve around genomics and proteomics. It integrates reference sequences and annotations from Ensembl, genomic polymorphisms from the dbSNP database and data from next-gen sequencing into an easy to use, memory-efficient and fast framework, therefore allowing the user to easily explore subject-specific genomes and proteomes. Compared to a standalone program, pyGeno gives the user access to the complete expressivity of Python, a general programming language. Its range of application therefore encompasses both short scripts and large scale genome-wide studies. PMID:27785359
Stable isotope dimethyl labelling for quantitative proteomics and beyond
Hsu, Jue-Liang; Chen, Shu-Hui
2016-01-01
Stable-isotope reductive dimethylation, a cost-effective, simple, robust, reliable and easy-to-multiplex labelling method, is widely applied to quantitative proteomics using liquid chromatography-mass spectrometry. This review focuses on biological applications of stable-isotope dimethyl labelling for a large-scale comparative analysis of protein expression and post-translational modifications based on its unique properties of the labelling chemistry. Some other applications of the labelling method for sample preparation and mass spectrometry-based protein identification and characterization are also summarized. This article is part of the themed issue ‘Quantitative mass spectrometry’. PMID:27644970
Identification of Maturation-Specific Proteins by Single-Cell Proteomics of Human Oocytes
Virant-Klun, Irma; Leicht, Stefan; Hughes, Christopher; Krijgsveld, Jeroen
2016-01-01
Oocytes undergo a range of complex processes via oogenesis, maturation, fertilization, and early embryonic development, eventually giving rise to a fully functioning organism. To understand proteome composition and diversity during maturation of human oocytes, here we have addressed crucial aspects of oocyte collection and proteome analysis, resulting in the first proteome and secretome maps of human oocytes. Starting from 100 oocytes collected via a novel serum-free hanging drop culture system, we identified 2,154 proteins, whose functions indicate that oocytes are largely resting cells with a proteome that is tailored for homeostasis, cellular attachment, and interaction with its environment via secretory factors. In addition, we have identified 158 oocyte-enriched proteins (such as ECAT1, PIWIL3, NLRP7) not observed in high-coverage proteomics studies of other human cell lines or tissues. Exploiting SP3, a novel technology for proteomic sample preparation using magnetic beads, we scaled down proteome analysis to single cells. Despite the low protein content of only ∼100 ng per cell, we consistently identified ∼450 proteins from individual oocytes. When comparing individual oocytes at the germinal vesicle (GV) and metaphase II (MII) stage, we found that the Tudor and KH domain-containing protein (TDRKH) is preferentially expressed in immature oocytes, while Wee2, PCNA, and DNMT1 were enriched in mature cells, collectively indicating that maintenance of genome integrity is crucial during oocyte maturation. This study demonstrates that an innovative proteomics workflow facilitates analysis of single human oocytes to investigate human oocyte biology and preimplantation development. The approach presented here paves the way for quantitative proteomics in other quantity-limited tissues and cell types. Data associated with this study are available via ProteomeXchange with identifier PXD004142. PMID:27215607
Identification of Maturation-Specific Proteins by Single-Cell Proteomics of Human Oocytes.
Virant-Klun, Irma; Leicht, Stefan; Hughes, Christopher; Krijgsveld, Jeroen
2016-08-01
Oocytes undergo a range of complex processes via oogenesis, maturation, fertilization, and early embryonic development, eventually giving rise to a fully functioning organism. To understand proteome composition and diversity during maturation of human oocytes, here we have addressed crucial aspects of oocyte collection and proteome analysis, resulting in the first proteome and secretome maps of human oocytes. Starting from 100 oocytes collected via a novel serum-free hanging drop culture system, we identified 2,154 proteins, whose functions indicate that oocytes are largely resting cells with a proteome that is tailored for homeostasis, cellular attachment, and interaction with its environment via secretory factors. In addition, we have identified 158 oocyte-enriched proteins (such as ECAT1, PIWIL3, NLRP7) not observed in high-coverage proteomics studies of other human cell lines or tissues. Exploiting SP3, a novel technology for proteomic sample preparation using magnetic beads, we scaled down proteome analysis to single cells. Despite the low protein content of only ∼100 ng per cell, we consistently identified ∼450 proteins from individual oocytes. When comparing individual oocytes at the germinal vesicle (GV) and metaphase II (MII) stage, we found that the Tudor and KH domain-containing protein (TDRKH) is preferentially expressed in immature oocytes, while Wee2, PCNA, and DNMT1 were enriched in mature cells, collectively indicating that maintenance of genome integrity is crucial during oocyte maturation. This study demonstrates that an innovative proteomics workflow facilitates analysis of single human oocytes to investigate human oocyte biology and preimplantation development. The approach presented here paves the way for quantitative proteomics in other quantity-limited tissues and cell types. Data associated with this study are available via ProteomeXchange with identifier PXD004142. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
Griss, Johannes; Perez-Riverol, Yasset; Lewis, Steve; Tabb, David L.; Dianes, José A.; del-Toro, Noemi; Rurik, Marc; Walzer, Mathias W.; Kohlbacher, Oliver; Hermjakob, Henning; Wang, Rui; Vizcaíno, Juan Antonio
2016-01-01
Mass spectrometry (MS) is the main technology used in proteomics approaches. However, on average 75% of spectra analysed in an MS experiment remain unidentified. We propose to use spectrum clustering at large scale to shed light on these unidentified spectra. PRoteomics IDEntifications database (PRIDE) Archive is one of the largest MS proteomics public data repositories worldwide. By clustering all tandem MS spectra publicly available in PRIDE Archive, coming from hundreds of datasets, we were able to consistently characterize three distinct groups of spectra: 1) incorrectly identified spectra, 2) spectra correctly identified but below the set scoring threshold, and 3) truly unidentified spectra. Using a multitude of complementary analysis approaches, we were able to identify less than 20% of the consistently unidentified spectra. The complete spectrum clustering results are available through the new version of the PRIDE Cluster resource (http://www.ebi.ac.uk/pride/cluster). This resource is intended, among other aims, to encourage and simplify further investigation into these unidentified spectra. PMID:27493588
Griss, Johannes; Perez-Riverol, Yasset; Lewis, Steve; Tabb, David L; Dianes, José A; Del-Toro, Noemi; Rurik, Marc; Walzer, Mathias W; Kohlbacher, Oliver; Hermjakob, Henning; Wang, Rui; Vizcaíno, Juan Antonio
2016-08-01
Mass spectrometry (MS) is the main technology used in proteomics approaches. However, on average 75% of spectra analysed in an MS experiment remain unidentified. We propose to use spectrum clustering at large scale to shed light on these unidentified spectra. PRoteomics IDEntifications database (PRIDE) Archive is one of the largest MS proteomics public data repositories worldwide. By clustering all tandem MS spectra publicly available in PRIDE Archive, coming from hundreds of datasets, we were able to consistently characterize three distinct groups of spectra: 1) incorrectly identified spectra, 2) spectra correctly identified but below the set scoring threshold, and 3) truly unidentified spectra. Using a multitude of complementary analysis approaches, we were able to identify less than 20% of the consistently unidentified spectra. The complete spectrum clustering results are available through the new version of the PRIDE Cluster resource (http://www.ebi.ac.uk/pride/cluster). This resource is intended, among other aims, to encourage and simplify further investigation into these unidentified spectra.
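To make the idea of spectrum clustering concrete, the sketch below bins each spectrum by m/z, compares spectra by cosine similarity, and groups them greedily against a cluster representative. This is a deliberately simple stand-in under stated assumptions (1 m/z-unit bins, a 0.8 threshold, hypothetical function names, toy peak lists); it is not the algorithm used by the PRIDE Cluster resource.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz / bin_width)] += intensity
    return binned


def cosine_similarity(a, b):
    """Cosine of the angle between two binned spectra."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def greedy_cluster(spectra, threshold=0.8):
    """Assign each spectrum to the first cluster whose representative
    (its first member) is similar enough; otherwise start a new cluster."""
    clusters = []  # list of lists of spectrum indices
    binned = [bin_spectrum(s) for s in spectra]
    for i, spec in enumerate(binned):
        for members in clusters:
            if cosine_similarity(binned[members[0]], spec) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Toy peak lists as (m/z, intensity) pairs.
spectra = [
    [(175.1, 100.0), (304.2, 60.0), (433.2, 80.0)],
    [(175.2, 90.0), (304.1, 70.0), (433.3, 75.0)],   # near-duplicate of the first
    [(147.1, 100.0), (248.2, 50.0), (361.2, 40.0)],  # unrelated
]
clusters = greedy_cluster(spectra)
```

In a setting like the one described, an unidentified spectrum that lands in a cluster dominated by confidently identified spectra becomes a candidate for the same identification, while clusters with no identified members point to consistently unidentified spectra.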
Wu, Si; Brown, Roslyn N.; Payne, Samuel H.; ...
2013-01-01
The periplasm of Gram-negative bacteria is a dynamic and physiologically important subcellular compartment where the constant exposure to potential environmental insults amplifies the need for proper protein folding and modifications. Top-down proteomics analysis of the periplasmic fraction at the intact protein level provides unrestricted characterization and annotation of the periplasmic proteome, including the post-translational modifications (PTMs) on these proteins. Here, we used single-dimension ultra-high pressure liquid chromatography coupled with the Fourier transform mass spectrometry (FTMS) to investigate the intact periplasmic proteome of Novosphingobium aromaticivorans. Our top-down analysis provided the confident identification of 55 proteins in the periplasm and characterized their PTMs including signal peptide removal, N-terminal methionine excision, acetylation, glutathionylation, pyroglutamate, and disulfide bond formation. This study provides the first experimental evidence for the expression and periplasmic localization of many hypothetical and uncharacterized proteins and the first unrestrictive, large-scale data on PTMs in the bacterial periplasm.
Sugahara, Daisuke; Kaji, Hiroyuki; Sugihara, Kazushi; Asano, Masahide; Narimatsu, Hisashi
2012-01-01
Model organisms carrying a deletion or mutation in a glycosyltransferase gene exhibit various physiological abnormalities, suggesting that specific glycan motifs on certain proteins play important roles in vivo. Identification of the target proteins of glycosyltransferase isozymes is key to understanding the roles of glycans. Here, we demonstrated the proteome-scale identification of the target proteins specific for a glycosyltransferase isozyme, β1,4-galactosyltransferase-I (β4GalT-I). Although β4GalT-I is the most characterized glycosyltransferase, its distinctive contribution to β1,4-galactosylation has hardly been described so far. We identified a large number of candidates for the target proteins specific to β4GalT-I by comparative analysis of β4GalT-I-deleted and wild-type mice using the LC/MS-based technique with the isotope-coded glycosylation site-specific tagging (IGOT) of lectin-captured N-glycopeptides. Our approach to identifying the target proteins at the proteome scale reveals common features and trends in the target proteins, which facilitate understanding of the mechanism that controls assembly of a particular glycan motif on specific proteins. PMID:23002422
Galisson, Frederic; Mahrouche, Louiza; Courcelles, Mathieu; Bonneil, Eric; Meloche, Sylvain; Chelbi-Alix, Mounira K.; Thibault, Pierre
2011-01-01
The small ubiquitin-related modifier (SUMO) is a small group of proteins that are reversibly attached to protein substrates to modify their functions. The large scale identification of protein SUMOylation and their modification sites in mammalian cells represents a significant challenge because of the relatively small number of in vivo substrates and the dynamic nature of this modification. We report here a novel proteomics approach to selectively enrich and identify SUMO conjugates from human cells. We stably expressed different SUMO paralogs in HEK293 cells, each containing a His6 tag and a strategically located tryptic cleavage site at the C terminus to facilitate the recovery and identification of SUMOylated peptides by affinity enrichment and mass spectrometry. Tryptic peptides with short SUMO remnants offer significant advantages in large scale SUMOylome experiments including the generation of paralog-specific fragment ions following CID and ETD activation, and the identification of modified peptides using conventional database search engines such as Mascot. We identified 205 unique protein substrates together with 17 precise SUMOylation sites present in 12 SUMO protein conjugates including three new sites (Lys-380, Lys-400, and Lys-497) on the protein promyelocytic leukemia. Label-free quantitative proteomics analyses on purified nuclear extracts from untreated and arsenic trioxide-treated cells revealed that all identified SUMOylated sites of promyelocytic leukemia were differentially SUMOylated upon stimulation. PMID:21098080
Development of proteome-wide binding reagents for research and diagnostics.
Taussig, Michael J; Schmidt, Ronny; Cook, Elizabeth A; Stoevesandt, Oda
2013-12-01
Alongside MS, antibodies and other specific protein-binding molecules have a special place in proteomics as affinity reagents in a toolbox of applications for determining protein location, quantitative distribution and function (affinity proteomics). The realisation that the range of research antibodies available, while apparently vast, is nevertheless still very incomplete and frequently of uncertain quality has stimulated projects with the objective of raising comprehensive, proteome-wide sets of protein binders. With progress in automation and throughput, a remarkable number of recent publications refer to the practical possibility of selecting binders to every protein encoded in the genome. Here we review the requirements of a pipeline of production of protein binders for the human proteome, including target prioritisation, antigen design, 'next generation' methods, databases and the approaches taken by ongoing projects in Europe and the USA. While the task of generating affinity reagents for all human proteins is complex and demanding, the benefits of well-characterised and quality-controlled pan-proteome binder resources for biomedical research, industry and life sciences in general would be enormous and justify the effort. Given the technical, personnel and financial resources needed to fulfil this aim, expansion of current efforts may best be addressed through large-scale international collaboration. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
The emergence of top-down proteomics in clinical research
2013-01-01
Proteomic technology has advanced steadily since the development of 'soft-ionization' techniques for mass-spectrometry-based molecular identification more than two decades ago. Now, the large-scale analysis of proteins (proteomics) is a mainstay of biological research and clinical translation, with researchers seeking molecular diagnostics, as well as protein-based markers for personalized medicine. Proteomic strategies using the protease trypsin (known as bottom-up proteomics) were the first to be developed and optimized and form the dominant approach at present. However, researchers are now beginning to understand the limitations of bottom-up techniques, namely the inability to characterize and quantify intact protein molecules from a complex mixture of digested peptides. To overcome these limitations, several laboratories are taking a whole-protein-based approach, in which intact protein molecules are the analytical targets for characterization and quantification. We discuss these top-down techniques and how they have been applied to clinical research and are likely to be applied in the near future. Given the recent improvements in mass-spectrometry-based proteomics and stronger cooperation between researchers, clinicians and statisticians, both peptide-based (bottom-up) strategies and whole-protein-based (top-down) strategies are set to complement each other and help researchers and clinicians better understand and detect complex disease phenotypes. PMID:23806018
Microbial Interactions in Plants: Perspectives and Applications of Proteomics.
Imam, Jahangir; Shukla, Pratyoosh; Mandal, Nimai Prasad; Variar, Mukund
2017-01-01
The structure and function of proteins involved in plant-microbe interactions are investigated in complex biological samples through large-scale proteomics technology. Because whole genome sequences are now available for several plant species and microbes, proteomic studies have become easier and more accurate, and large amounts of data can be generated and analyzed during plant-microbe interactions. Proteomics approaches are highly relevant in many studies, which have shown that genomics approaches alone are not sufficient: much significant information is lost because the proteins, not the genes encoding them, are the final products responsible for the observed phenotype. Novel approaches in proteomics are continuously being developed, enabling the study of various aspects of the arrangement and configuration of proteins and their functions. With the advancement of new technologies, the application of proteomics to plant-microbe interactions is becoming more common. These approaches are increasingly used to characterize the cellular and extracellular virulence and pathogenicity factors produced by pathogens, and to identify protein-level changes in host plants when they are infected with pathogens or colonized by beneficial partners. This review provides a brief overview of the proteomics technologies currently available, followed by their use in studying plant-microbe interactions. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Keates, Tracy; Cooper, Christopher D O; Savitsky, Pavel; Allerston, Charles K; Phillips, Claire; Hammarström, Martin; Daga, Neha; Berridge, Georgina; Mahajan, Pravin; Burgess-Brown, Nicola A; Müller, Susanne; Gräslund, Susanne; Gileadi, Opher
2012-06-15
The generation of affinity reagents to large numbers of human proteins depends on the ability to express the target proteins as high-quality antigens. The Structural Genomics Consortium (SGC) focuses on the production and structure determination of human proteins. In a 7-year period, the SGC has deposited crystal structures of >800 human protein domains, and has additionally expressed and purified a similar number of protein domains that have not yet been crystallised. The targets include a diversity of protein domains, with an attempt to provide high coverage of protein families. The family approach provides an excellent basis for characterising the selectivity of affinity reagents. We present a summary of the approaches used to generate purified human proteins or protein domains, a test case demonstrating the ability to rapidly generate new proteins, and an optimisation study on the modification of >70 proteins by biotinylation in vivo. These results provide a unique synergy between large-scale structural projects and the recent efforts to produce a wide coverage of affinity reagents to the human proteome. Copyright © 2011 Elsevier B.V. All rights reserved.
Keates, Tracy; Cooper, Christopher D.O.; Savitsky, Pavel; Allerston, Charles K.; Phillips, Claire; Hammarström, Martin; Daga, Neha; Berridge, Georgina; Mahajan, Pravin; Burgess-Brown, Nicola A.; Müller, Susanne; Gräslund, Susanne; Gileadi, Opher
2012-01-01
The generation of affinity reagents to large numbers of human proteins depends on the ability to express the target proteins as high-quality antigens. The Structural Genomics Consortium (SGC) focuses on the production and structure determination of human proteins. In a 7-year period, the SGC has deposited crystal structures of >800 human protein domains, and has additionally expressed and purified a similar number of protein domains that have not yet been crystallised. The targets include a diversity of protein domains, with an attempt to provide high coverage of protein families. The family approach provides an excellent basis for characterising the selectivity of affinity reagents. We present a summary of the approaches used to generate purified human proteins or protein domains, a test case demonstrating the ability to rapidly generate new proteins, and an optimisation study on the modification of >70 proteins by biotinylation in vivo. These results provide a unique synergy between large-scale structural projects and the recent efforts to produce a wide coverage of affinity reagents to the human proteome. PMID:22027370
2013-01-01
Background: Guanine-cytosine (GC) composition is an important feature of genomes. Likewise, amino acid composition is a distinct, but less valued, feature of proteomes. A major concern is that it is not clear what valuable information can be acquired from amino acid composition data. To address this concern, in-depth analyses of the amino acid composition of the complete proteomes from 63 archaea, 270 bacteria, and 128 eukaryotes were performed. Results: Principal component analysis of the amino acid matrices showed that the main contributors to proteomic architecture were genomic GC variation, phylogeny, and environmental influences. GC pressure drove positive selection on Ala, Arg, Gly, Pro, Trp, and Val, and adverse selection on Asn, Lys, Ile, Phe, and Tyr. The physico-chemical framework of the complete proteomes withstood GC pressure by frequency complementation of GC-dependent amino acid pairs with similar physico-chemical properties. Gln, His, Ser, and Val were responsible for phylogeny and their constituted components could differentiate archaea, bacteria, and eukaryotes. Environmental niche was also a significant factor in determining proteomic architecture, especially for archaea for which the main amino acids were Cys, Leu, and Thr. In archaea, hyperthermophiles, acidophiles, mesophiles, psychrophiles, and halophiles gathered successively along the environment-based principal component. Concordance between proteomic architecture and the genetic code was also related closely to genomic GC content, phylogeny, and lifestyles. Conclusions: Large-scale analyses of the complete proteomes of a wide range of organisms suggested that amino acid composition retained the trace of GC variation, phylogeny, and environmental influences during evolution. The findings from this study will help in the development of a global understanding of proteome evolution, and even biological evolution. PMID:24088322
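The amino acid matrices mentioned above have one row per proteome and one column per residue. As a hedged sketch of how a single row could be computed, the function below returns the fraction of each of the 20 standard amino acids across a set of sequences. The function name and the two made-up sequences are illustrative assumptions; the study's actual preprocessing is not described in the abstract.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequences):
    """Fraction of each of the 20 standard residues across all sequences.

    Non-standard characters (for example X or *) are ignored.
    """
    counts = Counter()
    for seq in sequences:
        counts.update(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Toy "proteome" of two short, made-up sequences.
composition = amino_acid_composition(["MKVLAAGG", "MARKGG"])
```

Stacking one such 20-value row per organism gives a matrix on which a principal component analysis can then be run.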
BIG: a large-scale data integration tool for renal physiology.
Zhao, Yue; Yang, Chin-Rang; Raghuram, Viswanathan; Parulekar, Jaya; Knepper, Mark A
2016-10-01
Due to recent advances in high-throughput techniques, we and others have generated multiple proteomic and transcriptomic databases to describe and quantify gene expression, protein abundance, or cellular signaling on the scale of the whole genome/proteome in kidney cells. The existence of so much data from diverse sources raises the following question: "How can researchers find information efficiently for a given gene product over all of these data sets without searching each data set individually?" This is the type of problem that has motivated the "Big-Data" revolution in Data Science, which has driven progress in fields such as marketing. Here we present an online Big-Data tool called BIG (Biological Information Gatherer) that allows users to submit a single online query to obtain all relevant information from all indexed databases. BIG is accessible at http://big.nhlbi.nih.gov/.
Aptamer-based multiplexed proteomic technology for biomarker discovery.
Gold, Larry; Ayers, Deborah; Bertino, Jennifer; Bock, Christopher; Bock, Ashley; Brody, Edward N; Carter, Jeff; Dalby, Andrew B; Eaton, Bruce E; Fitzwater, Tim; Flather, Dylan; Forbes, Ashley; Foreman, Trudi; Fowler, Cate; Gawande, Bharat; Goss, Meredith; Gunn, Magda; Gupta, Shashi; Halladay, Dennis; Heil, Jim; Heilig, Joe; Hicke, Brian; Husar, Gregory; Janjic, Nebojsa; Jarvis, Thale; Jennings, Susan; Katilius, Evaldas; Keeney, Tracy R; Kim, Nancy; Koch, Tad H; Kraemer, Stephan; Kroiss, Luke; Le, Ngan; Levine, Daniel; Lindsey, Wes; Lollo, Bridget; Mayfield, Wes; Mehan, Mike; Mehler, Robert; Nelson, Sally K; Nelson, Michele; Nieuwlandt, Dan; Nikrad, Malti; Ochsner, Urs; Ostroff, Rachel M; Otis, Matt; Parker, Thomas; Pietrasiewicz, Steve; Resnicow, Daniel I; Rohloff, John; Sanders, Glenn; Sattin, Sarah; Schneider, Daniel; Singer, Britta; Stanton, Martin; Sterkel, Alana; Stewart, Alex; Stratford, Suzanne; Vaught, Jonathan D; Vrkljan, Mike; Walker, Jeffrey J; Watrobka, Mike; Waugh, Sheela; Weiss, Allison; Wilcox, Sheri K; Wolfson, Alexey; Wolk, Steven K; Zhang, Chi; Zichi, Dom
2010-12-07
The interrogation of proteomes ("proteomics") in a highly multiplexed and efficient manner remains a coveted and challenging goal in biology and medicine. We present a new aptamer-based proteomic technology for biomarker discovery capable of simultaneously measuring thousands of proteins from small sample volumes (15 µL of serum or plasma). Our current assay measures 813 proteins with low limits of detection (1 pM median), 7 logs of overall dynamic range (~100 fM-1 µM), and 5% median coefficient of variation. This technology is enabled by a new generation of aptamers that contain chemically modified nucleotides, which greatly expand the physicochemical diversity of the large randomized nucleic acid libraries from which the aptamers are selected. Proteins in complex matrices such as plasma are measured with a process that transforms a signature of protein concentrations into a corresponding signature of DNA aptamer concentrations, which is quantified on a DNA microarray. Our assay takes advantage of the dual nature of aptamers as both folded protein-binding entities with defined shapes and unique nucleotide sequences recognizable by specific hybridization probes. To demonstrate the utility of our proteomics biomarker discovery technology, we applied it to a clinical study of chronic kidney disease (CKD). We identified two well known CKD biomarkers as well as an additional 58 potential CKD biomarkers. These results demonstrate the potential utility of our technology to rapidly discover unique protein signatures characteristic of various disease states. We describe a versatile and powerful tool that allows large-scale comparison of proteome profiles among discrete populations. This unbiased and highly multiplexed search engine will enable the discovery of novel biomarkers in a manner that is unencumbered by our incomplete knowledge of biology, thereby helping to advance the next generation of evidence-based medicine.
Li, Ginny X H; Vogel, Christine; Choi, Hyungwon
2018-06-07
While tandem mass spectrometry can detect post-translational modifications (PTMs) at the proteome scale, reported PTM sites are often incomplete and include false positives. Computational approaches can complement these datasets with additional predictions, but most available tools use prediction models pre-trained for a single PTM type by the developers, and it remains a difficult task to perform large-scale batch prediction for multiple PTMs with flexible user control, including the choice of training data. We developed an R package called PTMscape which predicts PTM sites across the proteome based on a unified and comprehensive set of descriptors of the physico-chemical microenvironment of modified sites, with additional downstream analysis modules to test enrichment of individual or pairs of PTMs in protein domains. PTMscape is flexible in its ability to process any major modification, such as phosphorylation and ubiquitination, while achieving sensitivity and specificity comparable to single-PTM methods and outperforming other multi-PTM tools. Applying this framework, we expanded proteome-wide coverage of five major PTMs affecting different residues by prediction, especially for lysine and arginine modifications. Using a combination of experimentally acquired sites (PSP) and newly predicted sites, we discovered that crosstalk among multiple PTMs occurs more frequently than by random chance in key protein domains such as histone, protein kinase, and RNA recognition motifs, spanning various biological processes such as RNA processing, DNA damage response, signal transduction, and regulation of the cell cycle. These results provide a proteome-scale analysis of crosstalk among major PTMs and can be easily extended to other types of PTM.
Sadygov, Rovshan G; Cociorva, Daniel; Yates, John R
2004-12-01
Database searching is an essential element of large-scale proteomics. Because these methods are widely used, it is important to understand the rationale of the algorithms. Most algorithms are based on concepts first developed in SEQUEST and PeptideSearch. Four basic approaches are used to determine a match between a spectrum and sequence: descriptive, interpretative, stochastic and probability-based matching. We review the basic concepts used by most search algorithms, the computational modeling of peptide identification and current challenges and limitations of this approach for protein identification.
TUBEs-Mass Spectrometry for Identification and Analysis of the Ubiquitin-Proteome.
Azkargorta, Mikel; Escobes, Iraide; Elortza, Felix; Matthiesen, Rune; Rodríguez, Manuel S
2016-01-01
Mass spectrometry (MS) has become the method of choice for the large-scale analysis of protein ubiquitylation. There exist a number of proposed methods for mapping ubiquitin sites, each with different pros and cons. We present here a protocol for the MS analysis of the ubiquitin-proteome captured by TUBEs and subsequent data analysis. Using dedicated software and algorithms, specific information on the presence of ubiquitylated peptides can be obtained from the MS search results. In addition, a quantitative and functional analysis of the ubiquitylated proteins and their interacting partners helps to unravel the biological and molecular processes they are involved in.
Software Tools | Office of Cancer Clinical Proteomics Research
The CPTAC program develops new approaches to elucidate aspects of the molecular complexity of cancer made from large-scale proteogenomic datasets, and advance them toward precision medicine. Part of the CPTAC mission is to make data and tools available and accessible to the greater research community to accelerate the discovery process.
Quality Assessments of Long-Term Quantitative Proteomic Analysis of Breast Cancer Xenograft Tissues
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhou, Jian-Ying; Chen, Lijun; Zhang, Bai
The identification of protein biomarkers requires large-scale analysis of human specimens to achieve statistical significance. In this study, we evaluated the long-term reproducibility of an iTRAQ (isobaric tags for relative and absolute quantification) based quantitative proteomics strategy using one channel for universal normalization across all samples. A total of 307 liquid chromatography tandem mass spectrometric (LC-MS/MS) analyses were completed, generating 107 one-dimensional (1D) LC-MS/MS datasets and 8 offline two-dimensional (2D) LC-MS/MS datasets (25 fractions for each set) for human-in-mouse breast cancer xenograft tissues representative of basal and luminal subtypes. Such large-scale studies require the implementation of robust metrics to assess the contributions of technical and biological variability in the qualitative and quantitative data. Accordingly, we developed a quantification confidence score based on the quality of each peptide-spectrum match (PSM) to remove quantification outliers from each analysis. After combining confidence score filtering and statistical analysis, reproducible protein identification and quantitative results were achieved from LC-MS/MS datasets collected over a 16 month period.
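The two steps described above, filtering PSMs by a confidence score and normalizing every channel to a shared reference channel, can be sketched as follows. This is a minimal illustration and not the authors' method: the abstract does not give the score formula, so `score` is a placeholder field, and the record layout, channel labels, cutoff, and intensities are all hypothetical.

```python
import math


def filter_psms(psms, min_score):
    """Keep only PSMs whose (placeholder) confidence score meets the cutoff."""
    return [p for p in psms if p["score"] >= min_score]


def log2_ratios(psm, reference_channel):
    """Log2 ratio of each reporter channel to the shared reference channel."""
    ref = psm["intensities"][reference_channel]
    return {
        channel: math.log2(value / ref)
        for channel, value in psm["intensities"].items()
        if channel != reference_channel
    }


# Hypothetical PSM records; channel "114" plays the role of the reference.
psms = [
    {"peptide": "PEP_A", "score": 0.9,
     "intensities": {"114": 1000.0, "115": 2000.0, "116": 500.0, "117": 1000.0}},
    {"peptide": "PEP_B", "score": 0.2,
     "intensities": {"114": 50.0, "115": 400.0, "116": 10.0, "117": 90.0}},
]

kept = filter_psms(psms, min_score=0.5)
ratios = log2_ratios(kept[0], reference_channel="114")
print(ratios)
```

Because every sample is expressed relative to the same reference channel, ratios from different LC-MS/MS runs become comparable, which is the point of the "universal normalization" design.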
Advancing Cell Biology Through Proteomics in Space and Time (PROSPECTS)*
Lamond, Angus I.; Uhlen, Mathias; Horning, Stevan; Makarov, Alexander; Robinson, Carol V.; Serrano, Luis; Hartl, F. Ulrich; Baumeister, Wolfgang; Werenskiold, Anne Katrin; Andersen, Jens S.; Vorm, Ole; Linial, Michal; Aebersold, Ruedi; Mann, Matthias
2012-01-01
The term “proteomics” encompasses the large-scale detection and analysis of proteins and their post-translational modifications. Driven by major improvements in mass spectrometric instrumentation, methodology, and data analysis, the proteomics field has burgeoned in recent years. It now provides a range of sensitive and quantitative approaches for measuring protein structures and dynamics that promise to revolutionize our understanding of cell biology and molecular mechanisms in both human cells and model organisms. The Proteomics Specification in Time and Space (PROSPECTS) Network is a unique EU-funded project that brings together leading European research groups, spanning from instrumentation to biomedicine, in a collaborative five-year initiative to develop new methods and applications for the functional analysis of cellular proteins. This special issue of Molecular and Cellular Proteomics presents 16 research papers reporting major recent progress by the PROSPECTS groups, including improvements to the resolution and sensitivity of the Orbitrap family of mass spectrometers, systematic detection of proteins using highly characterized antibody collections, and new methods for absolute as well as relative quantification of protein levels. Manuscripts in this issue exemplify approaches for performing quantitative measurements of cell proteomes and for studying their dynamic responses to perturbation, both during normal cellular responses and in disease mechanisms. Here we present a perspective on how the proteomics field is moving beyond simply identifying proteins with high sensitivity toward providing a powerful and versatile set of assay systems for characterizing proteome dynamics and thereby creating a new “third generation” proteomics strategy that offers an indispensable tool for cell biology and molecular medicine. PMID:22311636
MaxReport: An Enhanced Proteomic Result Reporting Tool for MaxQuant.
Zhou, Tao; Li, Chuyu; Zhao, Wene; Wang, Xinru; Wang, Fuqiang; Sha, Jiahao
2016-01-01
MaxQuant is a proteomic software widely used for large-scale tandem mass spectrometry data. We have designed and developed an enhanced result reporting tool for MaxQuant, named as MaxReport. This tool can optimize the results of MaxQuant and provide additional functions for result interpretation. MaxReport can generate report tables for protein N-terminal modifications. It also supports isobaric labelling based relative quantification at the protein, peptide or site level. To obtain an overview of the results, MaxReport performs general descriptive statistical analyses for both identification and quantification results. The output results of MaxReport are well organized and therefore helpful for proteomic users to better understand and share their data. The script of MaxReport, which is freely available at http://websdoor.net/bioinfo/maxreport/, is developed using Python code and is compatible across multiple systems including Windows and Linux.
Gao, Liyan; Ge, Haitao; Huang, Xiahe; Liu, Kehui; Zhang, Yuanya; Xu, Wu; Wang, Yingchun
2015-01-01
Large-scale quantitative evaluation of the tightness of membrane association for nontransmembrane proteins is important for identifying true peripheral membrane proteins with functional significance. Herein, we simultaneously ranked more than 1000 proteins of the photosynthetic model organism Synechocystis sp. PCC 6803 for their relative tightness of membrane association using a proteomic approach. Using multiple precisely ranked and experimentally verified peripheral subunits of photosynthetic protein complexes as the landmarks, we found that proteins involved in two-component signal transduction systems and transporters are overall tightly associated with the membranes, whereas the associations of ribosomal proteins are much weaker. Moreover, we found that hypothetical proteins containing the same domains generally have similar tightness. This work provided a global view of the structural organization of the membrane proteome with respect to divergent functions, and built the foundation for future investigation of the dynamic membrane proteome reorganization in response to different environmental or internal stimuli. PMID:25505158
MALDI versus ESI: The Impact of the Ion Source on Peptide Identification.
Nadler, Wiebke Maria; Waidelich, Dietmar; Kerner, Alexander; Hanke, Sabrina; Berg, Regina; Trumpp, Andreas; Rösli, Christoph
2017-03-03
For mass spectrometry-based proteomic analyses, electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI) are the commonly used ionization techniques. To investigate the influence of the ion source on peptide detection in large-scale proteomics, an optimized GeLC/MS workflow was developed and applied either with ESI/MS or with MALDI/MS for the proteomic analysis of different human cell lines of pancreatic origin. Statistical analysis of the resulting data set with more than 72 000 peptides emphasized the complementary character of the two methods, as the percentage of peptides identified with both approaches was as low as 39%. Significant differences between the resulting peptide sets were observed with respect to amino acid composition, charge-related parameters, hydrophobicity, and modifications of the detected peptides and could be linked to factors governing the respective ion yields in ESI and MALDI.
Use of proteomic methods in the analysis of human body fluids in Alzheimer research.
Zürbig, Petra; Jahn, Holger
2012-12-01
Proteomics is the study of the entire population of proteins and peptides in an organism or a part of it, such as a cell, tissue, or fluids like cerebrospinal fluid, plasma, serum, urine, or saliva. It is widely assumed that changes in the composition of the proteome may reflect disease states and provide clues to its origin, eventually leading to targets for new treatments. The ability to perform large-scale proteomic studies now is based jointly on recent advances in our analytical methods. Separation techniques like CE and 2DE have developed and matured. Detection methods like MS have also improved greatly in the last 5 years. These developments have also driven the fields of bioinformatics, needed to deal with the increased data production and systems biology. All these developing methods offer specific advantages but also come with certain limitations. This review describes the different proteomic methods used in the field, their limitations, and their possible pitfalls. Based on a literature search in PubMed, we identified 112 studies that applied proteomic techniques to identify biomarkers for Alzheimer disease. This review describes the results of these studies on proteome changes in human body fluids of Alzheimer patients reviewing the most important studies. We extracted a list of 366 proteins and peptides that were identified by these studies as potential targets in Alzheimer research. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
TRIC: an automated alignment strategy for reproducible protein quantification in targeted proteomics
Röst, Hannes L.; Liu, Yansheng; D’Agostino, Giuseppe; Zanella, Matteo; Navarro, Pedro; Rosenberger, George; Collins, Ben C.; Gillet, Ludovic; Testa, Giuseppe; Malmström, Lars; Aebersold, Ruedi
2016-01-01
Large scale, quantitative proteomic studies have become essential for the analysis of clinical cohorts, large perturbation experiments and systems biology studies. While next-generation mass spectrometric techniques such as SWATH-MS have substantially increased throughput and reproducibility, ensuring consistent quantification of thousands of peptide analytes across multiple LC-MS/MS runs remains a challenging and laborious manual process. To produce highly consistent and quantitatively accurate proteomics data matrices in an automated fashion, we have developed the TRIC software which utilizes fragment ion data to perform cross-run alignment, consistent peak-picking and quantification for high throughput targeted proteomics. TRIC uses a graph-based alignment strategy based on non-linear retention time correction to integrate peak elution information from all LC-MS/MS runs acquired in a study. When compared to state-of-the-art SWATH-MS data analysis, the algorithm was able to reduce the identification error by more than 3-fold at constant recall, while correcting for highly non-linear chromatographic effects. On a pulsed-SILAC experiment performed on human induced pluripotent stem (iPS) cells, TRIC was able to automatically align and quantify thousands of light and heavy isotopic peak groups and substantially increased the quantitative completeness and biological information in the data, providing insights into protein dynamics of iPS cells. Overall, this study demonstrates the importance of consistent quantification in highly challenging experimental setups, and proposes an algorithm to automate this task, constituting the last missing piece in a pipeline for automated analysis of massively parallel targeted proteomics datasets. PMID:27479329
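To make the idea of non-linear retention time (RT) correction concrete, the sketch below maps RTs from one run onto another by piecewise-linear interpolation between anchor peptides seen in both runs. It is not TRIC's algorithm: the graph-based selection of which runs to align is omitted, the interpolation scheme is a stand-in, and the function name and anchor values are hypothetical.

```python
from bisect import bisect_right


def make_rt_map(anchors):
    """Build a function mapping run-A retention times onto run-B's time axis.

    anchors: (rt_in_run_a, rt_in_run_b) pairs for peptides confidently
    identified in both runs; run-A values are assumed to be distinct.
    """
    points = sorted(anchors)
    xs = [a for a, _ in points]
    ys = [b for _, b in points]

    def to_run_b(rt):
        # Outside the anchor range, clamp to the nearest anchor (a crude choice).
        if rt <= xs[0]:
            return ys[0]
        if rt >= xs[-1]:
            return ys[-1]
        i = bisect_right(xs, rt) - 1
        fraction = (rt - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + fraction * (ys[i + 1] - ys[i])

    return to_run_b


# Hypothetical anchors (minutes): run B elutes later, and not by a constant offset.
rt_map = make_rt_map([(10.0, 12.0), (20.0, 25.0), (30.0, 33.0)])
print(rt_map(15.0), rt_map(25.0))
```

Once a peak group's expected RT in the target run is known, peak picking can be restricted to a window around that value, which is one way cross-run alignment can reduce inconsistent quantification.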
Veras, Patrícia Sampaio Tavares; Bezerra de Menezes, Juliana Perrone
2016-01-01
Leishmania is a protozoan parasite that causes a wide range of different clinical manifestations in mammalian hosts. It is a major public health risk on different continents and represents one of the most important neglected diseases. Due to the high toxicity of the drugs currently used, and in the light of increasing drug resistance, there is a critical need to develop new drugs and vaccines to control Leishmania infection. Over the past few years, proteomics has become an important tool to understand the underlying biology of Leishmania parasites and host interaction. The large-scale study of proteins, both in parasites and within the host in response to infection, can accelerate the discovery of new therapeutic targets. By studying the proteomes of host cells and tissues infected with Leishmania, as well as changes in protein profiles among promastigotes and amastigotes, scientists hope to better understand the biology involved in the parasite survival and the host-parasite interaction. This review demonstrates the feasibility of proteomics as an approach to identify new proteins involved in Leishmania differentiation and intracellular survival. PMID:27548150
Jimenez, Connie R; Verheul, Henk M W
2014-01-01
Proteomics is optimally suited to bridge the gap between genomic information on the one hand and biologic functions and disease phenotypes on the other, since it studies the expression and/or post-translational modification (especially phosphorylation) of proteins--the major cellular players bringing about cellular functions--at a global level in biologic specimens. Mass spectrometry technology and (bio)informatic tools have matured to the extent that they can provide high-throughput, comprehensive, and quantitative protein inventories of cells, tissues, and biofluids in clinical samples at low level. In this article, we focus on next-generation proteomics employing nanoliquid chromatography coupled to high-resolution tandem mass spectrometry for in-depth (phospho)protein profiling of tumor tissues and (proximal) biofluids, with a focus on studies employing clinical material. In addition, we highlight emerging proteogenomic approaches for the identification of tumor-specific protein variants, and targeted multiplex mass spectrometry strategies for large-scale biomarker validation. Below we provide a discussion of recent progress, some research highlights, and challenges that remain for clinical translation of proteomic discoveries.
Jiang, Xiaogang; Feng, Shun; Tian, Ruijun; Han, Guanghui; Jiang, Xinning; Ye, Mingliang; Zou, Hanfa
2007-02-01
An approach was developed to automate sample introduction for nanoflow LC-MS/MS (µLC-MS/MS) analysis using a strong cation exchange (SCX) trap column. The system consisted of a 100 µm i.d. × 2 cm SCX trap column and a 75 µm i.d. × 12 cm C18 RP analytical column. During the sample loading step, the flow passing through the SCX trap column was directed to waste so that a large volume of sample could be loaded at a high flow rate. The peptides bound on the SCX trap column were then eluted onto the RP analytical column by a high salt buffer, followed by RP chromatographic separation of the peptides at a nanoliter flow rate. It was observed that higher separation performance could be achieved with the system using an SCX trap column than with the system using a C18 trap column. The high proteomic coverage of this approach was demonstrated in the analysis of tryptic digests of BSA and yeast cell lysate. In addition, this system was applied to two-dimensional separation of a tryptic digest of the human hepatocellular carcinoma cell line SMMC-7721 for large-scale proteome analysis. This system was fully automated and required minimal changes to the current µLC-MS/MS system. This system represents a promising platform for routine proteome analysis.
Vella, Danila; Zoppis, Italo; Mauri, Giancarlo; Mauri, Pierluigi; Di Silvestre, Dario
2017-12-01
The reductionist approach of dissecting biological systems into their constituents was successful in the first stage of molecular biology in elucidating the chemical basis of several biological processes. This knowledge helped biologists to understand the complexity of biological systems, showing that most biological functions do not arise from individual molecules and that the emergent properties of biological systems cannot be explained or predicted by investigating individual molecules without taking into consideration their relations. Thanks to the improvement of current -omics technologies and the increasing understanding of molecular relationships, more and more studies are evaluating biological systems through approaches based on graph theory. Genomic and proteomic data are often combined with protein-protein interaction (PPI) networks whose structure is routinely analyzed by algorithms and tools to characterize hubs/bottlenecks and topological, functional, and disease modules. On the other hand, co-expression networks represent a complementary procedure that gives the opportunity to evaluate data at the system level, including for organisms that lack information on PPIs. Based on these premises, we introduce the reader to PPI and co-expression networks, including aspects of reconstruction and analysis. In particular, the new idea of evaluating large-scale proteomic data by means of co-expression networks will be discussed, presenting some examples of application. Their use to infer biological knowledge will be shown, and special attention will be devoted to topological and module analysis.
Metaproteomics as a Complementary Approach to Gut Microbiota in Health and Disease
NASA Astrophysics Data System (ADS)
Petriz, Bernardo A.; Franco, Octávio L.
2017-01-01
Classic studies on phylotype profiling are limited to the identification of microbial constituents, where information is lacking about the molecular interaction of these bacterial communities with the host genome and the possible outcomes in host biology. A range of OMICs approaches have provided great progress linking the microbiota to health and disease. However, the investigation of this context through proteomic mass spectrometry-based tools is still being improved. Therefore, metaproteomics or community proteogenomics has emerged as a complementary approach to metagenomic data, as a field in proteomics aiming to perform large-scale characterization of proteins from environmental microbiota such as the human gut. The advances in molecular separation methods coupled with mass spectrometry (e.g. LC-MS/MS) and proteome bioinformatics have been fundamental in these novel large-scale metaproteomic studies, which have further been performed in a wide range of samples including soil, plant and human environments. Metaproteomic studies will make major progress if a comprehensive database covering the genes and expressed proteins from all gut microbial species is developed. To this end, we here present some of the main limitations of metaproteomic studies in complex microbiota environments such as the gut, also addressing the up-to-date pipelines in sample preparation prior to fractionation/separation and mass spectrometry analysis. In addition, a novel approach to the limitations of metagenomic databases is also discussed. Finally, prospects are addressed regarding the application of metaproteomic analysis using a unified host-microbiome gene database and other meta-OMICs platforms.
Advances in targeted proteomics and applications to biomedical research
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shi, Tujin; Song, Ehwang; Nie, Song
Targeted proteomics has emerged as a powerful protein quantification tool in systems biology, biomedical research, and increasingly for clinical applications. The most widely used targeted proteomics approach, selected reaction monitoring (SRM), also known as multiple reaction monitoring (MRM), can be used for quantification of cellular signaling networks and preclinical verification of candidate protein biomarkers. As an extension to our previous review on advances in SRM sensitivity (Shi et al., Proteomics, 12, 1074–1092, 2012), herein we review recent advances in the method and technology for further enhancing SRM sensitivity (from 2012 to present), and highlight its broad biomedical applications in human bodily fluids, tissue and cell lines. Furthermore, we also review two recently introduced targeted proteomics approaches, parallel reaction monitoring (PRM) and data-independent acquisition (DIA) with targeted data extraction on fast scanning high-resolution accurate-mass (HR/AM) instruments. Such HR/AM targeted quantification, which monitors all target product ions, effectively addresses SRM limitations in specificity and multiplexing; however, compared to SRM, PRM and DIA are still in their infancy, with a limited number of applications. Thus, for HR/AM targeted quantification we focus our discussion on method development, data processing and analysis, and its advantages and limitations in targeted proteomics. Finally, general perspectives on the potential of achieving both high sensitivity and high sample throughput for large-scale quantification of hundreds of target proteins are discussed.
Runau, Franscois; Arshad, Ali; Isherwood, John; Norris, Leonie; Howells, Lynne; Metcalfe, Matthew; Dennison, Ashley
2015-06-01
Pancreatic cancer is a disease with a significantly poor prognosis. Despite modern advances in other medical, surgical, and oncologic therapy, the outcome from pancreatic cancer has improved little over the last 40 years. To improve the management of this difficult disease, trials investigating the use of dietary and parenteral fish oils rich in omega-3 (ω-3) fatty acids, exhibiting proven anti-inflammatory and anticarcinogenic properties, have revealed favorable results in pancreatic cancers. Proteomics is the large-scale study of proteins that attempts to characterize the complete set of proteins encoded by the genome of an organism and that, with the use of sensitive mass spectrometric-based techniques, has allowed high-throughput analysis of the proteome to aid identification of putative biomarkers pertinent to given disease states. These biomarkers provide useful insight into potentially discovering new markers for early detection or elucidating the efficacy of treatment on pancreatic cancers. Here, our review identifies potential proteomic-based biomarkers in pancreatic cancer relating to apoptosis, cell proliferation, angiogenesis, and metabolic regulation in clinical studies. We also reviewed proteomic biomarkers from the administration of ω-3 fatty acids that act on similar anticarcinogenic pathways as above and reflect that proteomic studies on the effect of ω-3 fatty acids in pancreatic cancer will yield favorable results. © 2015 American Society for Parenteral and Enteral Nutrition.
Proteomic insights into floral biology.
Li, Xiaobai; Jackson, Aaron; Xie, Ming; Wu, Dianxing; Tsai, Wen-Chieh; Zhang, Sheng
2016-08-01
The flower is the most important biological structure for ensuring angiosperm reproductive success. Not only does the flower contain critical reproductive organs, but the wide variation in morphology, color, and scent has evolved to entice specialized pollinators, and arguably mankind in many cases, to ensure the successful propagation of its species. Recent proteomic approaches have identified protein candidates related to these flower traits, which has shed light on a number of previously unknown mechanisms underlying these traits. This review article provides a comprehensive overview of the latest advances in proteomic research in floral biology according to the order of flower structure, from corolla to male and female reproductive organs. It summarizes mainstream proteomic methods for plant research and recent improvements on two-dimensional gel electrophoresis and gel-free workflows for both peptide level and protein level analysis. The recent advances in sequencing technologies provide a new paradigm for the ever-increasing genome and transcriptome information on many organisms. It is now possible to integrate genomic and transcriptomic data with proteomic results for large-scale protein characterization, so that a global understanding of the complex molecular networks in flower biology can be readily achieved. This article is part of a Special Issue entitled: Plant Proteomics--a bridge between fundamental processes and crop production, edited by Dr. Hans-Peter Mock. Copyright © 2016 Elsevier B.V. All rights reserved.
BIG: a large-scale data integration tool for renal physiology
Zhao, Yue; Yang, Chin-Rang; Raghuram, Viswanathan; Parulekar, Jaya
2016-01-01
Due to recent advances in high-throughput techniques, we and others have generated multiple proteomic and transcriptomic databases to describe and quantify gene expression, protein abundance, or cellular signaling on the scale of the whole genome/proteome in kidney cells. The existence of so much data from diverse sources raises the following question: “How can researchers find information efficiently for a given gene product over all of these data sets without searching each data set individually?” This is the type of problem that has motivated the “Big-Data” revolution in Data Science, which has driven progress in fields such as marketing. Here we present an online Big-Data tool called BIG (Biological Information Gatherer) that allows users to submit a single online query to obtain all relevant information from all indexed databases. BIG is accessible at http://big.nhlbi.nih.gov/. PMID:27279488
Application of Large-Scale Aptamer-Based Proteomic Profiling to Planned Myocardial Infarctions.
Jacob, Jaison; Ngo, Debby; Finkel, Nancy; Pitts, Rebecca; Gleim, Scott; Benson, Mark D; Keyes, Michelle J; Farrell, Laurie A; Morgan, Thomas; Jennings, Lori L; Gerszten, Robert E
2018-03-20
Emerging proteomic technologies using novel affinity-based reagents allow for efficient multiplexing with high-sample throughput. To identify early biomarkers of myocardial injury, we recently applied an aptamer-based proteomic profiling platform that measures 1129 proteins to samples from patients undergoing septal alcohol ablation for hypertrophic cardiomyopathy, a human model of planned myocardial injury. Here, we examined the scalability of this approach using a markedly expanded platform to study a far broader range of human proteins in the context of myocardial injury. We applied a highly multiplexed, expanded proteomic technique that uses single-stranded DNA aptamers to assay 4783 human proteins (4137 distinct human gene targets) to derivation and validation cohorts of planned myocardial injury, individuals with spontaneous myocardial infarction, and at-risk controls. We found 376 target proteins that significantly changed in the blood after planned myocardial injury in a derivation cohort (n=20; P <1.05E-05, 1-way repeated measures analysis of variance, Bonferroni threshold). Two hundred forty-seven of these proteins were validated in an independent planned myocardial injury cohort (n=15; P <1.33E-04, 1-way repeated measures analysis of variance); >90% were directionally consistent and reached nominal significance in the validation cohort. Among the validated proteins that were increased within 1 hour after planned myocardial injury, 29 were also elevated in patients with spontaneous myocardial infarction (n=63; P <6.17E-04). Many of the novel markers identified in our study are intracellular proteins not previously identified in the peripheral circulation or have functional roles relevant to myocardial injury. 
For example, the cardiac LIM protein, cysteine- and glycine-rich protein 3, is thought to mediate cardiac mechanotransduction and stress responses, whereas the mitochondrial ATP synthase F0 subunit component is a vasoactive peptide on its release from cells. Last, we performed aptamer-affinity enrichment coupled with mass spectrometry to technically verify aptamer specificity for a subset of the new biomarkers. Our results demonstrate the feasibility of large-scale aptamer multiplexing at a level that has not previously been reported and with sample throughput that greatly exceeds other existing proteomic methods. The expanded aptamer-based proteomic platform provides a unique opportunity for biomarker and pathway discovery after myocardial injury. © 2017 American Heart Association, Inc.
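The Bonferroni thresholds quoted in this abstract can be reproduced with one division each. The sketch below assumes a family-wise error rate of 0.05, which the abstract does not state explicitly; the test counts (4783 aptamers in the derivation cohort, 376 carried-forward proteins in the validation cohort) are taken from the text.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance cutoff that controls the family-wise error rate."""
    return alpha / n_tests


ALPHA = 0.05  # assumed; not stated in the abstract

derivation = bonferroni_threshold(ALPHA, 4783)  # all assayed aptamers
validation = bonferroni_threshold(ALPHA, 376)   # proteins significant in derivation

print(f"derivation cutoff: {derivation:.2e}")
print(f"validation cutoff: {validation:.2e}")
```

Under the same assumption, the third cutoff (P <6.17E-04) would correspond to 81 tests, although the abstract does not give that count directly.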
Data Use Agreement | Office of Cancer Clinical Proteomics Research
CPTAC requests that data users abide by the same principles that were previously established in the Fort Lauderdale and Amsterdam meetings. The recommendations from the Fort Lauderdale meeting (2003) on best practices and principles for sharing large-scale genomic data address the roles and responsibilities of data producers, data users and funders of community resource projects.
Hydra: a scalable proteomic search engine which utilizes the Hadoop distributed computing framework
2012-01-01
Background For shotgun mass spectrometry based proteomics the most computationally expensive step is in matching the spectra against an increasingly large database of sequences and their post-translational modifications with known masses. Each mass spectrometer can generate data at an astonishingly high rate, and the scope of what is searched for is continually increasing. Therefore solutions for improving our ability to perform these searches are needed. Results We present a sequence database search engine that is specifically designed to run efficiently on the Hadoop MapReduce distributed computing framework. The search engine implements the K-score algorithm, generating comparable output for the same input files as the original implementation. The scalability of the system is shown, and the architecture required for the development of such distributed processing is discussed. Conclusion The software is scalable in its ability to handle a large peptide database, numerous modifications and large numbers of spectra. Performance scales with the number of processors in the cluster, allowing throughput to expand with the available resources. PMID:23216909
Hydra: a scalable proteomic search engine which utilizes the Hadoop distributed computing framework.
Lewis, Steven; Csordas, Attila; Killcoyne, Sarah; Hermjakob, Henning; Hoopmann, Michael R; Moritz, Robert L; Deutsch, Eric W; Boyle, John
2012-12-05
For shotgun mass spectrometry based proteomics the most computationally expensive step is in matching the spectra against an increasingly large database of sequences and their post-translational modifications with known masses. Each mass spectrometer can generate data at an astonishingly high rate, and the scope of what is searched for is continually increasing. Therefore solutions for improving our ability to perform these searches are needed. We present a sequence database search engine that is specifically designed to run efficiently on the Hadoop MapReduce distributed computing framework. The search engine implements the K-score algorithm, generating comparable output for the same input files as the original implementation. The scalability of the system is shown, and the architecture required for the development of such distributed processing is discussed. The software is scalable in its ability to handle a large peptide database, numerous modifications and large numbers of spectra. Performance scales with the number of processors in the cluster, allowing throughput to expand with the available resources.
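The map/reduce decomposition described here can be illustrated in a few lines. This is a minimal single-process sketch, not Hydra's implementation: the function names, the 1 Da bin width, the 0.5 Da tolerance, and the peptide masses are all hypothetical, and the K-score scoring step is omitted. It only shows how keying spectra and peptides by precursor-mass bin lets each reducer pair candidates independently.

```python
from collections import defaultdict

BIN_WIDTH = 1.0   # Da; hypothetical bin size
TOLERANCE = 0.5   # Da; hypothetical precursor mass tolerance


def mass_bin(mass):
    """Integer key for the mass bin that contains `mass`."""
    return int(mass // BIN_WIDTH)


def map_phase(spectra, peptides):
    """Emit (bin, record) pairs. Each spectrum goes to one bin; each peptide
    goes to every bin its tolerance window overlaps, so no match is lost at
    a bin boundary."""
    for spec_id, precursor_mass in spectra:
        yield mass_bin(precursor_mass), ("spectrum", spec_id, precursor_mass)
    for sequence, mass in peptides:
        first, last = mass_bin(mass - TOLERANCE), mass_bin(mass + TOLERANCE)
        for key in range(first, last + 1):
            yield key, ("peptide", sequence, mass)


def shuffle(pairs):
    """Group mapped records by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, record in pairs:
        groups[key].append(record)
    return groups


def reduce_phase(groups):
    """Within each bin, pair spectra with peptides inside the tolerance.
    A real engine would score each pair here (e.g. with K-score)."""
    for records in groups.values():
        spectra = [r for r in records if r[0] == "spectrum"]
        peptides = [r for r in records if r[0] == "peptide"]
        for _, spec_id, spec_mass in spectra:
            for _, sequence, pep_mass in peptides:
                if abs(spec_mass - pep_mass) <= TOLERANCE:
                    yield spec_id, sequence


# Made-up identifiers and masses, for illustration only.
spectra = [("s1", 1000.4), ("s2", 1500.9)]
peptides = [("PEPTIDEA", 1000.6), ("PEPTIDEB", 1501.1), ("PEPTIDEC", 1200.0)]

candidates = sorted(reduce_phase(shuffle(map_phase(spectra, peptides))))
print(candidates)
```

Because each bin is processed independently, the reduce step can be spread over as many workers as there are bins, which is the property the abstract refers to when it says performance scales with the number of processors.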
Expediting SRM assay development for large-scale targeted proteomics experiments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu, Chaochao; Shi, Tujin; Brown, Joseph N.
2014-08-22
Due to their high sensitivity and specificity, targeted proteomics measurements, e.g. selected reaction monitoring (SRM), are becoming increasingly popular for biological and translational applications. Selection of optimal transitions and optimization of collision energy (CE) are important assay development steps for achieving sensitive detection and accurate quantification; however, these steps can be labor-intensive, especially for large-scale applications. Herein, we explored several options for accelerating SRM assay development evaluated in the context of a relatively large set of 215 synthetic peptide targets. We first showed that HCD fragmentation is very similar to CID in triple quadrupole (QQQ) instrumentation, and by selection of top six y fragment ions from HCD spectra, >86% of top transitions optimized from direct infusion on QQQ instrument are covered. We also demonstrated that the CE calculated by existing prediction tools was less accurate for +3 precursors, and a significant increase in intensity for transitions could be obtained using a new CE prediction equation constructed from the present experimental data. Overall, our study illustrates the feasibility of expediting the development of larger numbers of high-sensitivity SRM assays through automation of transitions selection and accurate prediction of optimal CE to improve both SRM throughput and measurement quality.
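The transition-selection step mentioned above (taking the six most intense y fragment ions from a spectrum) is easy to automate. The following is a minimal sketch under stated assumptions: the record layout, the function name `top_y_transitions`, and the intensity values are hypothetical and are not taken from the study.

```python
def top_y_transitions(fragments, n=6):
    """Return the n most intense y-type fragment ions, most intense first."""
    y_ions = [f for f in fragments if f["ion"].startswith("y")]
    return sorted(y_ions, key=lambda f: f["intensity"], reverse=True)[:n]


# Made-up fragment list for one peptide; b ions are ignored by the selection.
hcd_spectrum = [
    {"ion": "y1", "intensity": 50.0},
    {"ion": "y2", "intensity": 300.0},
    {"ion": "y3", "intensity": 1200.0},
    {"ion": "y4", "intensity": 900.0},
    {"ion": "y5", "intensity": 1500.0},
    {"ion": "y6", "intensity": 700.0},
    {"ion": "y7", "intensity": 400.0},
    {"ion": "y8", "intensity": 100.0},
    {"ion": "b2", "intensity": 2000.0},
    {"ion": "b3", "intensity": 800.0},
    {"ion": "b4", "intensity": 60.0},
]

selected = [f["ion"] for f in top_y_transitions(hcd_spectrum)]
print(selected)  # ['y5', 'y3', 'y4', 'y6', 'y7', 'y2']
```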
The Response of the Root Proteome to the Synthetic Strigolactone GR24 in Arabidopsis
Walton, Alan; Stes, Elisabeth; Goeminne, Geert; Braem, Lukas; Vuylsteke, Marnik; Matthys, Cedrick; De Cuyper, Carolien; Staes, An; Vandenbussche, Jonathan; Boyer, François-Didier; Vanholme, Ruben; Fromentin, Justine; Boerjan, Wout; Gevaert, Kris; Goormachtig, Sofie
2016-01-01
Strigolactones are plant metabolites that act as phytohormones and rhizosphere signals. Whereas most research on unraveling the action mechanisms of strigolactones is focused on plant shoots, we investigated proteome adaptation during strigolactone signaling in the roots of Arabidopsis thaliana. Through large-scale, time-resolved, and quantitative proteomics, the impact of the strigolactone analog rac-GR24 was elucidated on the root proteome of the wild type and the signaling mutant more axillary growth 2 (max2). Our study revealed a clear MAX2-dependent rac-GR24 response: an increase in abundance of enzymes involved in flavonol biosynthesis, which was reduced in the max2–1 mutant. Mass spectrometry-driven metabolite profiling and thin-layer chromatography experiments demonstrated that these changes in protein expression lead to the accumulation of specific flavonols. Moreover, quantitative RT-PCR revealed that the flavonol-related protein expression profile was caused by rac-GR24-induced changes in transcript levels of the corresponding genes. This induction of flavonol production was shown to be activated by the two pure enantiomers that together make up rac-GR24. Finally, our data provide much needed clues concerning the multiple roles played by MAX2 in the roots and a comprehensive view of the rac-GR24-induced response in the root proteome. PMID:27317401
Ma, Yue; Tuskan, Gerald A.
2018-01-01
The existence of complete genome sequences makes it important to develop different approaches for classification of large-scale data sets and to make extraction of biological insights easier. Here, we propose an approach for classification of complete proteomes/protein sets based on protein distributions on some basic attributes. We demonstrate the usefulness of this approach by determining protein distributions in terms of two attributes: protein lengths (L) and protein intrinsic disorder contents (ID). The protein distributions based on L and ID are surveyed for representative proteome organisms and protein sets from the three domains of life. The two-dimensional maps (designated as fingerprints here) from the protein distribution densities in the LD space defined by ln(L) and ID are then constructed. The fingerprints for different organisms and protein sets are found to be distinct from each other, and they can therefore be used for comparative studies. As a test case, phylogenetic trees have been constructed based on the protein distribution densities in the fingerprints of proteomes of organisms without performing any protein sequence comparison and alignments. The phylogenetic trees generated are biologically meaningful, demonstrating that the protein distributions in the LD space may serve as unique phylogenetic signals of the organisms at the proteome level. PMID:29686995
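A fingerprint of this kind can be sketched as a normalized two-dimensional histogram over (ln L, ID). The code below is an illustration only: the bin counts, the ln(L) range, the L1 distance used to compare two fingerprints, and the toy proteomes are assumptions, since the abstract does not say how densities were binned or how distances between fingerprints were computed.

```python
import math
from collections import Counter


def fingerprint(proteins, l_bins=10, id_bins=10, ln_l_range=(3.0, 9.0)):
    """Normalized 2D histogram over (ln L, ID).

    `proteins` is an iterable of (length, disorder_fraction) pairs with the
    disorder fraction in [0, 1]. Values outside the grid are clamped to the
    edge cells.
    """
    low, high = ln_l_range
    counts = Counter()
    for length, disorder in proteins:
        x = (math.log(length) - low) / (high - low)
        i = min(max(int(x * l_bins), 0), l_bins - 1)
        j = min(max(int(disorder * id_bins), 0), id_bins - 1)
        counts[(i, j)] += 1
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}


def fingerprint_distance(fp_a, fp_b):
    """L1 distance between two fingerprints (a stand-in metric, assumed here)."""
    cells = set(fp_a) | set(fp_b)
    return sum(abs(fp_a.get(c, 0.0) - fp_b.get(c, 0.0)) for c in cells)


# Toy protein sets: (length in residues, disorder fraction). Made-up values.
proteome_a = [(100, 0.10), (250, 0.20), (400, 0.15)]
proteome_b = [(900, 0.70), (1500, 0.80)]

fp_a, fp_b = fingerprint(proteome_a), fingerprint(proteome_b)
print(fingerprint_distance(fp_a, fp_a))  # 0.0 for identical fingerprints
print(fingerprint_distance(fp_a, fp_b))
```

A matrix of such pairwise distances could then be passed to any standard tree-building method; which method the authors used is not stated in the abstract.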
PNAC: a protein nucleolar association classifier
2011-01-01
Background Although primarily known as the site of ribosome subunit production, the nucleolus is involved in numerous and diverse cellular processes. Recent large-scale proteomics projects have identified thousands of human proteins that associate with the nucleolus. However, in most cases, we know neither the fraction of each protein pool that is nucleolus-associated nor whether their association is permanent or conditional. Results To describe the dynamic localisation of proteins in the nucleolus, we investigated the extent of nucleolar association of proteins by first collating an extensively curated literature-derived dataset. This dataset then served to train a probabilistic predictor which integrates gene and protein characteristics. Unlike most previous experimental and computational studies of the nucleolar proteome that produce large static lists of nucleolar proteins regardless of their extent of nucleolar association, our predictor models the fluidity of the nucleolus by considering different classes of nucleolar-associated proteins. The new method predicts all human proteins as either nucleolar-enriched, nucleolar-nucleoplasmic, nucleolar-cytoplasmic or non-nucleolar. Leave-one-out cross validation tests reveal sensitivity values for these four classes ranging from 0.72 to 0.90 and positive predictive values ranging from 0.63 to 0.94. The overall accuracy of the classifier was measured to be 0.85 on an independent literature-based test set and 0.74 using a large independent quantitative proteomics dataset. While the three nucleolar-association groups display vastly different Gene Ontology biological process signatures and evolutionary characteristics, they collectively represent the most well characterised nucleolar functions. Conclusions Our proteome-wide classification of nucleolar association provides a novel representation of the dynamic content of the nucleolus. 
This model of nucleolar localisation thus increases the coverage while providing accurate and specific annotations of the nucleolar proteome. It will be instrumental in better understanding the central role of the nucleolus in the cell and its interaction with other subcellular compartments. PMID:21272300
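The per-class sensitivity and positive predictive value (PPV) reported above follow directly from the true and predicted labels. This is a minimal sketch: the function name and the toy labels are hypothetical, and only two of the four PNAC classes are used to keep the example short.

```python
from collections import Counter


def per_class_metrics(true_labels, predicted_labels):
    """Sensitivity (TP / actual) and PPV (TP / predicted) for every class."""
    pairs = Counter(zip(true_labels, predicted_labels))
    classes = sorted(set(true_labels) | set(predicted_labels))
    metrics = {}
    for c in classes:
        tp = pairs[(c, c)]
        actual = sum(n for (t, _), n in pairs.items() if t == c)
        predicted = sum(n for (_, p), n in pairs.items() if p == c)
        metrics[c] = {
            "sensitivity": tp / actual if actual else float("nan"),
            "ppv": tp / predicted if predicted else float("nan"),
        }
    return metrics


# Toy labels for five proteins; not data from the study.
truth = ["nucleolar-enriched", "nucleolar-enriched", "nucleolar-enriched",
         "non-nucleolar", "non-nucleolar"]
calls = ["nucleolar-enriched", "nucleolar-enriched", "non-nucleolar",
         "non-nucleolar", "non-nucleolar"]

results = per_class_metrics(truth, calls)
for label, values in results.items():
    print(label, round(values["sensitivity"], 2), round(values["ppv"], 2))
```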
Wiśniewski, Jacek R; Mann, Matthias
2016-07-01
Proteomics and other protein-based analysis methods such as Western blotting all face the challenge of discriminating changes in the levels of proteins of interest from inadvertent changes in the amount loaded for analysis. Mass-spectrometry-based proteomics can now estimate the relative and absolute amounts of thousands of proteins across diverse biological systems. We reasoned that this new technology could prove useful for selection of very stably expressed proteins that could serve as better loading controls than those traditionally employed. Large-scale proteomic analyses of SDS lysates of cultured cells and tissues revealed deglycase DJ-1 as the protein with the lowest variability in abundance among different cell types in human, mouse, and amphibian cells. The protein constitutes 0.069 ± 0.017% of total cellular protein and occurs at a specific concentration of 34.6 ± 8.7 pmol/mg of total protein. Since DJ-1 is ubiquitous and therefore easily detectable with several peptides, it can be helpful in normalization of proteomic data sets. In addition, DJ-1 appears to be an advantageous loading control for Western blot that is superior to those commonly used, allowing comparisons between tissues and cells originating from evolutionarily distant vertebrate species. Notably, this is not possible by the detection and quantitation of housekeeping proteins, which are often used in the Western blot technique. The approach introduced here can be applied to select the most appropriate loading controls for MS-based proteomics or Western blotting in any biological system.
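The two abundance figures quoted for DJ-1 are consistent with each other, which can be checked by converting the mass fraction into a molar amount. The sketch below assumes a molecular weight of about 19.9 kDa for DJ-1; that value is not given in the abstract and is labelled as an assumption in the code.

```python
# Approximate molecular weight of DJ-1 in Da. This is an assumption made for
# the check; the abstract does not state the value it used.
DJ1_MW_DA = 19_900.0


def pmol_per_mg(mass_fraction_percent, mw_da):
    """Convert a percentage of total protein mass into pmol per mg total protein."""
    grams_per_mg_total = mass_fraction_percent / 100.0 * 1e-3  # g protein per mg total
    moles = grams_per_mg_total / mw_da
    return moles * 1e12  # mol -> pmol


estimate = pmol_per_mg(0.069, DJ1_MW_DA)
print(round(estimate, 1))  # close to the reported 34.6 pmol/mg
```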
Aptamer-Based Multiplexed Proteomic Technology for Biomarker Discovery
Gold, Larry; Ayers, Deborah; Bertino, Jennifer; Bock, Christopher; Bock, Ashley; Brody, Edward N.; Carter, Jeff; Dalby, Andrew B.; Eaton, Bruce E.; Fitzwater, Tim; Flather, Dylan; Forbes, Ashley; Foreman, Trudi; Fowler, Cate; Gawande, Bharat; Goss, Meredith; Gunn, Magda; Gupta, Shashi; Halladay, Dennis; Heil, Jim; Heilig, Joe; Hicke, Brian; Husar, Gregory; Janjic, Nebojsa; Jarvis, Thale; Jennings, Susan; Katilius, Evaldas; Keeney, Tracy R.; Kim, Nancy; Koch, Tad H.; Kraemer, Stephan; Kroiss, Luke; Le, Ngan; Levine, Daniel; Lindsey, Wes; Lollo, Bridget; Mayfield, Wes; Mehan, Mike; Mehler, Robert; Nelson, Sally K.; Nelson, Michele; Nieuwlandt, Dan; Nikrad, Malti; Ochsner, Urs; Ostroff, Rachel M.; Otis, Matt; Parker, Thomas; Pietrasiewicz, Steve; Resnicow, Daniel I.; Rohloff, John; Sanders, Glenn; Sattin, Sarah; Schneider, Daniel; Singer, Britta; Stanton, Martin; Sterkel, Alana; Stewart, Alex; Stratford, Suzanne; Vaught, Jonathan D.; Vrkljan, Mike; Walker, Jeffrey J.; Watrobka, Mike; Waugh, Sheela; Weiss, Allison; Wilcox, Sheri K.; Wolfson, Alexey; Wolk, Steven K.; Zhang, Chi; Zichi, Dom
2010-01-01
Background The interrogation of proteomes (“proteomics”) in a highly multiplexed and efficient manner remains a coveted and challenging goal in biology and medicine. Methodology/Principal Findings We present a new aptamer-based proteomic technology for biomarker discovery capable of simultaneously measuring thousands of proteins from small sample volumes (15 µL of serum or plasma). Our current assay measures 813 proteins with low limits of detection (1 pM median), 7 logs of overall dynamic range (∼100 fM–1 µM), and 5% median coefficient of variation. This technology is enabled by a new generation of aptamers that contain chemically modified nucleotides, which greatly expand the physicochemical diversity of the large randomized nucleic acid libraries from which the aptamers are selected. Proteins in complex matrices such as plasma are measured with a process that transforms a signature of protein concentrations into a corresponding signature of DNA aptamer concentrations, which is quantified on a DNA microarray. Our assay takes advantage of the dual nature of aptamers as both folded protein-binding entities with defined shapes and unique nucleotide sequences recognizable by specific hybridization probes. To demonstrate the utility of our proteomics biomarker discovery technology, we applied it to a clinical study of chronic kidney disease (CKD). We identified two well known CKD biomarkers as well as an additional 58 potential CKD biomarkers. These results demonstrate the potential utility of our technology to rapidly discover unique protein signatures characteristic of various disease states. Conclusions/Significance We describe a versatile and powerful tool that allows large-scale comparison of proteome profiles among discrete populations. 
This unbiased and highly multiplexed search engine will enable the discovery of novel biomarkers in a manner that is unencumbered by our incomplete knowledge of biology, thereby helping to advance the next generation of evidence-based medicine. PMID:21165148
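The assay figures quoted above can be sanity-checked with two short calculations: the number of logs spanned by the stated concentration range, and a coefficient of variation (CV) for replicate measurements. The replicate values below are made up for illustration and are not data from the study.

```python
import math
from statistics import mean, stdev

# Dynamic range: ~100 fM to 1 uM, expressed in mol/L.
lower_limit = 100e-15
upper_limit = 1e-6
logs_of_range = math.log10(upper_limit / lower_limit)
print(round(logs_of_range, 1))  # 7.0


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100.0


# Hypothetical replicate signals for one analyte.
replicates = [95.0, 100.0, 105.0]
print(coefficient_of_variation(replicates))  # 5.0
```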
O'Dwyer, David N; Norman, Katy C; Xia, Meng; Huang, Yong; Gurczynski, Stephen J; Ashley, Shanna L; White, Eric S; Flaherty, Kevin R; Martinez, Fernando J; Murray, Susan; Noth, Imre; Arnold, Kelly B; Moore, Bethany B
2017-04-25
Idiopathic pulmonary fibrosis (IPF) is a progressive and fatal interstitial pneumonia. The disease pathophysiology is poorly understood and the etiology remains unclear. Recent advances have generated new therapies and improved knowledge of the natural history of IPF. These gains have been brokered by advances in technology and improved insight into the role of various genes in mediating disease, but gene expression and protein levels do not always correlate. Thus, in this paper we apply a novel large scale high throughput aptamer approach to identify more than 1100 proteins in the peripheral blood of well-characterized IPF patients and normal volunteers. We use systems biology approaches to identify a unique IPF proteome signature and give insight into biological processes driving IPF. We found IPF plasma to be altered and enriched for proteins involved in defense response, wound healing and protein phosphorylation when compared to normal human plasma. Analysis also revealed a minimal protein signature that differentiated IPF patients from normal controls, which may allow for accurate diagnosis of IPF based on easily-accessible peripheral blood. This report introduces large scale unbiased protein discovery analysis to IPF and describes distinct biological processes that further inform disease biology.
Balbuena, Tiago Santana; He, Ruifeng; Salvato, Fernanda; Gang, David R.; Thelen, Jay J.
2012-01-01
Horsetail (Equisetum hyemale) is a widespread vascular plant species, whose reproduction is mainly dependent on the growth and development of the rhizomes. Due to its key evolutionary position, the identification of factors that could be involved in the existence of the rhizomatous trait may contribute to a better understanding of the role of this underground organ for the successful propagation of this and other plant species. In the present work, we characterized the proteome of E. hyemale rhizomes using a GeLC-MS spectral-counting proteomics strategy. A total of 1,911 and 1,860 non-redundant proteins were identified in the rhizome apical tip and elongation zone, respectively. Rhizome-characteristic proteins were determined by comparisons of the developing rhizome tissues to developing roots. A total of 87 proteins were found to be up-regulated in both horsetail rhizome tissues in relation to developing roots. Hierarchical clustering indicated a vast dynamic range in the regulation of the 87 characteristic proteins and revealed, based on the regulation profile, the existence of nine major protein groups. Gene ontology analyses suggested an over-representation of the terms involved in macromolecular and protein biosynthetic processes, gene expression, and nucleotide and protein binding functions. Spatial difference analysis between the rhizome apical tip and the elongation zone revealed that only eight proteins were up-regulated in the apical tip including RNA-binding proteins and an acyl carrier protein, as well as a KH domain protein and a T-complex subunit; while only seven proteins were up-regulated in the elongation zone including phosphomannomutase, galactomannan galactosyltransferase, endoglucanase 10 and 25, and mannose-1-phosphate guanyltransferase subunits alpha and beta. This is the first large-scale characterization of the proteome of a plant rhizome.
Implications of the findings were discussed in relation to other underground organs and related species. PMID:22740841
Quantitative Missense Variant Effect Prediction Using Large-Scale Mutagenesis Data.
Gray, Vanessa E; Hause, Ronald J; Luebeck, Jens; Shendure, Jay; Fowler, Douglas M
2018-01-24
Large datasets describing the quantitative effects of mutations on protein function are becoming increasingly available. Here, we leverage these datasets to develop Envision, which predicts the magnitude of a missense variant's molecular effect. Envision combines 21,026 variant effect measurements from nine large-scale experimental mutagenesis datasets, a hitherto untapped training resource, with a supervised, stochastic gradient boosting learning algorithm. Envision outperforms other missense variant effect predictors both on large-scale mutagenesis data and on an independent test dataset comprising 2,312 TP53 variants whose effects were measured using a low-throughput approach. This dataset was never used for hyperparameter tuning or model training and thus serves as an independent validation set. Envision prediction accuracy is also more consistent across amino acids than other predictors. Finally, we demonstrate that Envision's performance improves as more large-scale mutagenesis data are incorporated. We precompute Envision predictions for every possible single amino acid variant in human, mouse, frog, zebrafish, fruit fly, worm, and yeast proteomes (https://envision.gs.washington.edu/). Copyright © 2017 Elsevier Inc. All rights reserved.
Proteomics and Systems Biology: Current and Future Applications in the Nutritional Sciences
Moore, J. Bernadette; Weeks, Mark E.
2011-01-01
In the last decade, advances in genomics, proteomics, and metabolomics have yielded large-scale datasets that have driven an interest in global analyses, with the objective of understanding biological systems as a whole. Systems biology integrates computational modeling and experimental biology to predict and characterize the dynamic properties of biological systems, which are viewed as complex signaling networks. Whereas the systems analysis of disease-perturbed networks holds promise for identification of drug targets for therapy, equally the identified critical network nodes may be targeted through nutritional intervention in either a preventative or therapeutic fashion. As such, in the context of the nutritional sciences, it is envisioned that systems analysis of normal and nutrient-perturbed signaling networks in combination with knowledge of underlying genetic polymorphisms will lead to a future in which the health of individuals will be improved through predictive and preventative nutrition. Although high-throughput transcriptomic microarray data were initially most readily available and amenable to systems analysis, recent technological and methodological advances in MS have contributed to a linear increase in proteomic investigations. It is now commonplace for combined proteomic technologies to generate complex, multi-faceted datasets, and these will be the keystone of future systems biology research. This review will define systems biology, outline current proteomic methodologies, highlight successful applications of proteomics in nutrition research, and discuss the challenges for future applications of systems biology approaches in the nutritional sciences. PMID:22332076
QC-ART: A tool for real-time quality control assessment of mass spectrometry-based proteomics data.
Stanfill, Bryan A; Nakayasu, Ernesto S; Bramer, Lisa M; Thompson, Allison M; Ansong, Charles K; Clauss, Therese; Gritsenko, Marina A; Monroe, Matthew E; Moore, Ronald J; Orton, Daniel J; Piehowski, Paul D; Schepmoes, Athena A; Smith, Richard D; Webb-Robertson, Bobbie-Jo; Metz, Thomas O
2018-04-17
Liquid chromatography-mass spectrometry (LC-MS)-based proteomics studies of large sample cohorts can easily require from months to years to complete. Acquiring consistent, high-quality data in such large-scale studies is challenging because of normal variations in instrumentation performance over time, as well as artifacts introduced by the samples themselves, such as those due to collection, storage and processing. Existing quality control methods for proteomics data primarily focus on post-hoc analysis to remove low-quality data that would degrade downstream statistics; they are not designed to evaluate the data in near real-time, which would allow for interventions as soon as deviations in data quality are detected. In addition to flagging analyses that demonstrate outlier behavior, evaluating how the data structure changes over time can aid in understanding typical instrument performance or identify issues such as a degradation in data quality due to the need for instrument cleaning and/or re-calibration. To address this gap for proteomics, we developed Quality Control Analysis in Real-Time (QC-ART), a tool for evaluating data as they are acquired in order to dynamically flag potential issues with instrument performance or sample quality. QC-ART has accuracy similar to that of standard post-hoc analysis methods, with the additional benefit of real-time analysis. We demonstrate the utility and performance of QC-ART in identifying deviations in data quality due to both instrument and sample issues in near real-time for LC-MS-based plasma proteomics analyses of a sample subset of The Environmental Determinants of Diabetes in the Young cohort. We also present a case where QC-ART facilitated the identification of oxidative modifications, which are often underappreciated in proteomic experiments. Published under license by The American Society for Biochemistry and Molecular Biology, Inc.
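The idea of flagging runs as they are acquired, rather than post hoc, can be illustrated with a rolling baseline. The sketch below is not the QC-ART algorithm, whose statistics are not described in this abstract; it substitutes a simple z-score against the most recent unflagged runs. The function name, window size, cutoff, and metric values are all hypothetical.

```python
from collections import deque
from statistics import mean, stdev


def flag_runs(metric_values, baseline_size=5, z_cutoff=3.0):
    """Yield (index, value, flagged) for each run in acquisition order.

    The first `baseline_size` runs seed the baseline and are never flagged.
    Later runs are compared with the rolling baseline; flagged runs are kept
    out of it so a single bad run does not distort the reference.
    """
    baseline = deque(maxlen=baseline_size)
    for index, value in enumerate(metric_values):
        if len(baseline) < baseline_size:
            baseline.append(value)
            yield index, value, False
            continue
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma > 0:
            z = abs(value - mu) / sigma
        else:
            z = 0.0 if value == mu else float("inf")
        flagged = z > z_cutoff
        if not flagged:
            baseline.append(value)  # deque drops the oldest run automatically
        yield index, value, flagged


# Made-up per-run QC metric (e.g. number of identified peptides).
runs = [1000, 1010, 990, 1005, 995, 1002, 600, 998]
flagged_indices = [i for i, _, bad in flag_runs(runs) if bad]
print(flagged_indices)  # [6]
```

Processing runs through a generator mirrors the streaming setting: each run is judged using only the runs acquired before it.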
Proteomics meets blue biotechnology: a wealth of novelties and opportunities.
Hartmann, Erica M; Durighello, Emie; Pible, Olivier; Nogales, Balbina; Beltrametti, Fabrizio; Bosch, Rafael; Christie-Oleza, Joseph A; Armengaud, Jean
2014-10-01
Blue biotechnology, in which aquatic environments provide the inspiration for various products such as food additives, aquaculture, biosensors, green chemistry, bioenergy, and pharmaceuticals, holds enormous promise. Large-scale efforts to sequence aquatic genomes and metagenomes, as well as campaigns to isolate new organisms and culture-based screenings, are helping to push the boundaries of known organisms. Mass spectrometry-based proteomics can complement 16S gene sequencing in the effort to discover new organisms of potential relevance to blue biotechnology by facilitating the rapid screening of microbial isolates and by providing in depth profiles of the proteomes and metaproteomes of marine organisms, both model cultivable isolates and, more recently, exotic non-cultivable species and communities. Proteomics has already contributed to blue biotechnology by identifying aquatic proteins with potential applications to food fermentation, the textile industry, and biomedical drug development. In this review, we discuss historical developments in blue biotechnology, the current limitations to the known marine biosphere, and the ways in which mass spectrometry can expand that knowledge. We further speculate about directions that research in blue biotechnology will take given current and near-future technological advancements in mass spectrometry. Copyright © 2014 Elsevier B.V. All rights reserved.
Quantitative proteomic analysis reveals posttranslational responses to aneuploidy in yeast
Dephoure, Noah; Hwang, Sunyoung; O'Sullivan, Ciara; Dodgson, Stacie E; Gygi, Steven P; Amon, Angelika; Torres, Eduardo M
2014-01-01
Aneuploidy causes severe developmental defects and is a near universal feature of tumor cells. Despite its profound effects, the cellular processes affected by aneuploidy are not well characterized. Here, we examined the consequences of aneuploidy on the proteome of aneuploid budding yeast strains. We show that although protein levels largely scale with gene copy number, subunits of multi-protein complexes are notable exceptions. Posttranslational mechanisms attenuate their expression when their encoding genes are in excess. Our proteomic analyses further revealed a novel aneuploidy-associated protein expression signature characteristic of altered metabolism and redox homeostasis. Indeed aneuploid cells harbor increased levels of reactive oxygen species (ROS). Interestingly, increased protein turnover attenuates ROS levels and this novel aneuploidy-associated signature and improves the fitness of most aneuploid strains. Our results show that aneuploidy causes alterations in metabolism and redox homeostasis. Cells respond to these alterations through both transcriptional and posttranscriptional mechanisms. DOI: http://dx.doi.org/10.7554/eLife.03023.001 PMID:25073701
Principles of proteome allocation are revealed using proteomic data and genome-scale models
Yang, Laurence; Yurkovich, James T.; Lloyd, Colton J.; Ebrahim, Ali; Saunders, Michael A.; Palsson, Bernhard O.
2016-01-01
Integrating omics data to refine or make context-specific models is an active field of constraint-based modeling. Proteomics now cover over 95% of the Escherichia coli proteome by mass. Genome-scale models of Metabolism and macromolecular Expression (ME) compute proteome allocation linked to metabolism and fitness. Using proteomics data, we formulated allocation constraints for key proteome sectors in the ME model. The resulting calibrated model effectively computed the “generalist” (wild-type) E. coli proteome and phenotype across diverse growth environments. Across 15 growth conditions, prediction errors for growth rate and metabolic fluxes were 69% and 14% lower, respectively. The sector-constrained ME model thus represents a generalist ME model reflecting both growth rate maximization and “hedging” against uncertain environments and stresses, as indicated by significant enrichment of these sectors for the general stress response sigma factor σS. Finally, the sector constraints represent a general formalism for integrating omics data from any experimental condition into constraint-based ME models. The constraints can be fine-grained (individual proteins) or coarse-grained (functionally-related protein groups) as demonstrated here. This flexible formalism provides an accessible approach for narrowing the gap between the complexity captured by omics data and governing principles of proteome allocation described by systems-level models. PMID:27857205
Principles of proteome allocation are revealed using proteomic data and genome-scale models
Yang, Laurence; Yurkovich, James T.; Lloyd, Colton J.; ...
2016-11-18
Integrating omics data to refine or make context-specific models is an active field of constraint-based modeling. Proteomics now cover over 95% of the Escherichia coli proteome by mass. Genome-scale models of Metabolism and macromolecular Expression (ME) compute proteome allocation linked to metabolism and fitness. Using proteomics data, we formulated allocation constraints for key proteome sectors in the ME model. The resulting calibrated model effectively computed the “generalist” (wild-type) E. coli proteome and phenotype across diverse growth environments. Across 15 growth conditions, prediction errors for growth rate and metabolic fluxes were 69% and 14% lower, respectively. The sector-constrained ME model thus represents a generalist ME model reflecting both growth rate maximization and “hedging” against uncertain environments and stresses, as indicated by significant enrichment of these sectors for the general stress response sigma factor σS. Finally, the sector constraints represent a general formalism for integrating omics data from any experimental condition into constraint-based ME models. The constraints can be fine-grained (individual proteins) or coarse-grained (functionally-related protein groups) as demonstrated here. Furthermore, this flexible formalism provides an accessible approach for narrowing the gap between the complexity captured by omics data and governing principles of proteome allocation described by systems-level models.
Woo, Jongmin; Han, Dohyun; Wang, Joseph Injae; Park, Joonho; Kim, Hyunsoo; Kim, Youngsoo
2017-09-01
The development of systematic proteomic quantification techniques in systems biology research has enabled one to perform an in-depth analysis of cellular systems. We have developed a systematic proteomic approach that encompasses the spectrum from global to targeted analysis on a single platform. We have applied this technique to an activated microglia cell system to examine changes in the intracellular and extracellular proteomes. Microglia become activated when their homeostatic microenvironment is disrupted. There are varying degrees of microglial activation, and we chose to focus on the proinflammatory reactive state that is induced by exposure to such stimuli as lipopolysaccharide (LPS) and interferon-gamma (IFN-γ). Using an improved shotgun proteomics approach, we identified 5497 proteins in the whole-cell proteome and 4938 proteins in the secretome that were associated with the activation of BV2 mouse microglia by LPS or IFN-γ. Of the differentially expressed proteins in stimulated microglia, we classified pathways that were related to immune-inflammatory responses and metabolism. Our label-free parallel reaction monitoring (PRM) approach made it possible to comprehensively measure the hyper-multiplex quantitative value of each protein by high-resolution mass spectrometry. Over 450 peptides that corresponded to pathway proteins and direct or indirect interactors via the STRING database were quantified by label-free PRM in a single run. Moreover, we performed a longitudinal quantification of secreted proteins during microglial activation, in which neurotoxic molecules that mediate neuronal cell loss in the brain are released. These data suggest that latent pathways that are associated with neurodegenerative diseases can be discovered by constructing and analyzing a pathway network model of proteins. Furthermore, this systematic quantification platform has tremendous potential for applications in large-scale targeted analyses. 
The proteomics data for discovery and label-free PRM analysis have been deposited to the ProteomeXchange Consortium with identifiers
Gao, Yan; Lim, Teck Kwang; Lin, Qingsong; Li, Sam Fong Yau
2016-04-29
Cypermethrin (CYP) is one of the most widely used pesticides, applied on a large scale for agricultural and domestic purposes, and its residue often seriously affects aquatic systems. Environmental pollutant-induced protein changes in organisms can be detected by proteomics, leading to the discovery of potential biomarkers and an understanding of the mode of action. While CYP stress in some animal models has been well studied by proteomics, few reports have been published on the effects of CYP exposure on the algal proteome. To determine the effect of CYP on algae, the impact of various dosages (0.001 μg/L, 0.01 μg/L and 1 μg/L) of CYP on the green alga Chlorella vulgaris for 24 h and 96 h was investigated using the iTRAQ quantitative proteomics technique. A total of 162 and 198 proteins were significantly altered after CYP exposure for 24 h and 96 h, respectively. An overview of the iTRAQ results indicated that the influence of CYP on algal proteins might be dosage-dependent. Functional analysis of differentially expressed proteins showed that CYP could induce protein alterations related to photosynthesis, stress responses and carbohydrate metabolism. This study provides a comprehensive view of the complex mode of action of algae under CYP stress and highlights several potential biomarkers for further investigation of pesticide-exposed plants and algae. Copyright © 2016 Elsevier B.V. All rights reserved.
Guo, Hao-Bo; Ma, Yue; Tuskan, Gerald A.; ...
2018-01-01
The existence of complete genome sequences makes it important to develop different approaches for classification of large-scale data sets and to make extraction of biological insights easier. Here, we propose an approach for classification of complete proteomes/protein sets based on protein distributions on some basic attributes. We demonstrate the usefulness of this approach by determining protein distributions in terms of two attributes: protein lengths and protein intrinsic disorder contents (ID). The protein distributions based on L and ID are surveyed for representative proteome organisms and protein sets from the three domains of life. The two-dimensional maps (designated as fingerprints here) from the protein distribution densities in the LD space defined by ln( L ) and ID are then constructed. The fingerprints for different organisms and protein sets are found to be distinct with each other, and they can therefore be used for comparative studies. As a test case, phylogenetic trees have been constructed based on the protein distribution densities in the fingerprints of proteomes of organisms without performing any protein sequence comparison and alignments. The phylogenetic trees generated are biologically meaningful, demonstrating that the protein distributions in the LD space may serve as unique phylogenetic signals of the organisms at the proteome level.
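The fingerprint idea in this abstract can be illustrated with a minimal sketch. It assumes each protein is given as a (length, disorder fraction) pair and bins the proteins on a grid over ln(L) and ID; the function names, grid size, ln(L) range, and the Euclidean distance between fingerprints are illustrative assumptions, not details taken from the paper.

```python
import math
from collections import Counter


def ld_fingerprint(proteins, n_len_bins=4, n_id_bins=4, ln_len_range=(3.0, 9.0)):
    """Return a normalized 2D density grid over (ln(length), disorder fraction).

    proteins: iterable of (length, disorder_fraction) pairs, with
    disorder_fraction in [0, 1]. The bin counts and the ln(length) range
    are hypothetical defaults chosen for illustration only.
    """
    lo, hi = ln_len_range
    counts = Counter()
    total = 0
    for length, disorder in proteins:
        # Map ln(length) and disorder onto bin indices, clamping to the grid.
        x = (math.log(length) - lo) / (hi - lo)
        i = min(max(int(x * n_len_bins), 0), n_len_bins - 1)
        j = min(max(int(disorder * n_id_bins), 0), n_id_bins - 1)
        counts[(i, j)] += 1
        total += 1
    if total == 0:
        raise ValueError("at least one protein is required")
    # Normalize counts so the grid sums to 1 (a distribution density).
    return [[counts[(i, j)] / total for j in range(n_id_bins)]
            for i in range(n_len_bins)]


def fingerprint_distance(a, b):
    """Euclidean distance between two fingerprints of equal shape (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))
```

A pairwise distance matrix built with `fingerprint_distance` could then be passed to any standard distance-based tree-building method; which method the authors used is not stated in the abstract.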
Gu, Liqing; Robinson, Renã A. S.
2016-01-01
Cysteine is a highly reactive amino acid and is subject to a variety of reversible post-translational modifications (PTMs), including nitrosylation, glutathionylation, palmitoylation, as well as formation of sulfenic acid and disulfides. These modifications are not only involved in normal biological activities, such as enzymatic catalysis, redox signaling and cellular homeostasis, but can also be the result of oxidative damage. Especially in aging and neurodegenerative diseases, oxidative stress leads to aberrant cysteine oxidations that affect protein structure and function, leading to neurodegeneration as well as other detrimental effects. Methods that can identify cysteine modifications by type, including the site of modification, as well as the relative stoichiometry of the modification can be very helpful for understanding the role of the thiol proteome and redox homeostasis in the context of disease. Reversible cysteine modifications, however, are challenging to investigate as they are low in abundance, diverse, and labile, especially under endogenous conditions. Thanks to the development of redox proteomic approaches, large-scale quantification of reversible cysteine modifications is possible. These approaches cover a range of strategies to enrich, identify, and quantify reversible cysteine modifications from biological samples. This review will focus on nongel-based redox proteomics workflows that give quantitative information about cysteine PTMs and highlight how these strategies have been useful for investigating the redox thiol proteome in aging and neurodegenerative diseases. PMID:27666938
Advances in targeted proteomics and applications to biomedical research
Shi, Tujin; Song, Ehwang; Nie, Song; Rodland, Karin D.; Liu, Tao; Qian, Wei-Jun; Smith, Richard D.
2016-01-01
Targeted proteomics has emerged as a powerful protein quantification tool in systems biology and biomedical research, and increasingly in clinical applications. The most widely used targeted proteomics approach, selected reaction monitoring (SRM), also known as multiple reaction monitoring (MRM), can be used for quantification of cellular signaling networks and preclinical verification of candidate protein biomarkers. As an extension to our previous review on advances in SRM sensitivity, herein we review recent advances in the method and technology for further enhancing SRM sensitivity (from 2012 to present), and highlight its broad biomedical applications in human bodily fluids, tissue and cell lines. Furthermore, we also review two recently introduced targeted proteomics approaches, parallel reaction monitoring (PRM) and data-independent acquisition (DIA) with targeted data extraction on fast scanning high-resolution accurate-mass (HR/AM) instruments. Such HR/AM targeted quantification, which monitors all target product ions, addresses SRM limitations effectively in specificity and multiplexing; however, compared to SRM, PRM and DIA are still in their infancy with a limited number of applications. Thus, for HR/AM targeted quantification we focus our discussion on method development, data processing and analysis, and its advantages and limitations in targeted proteomics. Finally, general perspectives on the potential of achieving both high sensitivity and high sample throughput for large-scale quantification of hundreds of target proteins are discussed. PMID:27302376
An object model and database for functional genomics.
Jones, Andrew; Hunt, Ela; Wastling, Jonathan M; Pizarro, Angel; Stoeckert, Christian J
2004-07-10
Large-scale functional genomics analysis is now feasible and presents significant challenges in data analysis, storage and querying. Data standards are required to enable the development of public data repositories and to improve data sharing. There is an established data format for microarrays (microarray gene expression markup language, MAGE-ML) and a draft standard for proteomics (PEDRo). We believe that all types of functional genomics experiments should be annotated in a consistent manner, and we hope to open up new ways of comparing multiple datasets used in functional genomics. We have created a functional genomics experiment object model (FGE-OM), developed from the microarray model, MAGE-OM and two models for proteomics, PEDRo and our own model (Gla-PSI-Glasgow Proposal for the Proteomics Standards Initiative). FGE-OM comprises three namespaces representing (i) the parts of the model common to all functional genomics experiments; (ii) microarray-specific components; and (iii) proteomics-specific components. We believe that FGE-OM should initiate discussion about the contents and structure of the next version of MAGE and the future of proteomics standards. A prototype database called RNA And Protein Abundance Database (RAPAD), based on FGE-OM, has been implemented and populated with data from microbial pathogenesis. FGE-OM and the RAPAD schema are available from http://www.gusdb.org/fge.html, along with a set of more detailed diagrams. RAPAD can be accessed by registration at the site.
hEIDI: An Intuitive Application Tool To Organize and Treat Large-Scale Proteomics Data.
Hesse, Anne-Marie; Dupierris, Véronique; Adam, Claire; Court, Magali; Barthe, Damien; Emadali, Anouk; Masselon, Christophe; Ferro, Myriam; Bruley, Christophe
2016-10-07
Advances in high-throughput proteomics have led to a rapid increase in the number, size, and complexity of the associated data sets. Managing and extracting reliable information from such large series of data sets require the use of dedicated software organized in a consistent pipeline to reduce, validate, exploit, and ultimately export data. The compilation of multiple mass-spectrometry-based identification and quantification results obtained in the context of a large-scale project represents a real challenge for developers of bioinformatics solutions. In response to this challenge, we developed a dedicated software suite called hEIDI to manage and combine both identifications and semiquantitative data related to multiple LC-MS/MS analyses. This paper describes how, through a user-friendly interface, hEIDI can be used to compile analyses and retrieve lists of nonredundant protein groups. Moreover, hEIDI allows direct comparison of series of analyses, on the basis of protein groups, while ensuring consistent protein inference and also computing spectral counts. hEIDI ensures that validated results are compliant with MIAPE guidelines as all information related to samples and results is stored in appropriate databases. Thanks to the database structure, validated results generated within hEIDI can be easily exported in the PRIDE XML format for subsequent publication. hEIDI can be downloaded from http://biodev.extra.cea.fr/docs/heidi .
Guidelines for reporting quantitative mass spectrometry based experiments in proteomics.
Martínez-Bartolomé, Salvador; Deutsch, Eric W; Binz, Pierre-Alain; Jones, Andrew R; Eisenacher, Martin; Mayer, Gerhard; Campos, Alex; Canals, Francesc; Bech-Serra, Joan-Josep; Carrascal, Montserrat; Gay, Marina; Paradela, Alberto; Navajas, Rosana; Marcilla, Miguel; Hernáez, María Luisa; Gutiérrez-Blázquez, María Dolores; Velarde, Luis Felipe Clemente; Aloria, Kerman; Beaskoetxea, Jabier; Medina-Aunon, J Alberto; Albar, Juan P
2013-12-16
Mass spectrometry is already a well-established protein identification tool, and recent methodological and technological developments have also made possible the extraction of quantitative data on protein abundance in large-scale studies. Several strategies for absolute and relative quantitative proteomics and the statistical assessment of quantifications are possible, each having specific measurements and, therefore, different data analysis workflows. The guidelines for Mass Spectrometry Quantification allow the description of a wide range of quantitative approaches, including labeled and label-free techniques and also targeted approaches such as Selected Reaction Monitoring (SRM). The HUPO Proteomics Standards Initiative (HUPO-PSI) has invested considerable effort to improve the standardization of proteomics data handling, representation and sharing through the development of data standards, reporting guidelines, controlled vocabularies and tooling. In this manuscript, we describe a key output from the HUPO-PSI, namely the MIAPE Quant guidelines, which have been developed in parallel with the corresponding data exchange format mzQuantML [1]. The MIAPE Quant guidelines describe the HUPO-PSI proposal concerning the minimum information to be reported when a quantitative data set, derived from mass spectrometry (MS), is submitted to a database or as supplementary information to a journal. The guidelines have been developed with input from a broad spectrum of stakeholders in the proteomics field to represent a true consensus view of the most important data types and metadata required for a quantitative experiment to be analyzed critically or a data analysis pipeline to be reproduced. It is anticipated that they will influence or be directly adopted as part of journal guidelines for publication and by public proteomics databases and thus may have an impact on proteomics laboratories across the world.
This article is part of a Special Issue entitled: Standardization and Quality Control. Copyright © 2013 Elsevier B.V. All rights reserved.
Vanderperre, Benoît; Lucier, Jean-François; Bissonnette, Cyntia; Motard, Julie; Tremblay, Guillaume; Vanderperre, Solène; Wisztorski, Maxence; Salzet, Michel; Boisvert, François-Michel; Roucou, Xavier
2013-01-01
A fully mature mRNA is usually associated to a reference open reading frame encoding a single protein. Yet, mature mRNAs contain unconventional alternative open reading frames (AltORFs) located in untranslated regions (UTRs) or overlapping the reference ORFs (RefORFs) in non-canonical +2 and +3 reading frames. Although recent ribosome profiling and footprinting approaches have suggested the significant use of unconventional translation initiation sites in mammals, direct evidence of large-scale alternative protein expression at the proteome level is still lacking. To determine the contribution of alternative proteins to the human proteome, we generated a database of predicted human AltORFs revealing a new proteome mainly composed of small proteins with a median length of 57 amino acids, compared to 344 amino acids for the reference proteome. We experimentally detected a total of 1,259 alternative proteins by mass spectrometry analyses of human cell lines, tissues and fluids. In plasma and serum, alternative proteins represent up to 55% of the proteome and may be a potential unsuspected new source for biomarkers. We observed constitutive co-expression of RefORFs and AltORFs from endogenous genes and from transfected cDNAs, including tumor suppressor p53, and provide evidence that out-of-frame clones representing AltORFs are mistakenly rejected as false positive in cDNAs screening assays. Functional importance of alternative proteins is strongly supported by significant evolutionary conservation in vertebrates, invertebrates, and yeast. Our results imply that coding of multiple proteins in a single gene by the use of AltORFs may be a common feature in eukaryotes, and confirm that translation of unconventional ORFs generates an as yet unexplored proteome. PMID:23950983
Spectrum-to-Spectrum Searching Using a Proteome-wide Spectral Library
Yen, Chia-Yu; Houel, Stephane; Ahn, Natalie G.; Old, William M.
2011-01-01
The unambiguous assignment of tandem mass spectra (MS/MS) to peptide sequences remains a key unsolved problem in proteomics. Spectral library search strategies have emerged as a promising alternative for peptide identification, in which MS/MS spectra are directly compared against a reference library of confidently assigned spectra. Two problems relate to library size. First, reference spectral libraries are limited to rediscovery of previously identified peptides and are not applicable to new peptides, because of their incomplete coverage of the human proteome. Second, problems arise when searching a spectral library the size of the entire human proteome. We observed that traditional dot product scoring methods do not scale well with spectral library size, showing reduction in sensitivity when library size is increased. We show that this problem can be addressed by optimizing scoring metrics for spectrum-to-spectrum searches with large spectral libraries. MS/MS spectra for the 1.3 million predicted tryptic peptides in the human proteome are simulated using a kinetic fragmentation model (MassAnalyzer version 2.1) to create a proteome-wide simulated spectral library. Searches of the simulated library increase MS/MS assignments by 24% compared with Mascot, when using probabilistic and rank based scoring methods. The proteome-wide coverage of the simulated library leads to 11% increase in unique peptide assignments, compared with parallel searches of a reference spectral library. Further improvement is attained when reference spectra and simulated spectra are combined into a hybrid spectral library, yielding 52% increased MS/MS assignments compared with Mascot searches. Our study demonstrates the advantages of using probabilistic and rank based scores to improve performance of spectrum-to-spectrum search strategies. PMID:21532008
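For readers unfamiliar with the "traditional dot product scoring" that this abstract uses as its baseline, a minimal sketch follows. It assumes spectra are lists of (m/z, intensity) peaks binned at a fixed width and scored by a normalized dot product (cosine similarity); the function names, the 1.0 bin width, and the toy library are hypothetical. This is the baseline being criticized, not the probabilistic or rank-based scores the authors propose.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (bin width is an assumed value)."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz / bin_width)] += intensity
    return dict(binned)


def normalized_dot_product(query, reference, bin_width=1.0):
    """Cosine similarity between two binned spectra; 0.0 if either spectrum is empty."""
    q = bin_spectrum(query, bin_width)
    r = bin_spectrum(reference, bin_width)
    dot = sum(value * r.get(key, 0.0) for key, value in q.items())
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in r.values())))
    return dot / norm if norm else 0.0


def best_match(query, library):
    """Return the library entry name with the highest normalized dot product."""
    return max(library, key=lambda name: normalized_dot_product(query, library[name]))
```

Because every library spectrum competes on the same similarity scale, the chance that an unrelated spectrum scores highly grows with library size, which is consistent with the loss of sensitivity the abstract reports for large libraries.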
Zhan, Xianquan; Yang, Haiyan; Peng, Fang; Li, Jianglin; Mu, Yun; Long, Ying; Cheng, Tingting; Huang, Yuda; Li, Zhao; Lu, Miaolong; Li, Na; Li, Maoyu; Liu, Jianping; Jungblut, Peter R
2018-04-01
Two-dimensional gel electrophoresis (2DE) in proteomics is traditionally assumed to contain only one or two proteins in each 2DE spot. However, 2DE resolution is being complemented by the rapid development of high sensitivity mass spectrometers. Here we compared MALDI-MS, LC-Q-TOF MS and LC-Orbitrap Velos MS for the identification of proteins within one spot. With LC-Orbitrap Velos MS each Coomassie Blue-stained 2DE spot contained an average of at least 42 and 63 proteins/spot in an analysis of a human glioblastoma proteome and a human pituitary adenoma proteome, respectively, if a single gel spot was analyzed. If a pool of three matched gel spots was analyzed, this number further increased up to an average of 230 and 118 proteins/spot for the glioblastoma and pituitary adenoma proteomes, respectively. Multiple proteins per spot confirm the necessity of isotopic labeling in large-scale quantification of different protein species in a proteome. Furthermore, a protein abundance analysis revealed that most of the identified proteins in each analyzed 2DE spot were low-abundance proteins. Many proteins were present in several of the analyzed spots, showing the ability of 2DE-MS to separate at the protein species level. Therefore, 2DE coupled with high-sensitivity LC-MS has a clearly higher sensitivity than expected until now to detect, identify and quantify low abundance proteins in a complex human proteome with an estimated resolution of about 500 000 protein species. This clearly exceeds the resolution power of bottom-up LC-MS investigations. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
NASA Astrophysics Data System (ADS)
Khatri, Kshitij; Pu, Yi; Klein, Joshua A.; Wei, Juan; Costello, Catherine E.; Lin, Cheng; Zaia, Joseph
2018-04-01
Analysis of singly glycosylated peptides has evolved to a point where large-scale LC-MS analyses can be performed at almost the same scale as proteomics experiments. While collisionally activated dissociation (CAD) remains the mainstay of bottom-up analyses, it performs poorly for the middle-down analysis of multiply glycosylated peptides. With improvements in instrumentation, electron-activated dissociation (ExD) modes are becoming increasingly prevalent for proteomics experiments and for the analysis of fragile modifications such as glycosylation. While these methods have been applied for glycopeptide analysis in isolated studies, an organized effort to compare their efficiencies, particularly for analysis of multiply glycosylated peptides (termed here middle-down glycoproteomics), has not been made. We therefore compared the performance of different ExD modes for middle-down glycopeptide analyses. We identified key features among the different dissociation modes and show that increased electron energy and supplemental activation provide the most useful data for middle-down glycopeptide analysis.
Identification of Phosphorylated Proteins on a Global Scale.
Iliuk, Anton
2018-05-31
Liquid chromatography (LC) coupled with tandem mass spectrometry (MS/MS) has enabled researchers to analyze complex biological samples with unprecedented depth. It facilitates the identification and quantification of modifications within thousands of proteins in a single large-scale proteomic experiment. Analysis of phosphorylation, one of the most common and important post-translational modifications, has particularly benefited from such progress in the field. Here, detailed protocols are provided for a few well-regarded, common sample preparation methods for an effective phosphoproteomic experiment. © 2018 by John Wiley & Sons, Inc.
Chen, Xiang; Velliste, Meel; Murphy, Robert F.
2010-01-01
Proteomics, the large scale identification and characterization of many or all proteins expressed in a given cell type, has become a major area of biological research. In addition to information on protein sequence, structure and expression levels, knowledge of a protein’s subcellular location is essential to a complete understanding of its functions. Currently subcellular location patterns are routinely determined by visual inspection of fluorescence microscope images. We review here research aimed at creating systems for automated, systematic determination of location. These employ numerical feature extraction from images, feature reduction to identify the most useful features, and various supervised learning (classification) and unsupervised learning (clustering) methods. These methods have been shown to perform significantly better than human interpretation of the same images. When coupled with technologies for tagging large numbers of proteins and high-throughput microscope systems, the computational methods reviewed here enable the new subfield of location proteomics. This subfield will make critical contributions in two related areas. First, it will provide structured, high-resolution information on location to enable Systems Biology efforts to simulate cell behavior from the gene level on up. Second, it will provide tools for Cytomics projects aimed at characterizing the behaviors of all cell types before, during and after the onset of various diseases. PMID:16752421
Yang, Laurence; Tan, Justin; O'Brien, Edward J; Monk, Jonathan M; Kim, Donghyuk; Li, Howard J; Charusanti, Pep; Ebrahim, Ali; Lloyd, Colton J; Yurkovich, James T; Du, Bin; Dräger, Andreas; Thomas, Alex; Sun, Yuekai; Saunders, Michael A; Palsson, Bernhard O
2015-08-25
Finding the minimal set of gene functions needed to sustain life is of both fundamental and practical importance. Minimal gene lists have been proposed by using comparative genomics-based core proteome definitions. A definition of a core proteome that is supported by empirical data, is understood at the systems-level, and provides a basis for computing essential cell functions is lacking. Here, we use a systems biology-based genome-scale model of metabolism and expression to define a functional core proteome consisting of 356 gene products, accounting for 44% of the Escherichia coli proteome by mass based on proteomics data. This systems biology core proteome includes 212 genes not found in previous comparative genomics-based core proteome definitions, accounts for 65% of known essential genes in E. coli, and has 78% gene function overlap with minimal genomes (Buchnera aphidicola and Mycoplasma genitalium). Based on transcriptomics data across environmental and genetic backgrounds, the systems biology core proteome is significantly enriched in nondifferentially expressed genes and depleted in differentially expressed genes. Compared with the noncore, core gene expression levels are also similar across genetic backgrounds (two times higher Spearman rank correlation) and exhibit significantly more complex transcriptional and posttranscriptional regulatory features (40% more transcription start sites per gene, 22% longer 5'UTR). Thus, genome-scale systems biology approaches rigorously identify a functional core proteome needed to support growth. This framework, validated by using high-throughput datasets, facilitates a mechanistic understanding of systems-level core proteome function through in silico models; it de facto defines a paleome.
Knoll-Gellida, Anja; André, Michèle; Gattegno, Tamar; Forgue, Jean; Admon, Arie; Babin, Patrick J
2006-01-01
Background: The ability of an oocyte to develop into a viable embryo depends on the accumulation of specific maternal information and molecules, such as RNAs and proteins. A serial analysis of gene expression (SAGE) was carried out in parallel with proteomic analysis on fully-grown ovarian follicles from zebrafish (Danio rerio). The data obtained were compared with ovary/follicle/egg molecular phenotypes of other animals, published or available in public sequence databases. Results: Sequencing of 27,486 SAGE tags identified 11,399 different ones, including 3,329 tags with an occurrence greater than one. Fifty-eight genes were expressed at over 0.15% of the total population and represented 17.34% of the mRNA population identified. The three most expressed transcripts were a rhamnose-binding lectin, beta-actin 2, and a transcribed locus similar to the H2B histone family. Comparison with the large-scale expressed sequence tags sequencing approach revealed highly expressed transcripts that were not previously known to be expressed at high levels in fish ovaries, like the short-sized polarized metallothionein 2 transcript. A higher sensitivity for the detection of transcripts with a characterized maternal genetic contribution was also demonstrated compared to large-scale sequencing of cDNA libraries. Ferritin heavy polypeptide 1, heat shock protein 90-beta, lactate dehydrogenase B4, beta-actin isoforms, tubulin beta 2, ATP synthase subunit 9, together with 40 S ribosomal protein S27a, were common highly-expressed transcripts of vertebrate ovary/unfertilized egg. Comparison of transcriptome and proteome data revealed that transcript levels provide little predictive value with respect to the extent of protein abundance. All the proteins identified by proteomic analysis of fully-grown zebrafish follicles had at least one transcript counterpart, with two exceptions: eosinophil chemotactic cytokine and nothepsin.
Conclusion: This study provides a complete sequence data set of maternal mRNA stored in zebrafish germ cells at the end of oogenesis. This catalogue contains highly-expressed transcripts that are part of a vertebrate ovarian expressed gene signature. Comparison of transcriptome and proteome data identified downregulated transcripts or proteins potentially incorporated in the oocyte by endocytosis. The molecular phenotype described provides groundwork for future experimental approaches aimed at identifying functionally important stored maternal transcripts and proteins involved in oogenesis and early stages of embryo development. PMID:16526958
Proteome-scale human interactomics
Luck, Katja; Sheynkman, Gloria M.; Zhang, Ivy; Vidal, Marc
2017-01-01
Cellular functions are mediated by complex interactome networks of physical, biochemical, and functional interactions between DNA sequences, RNA molecules, proteins, lipids, and small metabolites. A thorough understanding of cellular organization requires accurate and relatively complete models of interactome networks at proteome-scale. The recent publication of four human protein-protein interaction (PPI) maps represents a technological breakthrough and an unprecedented resource for the scientific community, heralding a new era of proteome-scale human interactomics. Our knowledge gained from these and complementary studies provides fresh insights into the opportunities and challenges when analyzing systematically generated interactome data, defines a clear roadmap towards the generation of a first reference interactome, and reveals new perspectives on the organization of cellular life. PMID:28284537
Boyanova, Desislava; Nilla, Santosh; Klau, Gunnar W.; Dandekar, Thomas; Müller, Tobias; Dittrich, Marcus
2014-01-01
The continuously evolving field of proteomics produces increasing amounts of data while improving the quality of protein identifications. Albeit quantitative measurements are becoming more popular, many proteomic studies are still based on non-quantitative methods for protein identification. These studies result in potentially large sets of identified proteins, where the biological interpretation of proteins can be challenging. Systems biology develops innovative network-based methods, which allow an integrated analysis of these data. Here we present a novel approach, which combines prior knowledge of protein-protein interactions (PPI) with proteomics data using functional similarity measurements of interacting proteins. This integrated network analysis exactly identifies network modules with a maximal consistent functional similarity reflecting biological processes of the investigated cells. We validated our approach on small (H9N2 virus-infected gastric cells) and large (blood constituents) proteomic data sets. Using this novel algorithm, we identified characteristic functional modules in virus-infected cells, comprising key signaling proteins (e.g. the stress-related kinase RAF1) and demonstrate that this method allows a module-based functional characterization of cell types. Analysis of a large proteome data set of blood constituents resulted in clear separation of blood cells according to their developmental origin. A detailed investigation of the T-cell proteome further illustrates how the algorithm partitions large networks into functional subnetworks each representing specific cellular functions. These results demonstrate that the integrated network approach not only allows a detailed analysis of proteome networks but also yields a functional decomposition of complex proteomic data sets and thereby provides deeper insights into the underlying cellular processes of the investigated system. PMID:24807868
Design and analysis issues in quantitative proteomics studies.
Karp, Natasha A; Lilley, Kathryn S
2007-09-01
Quantitative proteomics is the comparison of distinct proteomes which enables the identification of protein species which exhibit changes in expression or post-translational state in response to a given stimulus. Many different quantitative techniques are being utilized and generate large datasets. Independent of the technique used, these large datasets need robust data analysis to ensure valid conclusions are drawn from such studies. Approaches to address the problems that arise with large datasets are discussed to give insight into the types of statistical analyses of data appropriate for the various experimental strategies that can be employed by quantitative proteomic studies. This review also highlights the importance of employing a robust experimental design and highlights various issues surrounding the design of experiments. The concepts and examples discussed within will show how robust design and analysis will lead to confident results that will ensure quantitative proteomics delivers.
Wang, Jianguo; Xie, Haiyang; Li, Jie; Cao, Jili; Zhou, Lin; Zheng, Shusen
2016-01-01
More accurate biomarkers have long been desired for hepatocellular carcinoma (HCC). Here, we characterized global large-scale proteomics of multistep hepatocarcinogenesis in an attempt to identify novel biomarkers for HCC. Quantitative data of 37874 sequences and 3017 proteins during hepatocarcinogenesis were obtained in cohort 1 of 75 samples (5 pooled groups: normal livers, hepatitis livers, cirrhotic livers, peritumoral livers, and HCC tissues) by iTRAQ 2D LC-MS/MS. The diagnostic performance of the top six most upregulated proteins in the HCC group, with HSP70 as a reference, was subsequently validated in cohort 2 of 114 samples (hepatocarcinogenesis from normal livers to HCC) using immunohistochemistry. Of seven candidate protein markers, PARP1, GS and NDRG1 showed the optimal diagnostic performance for HCC. PARP1, as a novel marker, showed comparable diagnostic performance to that of classic markers GS and NDRG1 in HCC (AUCs = 0.872, 0.856 and 0.792, respectively). A significantly higher AUC of 0.945 was achieved when the three markers were combined. For diagnosis of HCC, the sensitivity and specificity were 88.2% and 81.0% when at least two of the markers were positive. Similar diagnostic values of PARP1, GS and NDRG1 were confirmed by immunohistochemistry in cohort 3 of 180 HCC patients. Further analysis indicated that PARP1 and NDRG1 were associated with some clinicopathological features and were independent prognostic factors for HCC patients. Overall, global large-scale proteomic data spanning the spectrum of multistep hepatocarcinogenesis were obtained. PARP1 is a novel promising diagnostic/prognostic marker for HCC, and the three-marker panel (PARP1, GS and NDRG1) with excellent diagnostic performance for HCC was established. PMID:26883192
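The "at least two of the markers were positive" decision rule in the abstract above can be illustrated with a minimal sketch. The function names and the toy samples below are hypothetical and are not taken from the study; they only show how sensitivity and specificity follow from a k-of-n panel rule.

```python
def panel_positive(marker_calls, min_positive=2):
    """Return True when at least `min_positive` markers are called positive."""
    return sum(bool(call) for call in marker_calls) >= min_positive


def sensitivity_specificity(samples, min_positive=2):
    """Compute (sensitivity, specificity) for a k-of-n marker panel.

    samples: iterable of (marker_calls, is_hcc) pairs, where marker_calls
    holds one positive/negative call per marker (e.g. PARP1, GS, NDRG1).
    """
    tp = fn = tn = fp = 0
    for calls, is_hcc in samples:
        predicted = panel_positive(calls, min_positive)
        if is_hcc and predicted:
            tp += 1
        elif is_hcc:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy data (invented for illustration only, not study results).
toy_samples = [
    ((1, 1, 0), True),
    ((1, 1, 1), True),
    ((0, 1, 0), True),   # HCC sample missed by the 2-of-3 rule
    ((0, 0, 0), False),
    ((1, 0, 0), False),
    ((1, 1, 0), False),  # non-HCC sample called positive
]
sens, spec = sensitivity_specificity(toy_samples)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Raising `min_positive` trades sensitivity for specificity, which is why a panel threshold is usually chosen on a validation cohort rather than fixed in advance.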
Rea, Giuseppina; Cristofaro, Francesco; Pani, Giuseppe; Pascucci, Barbara; Ghuge, Sandip A; Corsetto, Paola Antonia; Imbriani, Marcello; Visai, Livia; Rizzo, Angela M
2016-03-30
Space is a hostile environment characterized by high vacuum, extreme temperatures, meteoroids, space debris, ionospheric plasma, microgravity and space radiation, which all represent risks for human health. A deep understanding of the biological consequences of exposure to the space environment is required to design efficient countermeasures to minimize their negative impact on human health. Recently, proteomic approaches have received a significant amount of attention in the effort to further study microgravity-induced physiological changes. In this review, we summarize the current knowledge about the effects of microgravity on microorganisms (in particular Cupriavidus metallidurans CH34, Bacillus cereus and Rhodospirillum rubrum S1H), plants (whole plants, organs, and cell cultures), mammalian cells (endothelial cells, bone cells, chondrocytes, muscle cells, thyroid cancer cells, immune system cells) and animals (invertebrates, vertebrates and mammals). Herein, we describe their proteome's response to microgravity, focusing on proteomic discoveries and their future potential applications in space research. Space experiments and operational flight experience have identified detrimental effects on human health and performance because of exposure to weightlessness, even when currently available countermeasures are implemented. Many experimental tools and methods have been developed to study microgravity-induced physiological changes. Recently, genomic and proteomic approaches have received a significant amount of attention. This review summarizes the recent research studies of the proteome response to microgravity in microorganisms, plants, mammalian cells and animals. Current proteomic tools allow large-scale, high-throughput analyses for the detection, identification, and functional investigation of all proteomes. 
Understanding gene and/or protein expression is the key to unlocking the mechanisms behind microgravity-induced problems and to finding effective countermeasures to spaceflight-induced alterations but also for the study of diseases on earth. Future perspectives are also highlighted. Copyright © 2015 Elsevier B.V. All rights reserved.
Friedman, David B
2012-01-01
All quantitative proteomics experiments measure variation between samples. When performing large-scale experiments that involve multiple conditions or treatments, the experimental design should include the appropriate number of individual biological replicates from each condition to enable the distinction between a relevant biological signal from technical noise. Multivariate statistical analyses, such as principal component analysis (PCA), provide a global perspective on experimental variation, thereby enabling the assessment of whether the variation describes the expected biological signal or the unanticipated technical/biological noise inherent in the system. Examples will be shown from high-resolution multivariable DIGE experiments where PCA was instrumental in demonstrating biologically significant variation as well as sample outliers, fouled samples, and overriding technical variation that would not be readily observed using standard univariate tests.
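As a rough illustration of how PCA can give a global view of sample variation, the sketch below extracts the first principal component by power iteration using only the Python standard library. The intensity matrix is invented toy data (rows are samples, columns are proteins), and the helper names are hypothetical; a real analysis would normally use a dedicated statistics package.

```python
import math


def center_columns(rows):
    """Subtract each column (protein) mean so PCA describes variation only."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) of PC1 via power iteration on X^T X."""
    x = center_columns(rows)
    p = len(x[0])
    # Non-uniform start vector; assumed not orthogonal to PC1 for this data.
    v = [1.0 / (j + 1) for j in range(p)]
    for _ in range(iterations):
        xv = [sum(a * b for a, b in zip(row, v)) for row in x]
        w = [sum(xv[i] * x[i][j] for i in range(len(x))) for j in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    scores = [sum(a * b for a, b in zip(row, v)) for row in x]
    return v, scores


# Toy log-intensities: three control samples followed by three treated samples.
toy_matrix = [
    [10.0, 5.0, 7.0],
    [10.1, 5.1, 7.0],
    [9.9, 4.9, 7.1],
    [12.0, 5.0, 7.0],
    [12.1, 5.1, 6.9],
    [11.9, 4.9, 7.0],
]
loadings, scores = first_principal_component(toy_matrix)
for label, score in zip(["ctrl"] * 3 + ["treated"] * 3, scores):
    print(label, round(score, 3))
```

In this toy case the two groups fall on opposite sides of PC1. A sample whose score sits far from its own group would be a candidate outlier worth inspecting before any univariate testing.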
Barkla, Bronwyn J
2018-01-01
Free flow zonal electrophoresis (FFZE) is a versatile, reproducible, and potentially high-throughput technique for the separation of plant organelles and membranes by differences in membrane surface charge. It offers considerable benefits over traditional fractionation techniques, such as density gradient centrifugation and two-phase partitioning, as it is relatively fast, sample recovery is high, and the method provides unparalleled sample purity. It has been used to successfully purify chloroplasts and mitochondria from plants but also, to obtain highly pure fractions of plasma membrane, tonoplast, ER, Golgi, and thylakoid membranes. Application of the technique can significantly improve protein coverage in large-scale proteomics studies by decreasing sample complexity. Here, we describe the method for the fractionation of plant cellular membranes from leaves by FFZE.
A genome-wide structure-based survey of nucleotide binding proteins in M. tuberculosis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bhagavat, Raghu; Kim, Heung -Bok; Kim, Chang -Yub
Nucleoside tri-phosphates (NTP) form an important class of small molecule ligands that participate in, and are essential to a large number of biological processes. Here, we seek to identify the NTP binding proteome (NTPome) in M. tuberculosis (M.tb), a deadly pathogen. Identifying the NTPome is useful not only for gaining functional insights of the individual proteins but also for identifying useful drug targets. From an earlier study, we had structural models of M.tb at a proteome scale from which a set of 13,858 small molecule binding pockets were identified. We use a set of NTP binding sub-structural motifs derived from a previous study and scan the M.tb pocketome, and find that 1,768 proteins or 43% of the proteome can theoretically bind NTP ligands. Using an experimental proteomics approach involving dye-ligand affinity chromatography, we confirm NTP binding to 47 different proteins, of which 4 are hypothetical proteins. Our analysis also provides the precise list of binding site residues in each case, and the probable ligand binding pose. In conclusion, as the list includes a number of known and potential drug targets, the identification of NTP binding can directly facilitate structure-based drug design of these targets.
Kim, Young-Ha; Islam, Mohammad Saiful; You, Myung-Jo
2015-01-01
Proteomic tools allow large-scale, high-throughput analyses for the detection, identification, and functional investigation of proteome. For detection of antigens from Haemaphysalis longicornis, 1-dimensional electrophoresis (1-DE) quantitative immunoblotting technique combined with 2-dimensional electrophoresis (2-DE) immunoblotting was used for whole body proteins from unfed and partially fed female ticks. Reactivity bands and 2-DE immunoblotting were performed following 2-DE electrophoresis to identify protein spots. The proteome of the partially fed female had a larger number of lower molecular weight proteins than that of the unfed female tick. The total number of detected spots was 818 for unfed and 670 for partially fed female ticks. The 2-DE immunoblotting identified 10 antigenic spots from unfed females and 8 antigenic spots from partially fed females. Matrix Assisted Laser Desorption Ionization-Time of Flight Mass Spectrometry (MALDI-TOF) of relevant spots identified calreticulin, putative secreted WC salivary protein, and a conserved hypothetical protein from the National Center for Biotechnology Information and Swiss Prot protein sequence databases. These findings indicate that most of the whole body components of these ticks are non-immunogenic. The data reported here will provide guidance in the identification of antigenic proteins to prevent infestation and diseases transmitted by H. longicornis. PMID:25748713
A genome-wide structure-based survey of nucleotide binding proteins in M. tuberculosis
Bhagavat, Raghu; Kim, Heung -Bok; Kim, Chang -Yub; ...
2017-10-02
Nucleoside tri-phosphates (NTP) form an important class of small molecule ligands that participate in, and are essential to a large number of biological processes. Here, we seek to identify the NTP binding proteome (NTPome) in M. tuberculosis (M.tb), a deadly pathogen. Identifying the NTPome is useful not only for gaining functional insights of the individual proteins but also for identifying useful drug targets. From an earlier study, we had structural models of M.tb at a proteome scale from which a set of 13,858 small molecule binding pockets were identified. We use a set of NTP binding sub-structural motifs derived from a previous study and scan the M.tb pocketome, and find that 1,768 proteins or 43% of the proteome can theoretically bind NTP ligands. Using an experimental proteomics approach involving dye-ligand affinity chromatography, we confirm NTP binding to 47 different proteins, of which 4 are hypothetical proteins. Our analysis also provides the precise list of binding site residues in each case, and the probable ligand binding pose. In conclusion, as the list includes a number of known and potential drug targets, the identification of NTP binding can directly facilitate structure-based drug design of these targets.
Nasir, Arshan; Kim, Kyung Mo; Caetano-Anollés, Gustavo
2017-01-01
Untangling the origin and evolution of viruses remains a challenging proposition. We recently studied the global distribution of protein domain structures in thousands of completely sequenced viral and cellular proteomes with comparative genomics, phylogenomics, and multidimensional scaling methods. A tree of life describing the evolution of proteomes revealed viruses emerging from the base of the tree as a fourth supergroup of life. A tree of domains indicated an early origin of modern viral lineages from ancient cells that co-existed with the cellular ancestors. However, it was recently argued that the rooting of our trees and the basal placement of viruses was artifactually induced by small genome (proteome) size. Here we show that these claims arise from misunderstanding and misinterpretations of cladistic methodology. Trees are reconstructed unrooted, and thus, their topologies cannot be distorted a posteriori by the rooting methodology. Tracing proteome size in trees and multidimensional views of evolutionary relationships as well as tests of leaf stability and exclusion/inclusion of taxa demonstrated that the smallest proteomes were neither attracted toward the root nor caused any topological distortions of the trees. Simulations confirmed that taxa clustering patterns were independent of proteome size and were determined by the presence of known evolutionary relatives in data matrices, highlighting the need for broader taxon sampling in phylogeny reconstruction. Instead, phylogenetic tracings of proteome size revealed a slowdown in innovation of the structural domain vocabulary and four regimes of allometric scaling that reflected a Heaps law. These regimes explained increasing economies of scale in the evolutionary growth and accretion of kernel proteome repertoires of viruses and cellular organisms that resemble growth of human languages with limited vocabulary sizes. 
Results reconcile dynamic and static views of domain frequency distributions that are consistent with the axiom of spatiotemporal continuity that is a tenet of evolutionary thinking. PMID:28690608
2013-01-01
Background: The goal of many proteomics experiments is to determine the abundance of proteins in biological samples, and the variation thereof in various physiological conditions. High-throughput quantitative proteomics, specifically label-free LC-MS/MS, allows rapid measurement of thousands of proteins, enabling large-scale studies of various biological systems. Prior to analyzing these information-rich datasets, raw data must undergo several computational processing steps. We present a method to address one of the essential steps in proteomics data processing - the matching of peptide measurements across samples. Results: We describe a novel method for label-free proteomics data alignment with the ability to incorporate previously unused aspects of the data, particularly ion mobility drift times and product ion information. We compare the results of our alignment method to PEPPeR and OpenMS, and compare alignment accuracy achieved by different versions of our method utilizing various data characteristics. Our method results in increased match recall rates and similar or improved mismatch rates compared to PEPPeR and OpenMS feature-based alignment. We also show that the inclusion of drift time and product ion information results in higher recall rates and more confident matches, without increases in error rates. Conclusions: Based on the results presented here, we argue that the incorporation of ion mobility drift time and product ion information are worthy pursuits. Alignment methods should be flexible enough to utilize all available data, particularly with recent advancements in experimental separation methods. PMID:24341404
Benjamin, Ashlee M; Thompson, J Will; Soderblom, Erik J; Geromanos, Scott J; Henao, Ricardo; Kraus, Virginia B; Moseley, M Arthur; Lucas, Joseph E
2013-12-16
The goal of many proteomics experiments is to determine the abundance of proteins in biological samples, and the variation thereof in various physiological conditions. High-throughput quantitative proteomics, specifically label-free LC-MS/MS, allows rapid measurement of thousands of proteins, enabling large-scale studies of various biological systems. Prior to analyzing these information-rich datasets, raw data must undergo several computational processing steps. We present a method to address one of the essential steps in proteomics data processing--the matching of peptide measurements across samples. We describe a novel method for label-free proteomics data alignment with the ability to incorporate previously unused aspects of the data, particularly ion mobility drift times and product ion information. We compare the results of our alignment method to PEPPeR and OpenMS, and compare alignment accuracy achieved by different versions of our method utilizing various data characteristics. Our method results in increased match recall rates and similar or improved mismatch rates compared to PEPPeR and OpenMS feature-based alignment. We also show that the inclusion of drift time and product ion information results in higher recall rates and more confident matches, without increases in error rates. Based on the results presented here, we argue that the incorporation of ion mobility drift time and product ion information are worthy pursuits. Alignment methods should be flexible enough to utilize all available data, particularly with recent advancements in experimental separation methods.
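To make the idea of matching peptide measurements across samples concrete, here is a minimal sketch of tolerance-based feature matching that uses m/z, retention time, and ion mobility drift time together. This is a simple greedy matcher written for illustration; it is not the authors' alignment method, and the `Feature` fields, tolerance values, and toy features are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    mz: float     # mass-to-charge ratio
    rt: float     # retention time (minutes)
    drift: float  # ion mobility drift time (arbitrary units)


def match_features(run_a, run_b, mz_ppm=10.0, rt_tol=0.5, drift_tol=2.0):
    """Greedily pair each feature in run_a with its closest unused feature in run_b.

    A candidate must fall within every tolerance (all assumed positive);
    among candidates, the smallest sum of tolerance-normalized differences wins.
    Returns a list of (index_in_a, index_in_b) pairs.
    """
    matches = []
    used_b = set()
    for i, a in enumerate(run_a):
        best_j, best_dist = None, None
        for j, b in enumerate(run_b):
            if j in used_b:
                continue
            ppm = abs(a.mz - b.mz) / a.mz * 1e6
            d_rt = abs(a.rt - b.rt)
            d_drift = abs(a.drift - b.drift)
            if ppm > mz_ppm or d_rt > rt_tol or d_drift > drift_tol:
                continue
            dist = ppm / mz_ppm + d_rt / rt_tol + d_drift / drift_tol
            if best_dist is None or dist < best_dist:
                best_j, best_dist = j, dist
        if best_j is not None:
            used_b.add(best_j)
            matches.append((i, best_j))
    return matches


# Two co-eluting, near-isobaric features that differ mainly in drift time.
run_a = [Feature(500.2500, 20.0, 30.0), Feature(500.2510, 20.1, 45.0)]
run_b = [Feature(500.2505, 20.1, 44.5), Feature(500.2502, 20.05, 30.4)]
print(match_features(run_a, run_b))
```

In the toy runs, m/z and retention time alone leave both pairings plausible, while the drift-time tolerance rules out the wrong partner. That is the intuition behind using the extra dimension.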
Nanoliter-Scale Oil-Air-Droplet Chip-Based Single Cell Proteomic Analysis.
Li, Zi-Yi; Huang, Min; Wang, Xiu-Kun; Zhu, Ying; Li, Jin-Song; Wong, Catherine C L; Fang, Qun
2018-04-17
Single cell proteomic analysis provides crucial information on cellular heterogeneity in biological systems. Herein, we describe a nanoliter-scale oil-air-droplet (OAD) chip for achieving multistep complex sample pretreatment and injection for single cell proteomic analysis in the shotgun mode. By using miniaturized stationary droplet microreaction and manipulation techniques, our system allows all sample pretreatment and injection procedures to be performed in a nanoliter-scale droplet with minimum sample loss and a high sample injection efficiency (>99%), thus substantially increasing the analytical sensitivity for single cell samples. We applied the present system in the proteomic analysis of 100 ± 10, 50 ± 5, 10, and 1 HeLa cell(s), and protein IDs of 1360, 612, 192, and 51 were identified, respectively. The OAD chip-based system was further applied in single mouse oocyte analysis, with 355 protein IDs identified at the single oocyte level, which demonstrated its special advantages of high enrichment of sequence coverage, hydrophobic proteins, and enzymatic digestion efficiency over the traditional in-tube system.
Proteome-Scale Human Interactomics.
Luck, Katja; Sheynkman, Gloria M; Zhang, Ivy; Vidal, Marc
2017-05-01
Cellular functions are mediated by complex interactome networks of physical, biochemical, and functional interactions between DNA sequences, RNA molecules, proteins, lipids, and small metabolites. A thorough understanding of cellular organization requires accurate and relatively complete models of interactome networks at proteome scale. The recent publication of four human protein-protein interaction (PPI) maps represents a technological breakthrough and an unprecedented resource for the scientific community, heralding a new era of proteome-scale human interactomics. Our knowledge gained from these and complementary studies provides fresh insights into the opportunities and challenges when analyzing systematically generated interactome data, defines a clear roadmap towards the generation of a first reference interactome, and reveals new perspectives on the organization of cellular life. Copyright © 2017 Elsevier Ltd. All rights reserved.
Chin, Chiew Foan; Tan, Hooi Sin
2018-05-04
In many tropical countries with agriculture as the mainstay of the economy, tropical crops are commonly cultivated at the plantation scale. The successful establishment of crop plantations depends on the availability of a large quantity of elite seedling plants. Many plantation companies establish plant tissue culture laboratories to supply planting materials for their plantations and one of the most common applications of plant tissue culture is the mass propagation of true-to-type elite seedlings. However, problems encountered in tissue culture technology prevent its applications from being widely adopted. Proteomics can be a powerful tool for use in the analysis of cultures, and to understand the biological processes that take place at the cellular and molecular levels in order to address these problems. This mini review presents the tissue culture technologies commonly used in the propagation of tropical crops. It provides an outline of some of the genes and proteins isolated that are associated with somatic embryogenesis and the use of proteomic technology in analysing tissue culture samples and processes in tropical crops.
NASA Astrophysics Data System (ADS)
The, Matthew; MacCoss, Michael J.; Noble, William S.; Käll, Lukas
2016-11-01
Percolator is a widely used software tool that increases yield in shotgun proteomics experiments and assigns reliable statistical confidence measures, such as q values and posterior error probabilities, to peptides and peptide-spectrum matches (PSMs) from such experiments. Percolator's processing speed has been sufficient for typical data sets consisting of hundreds of thousands of PSMs. With our new scalable approach, we can now also analyze millions of PSMs in a matter of minutes on a commodity computer. Furthermore, with the increasing awareness for the need for reliable statistics on the protein level, we compared several easy-to-understand protein inference methods and implemented the best-performing method—grouping proteins by their corresponding sets of theoretical peptides and then considering only the best-scoring peptide for each protein—in the Percolator package. We used Percolator 3.0 to analyze the data from a recent study of the draft human proteome containing 25 million spectra (PM:24870542). The source code and Ubuntu, Windows, MacOS, and Fedora binary packages are available from http://percolator.ms/ under an Apache 2.0 license.
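The q values mentioned above are commonly derived from target-decoy competition. The sketch below shows the textbook version of that calculation on toy PSM scores. It is an illustration under stated assumptions, not Percolator's actual implementation, and the function name and data are hypothetical.

```python
def target_decoy_qvalues(psms):
    """Estimate q values for target PSMs from a mixed target/decoy list.

    psms: list of (score, is_decoy) pairs, where a higher score is better.
    FDR at each rank is estimated as decoys / targets among PSMs scoring at
    least as well; the q value is the minimum FDR at that rank or any lower rank.
    Returns (score, q_value) pairs for targets, best score first.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    targets = decoys = 0
    fdrs = []
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(min(1.0, decoys / targets) if targets else 1.0)

    # Enforce monotonicity by taking the running minimum from the worst rank up.
    qvalues = []
    running_min = 1.0
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvalues.append(running_min)
    qvalues.reverse()

    return [(score, q) for (score, is_decoy), q in zip(ranked, qvalues) if not is_decoy]


# Toy PSMs (invented): (score, is_decoy)
toy_psms = [(9.1, False), (8.7, False), (7.9, True),
            (7.5, False), (6.2, False), (5.0, True)]
for score, q in target_decoy_qvalues(toy_psms):
    print(score, q)
```

Accepting all targets with q at or below a chosen threshold (for example 0.01) then gives a set of PSMs with an estimated false discovery rate at that level.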
Recent advances in methods for the analysis of protein o-glycosylation at proteome level.
You, Xin; Qin, Hongqiang; Ye, Mingliang
2018-01-01
O-Glycosylation, which refers to the glycosylation of the hydroxyl group of side chains of Serine/Threonine/Tyrosine residues, is one of the most common post-translational modifications. Compared with N-linked glycosylation, O-glycosylation is less explored because of its complex structure and relatively low abundance. Recently, O-glycosylation has drawn more and more attention for its various functions in many sophisticated biological processes. To obtain a deep understanding of O-glycosylation, many efforts have been devoted to develop effective strategies to analyze the two most abundant types of O-glycosylation, i.e. O-N-acetylgalactosamine and O-N-acetylglucosamine glycosylation. In this review, we summarize the proteomics workflows to analyze these two types of O-glycosylation. For the large-scale analysis of mucin-type glycosylation, the glycan simplification strategies including the ''SimpleCell'' technology were introduced. A variety of enrichment methods including lectin affinity chromatography, hydrophilic interaction chromatography, hydrazide chemistry, and chemoenzymatic method were introduced for the proteomics analysis of O-N-acetylgalactosamine and O-N-acetylglucosamine glycosylation. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
The, Matthew; MacCoss, Michael J; Noble, William S; Käll, Lukas
2016-11-01
Percolator is a widely used software tool that increases yield in shotgun proteomics experiments and assigns reliable statistical confidence measures, such as q values and posterior error probabilities, to peptides and peptide-spectrum matches (PSMs) from such experiments. Percolator's processing speed has been sufficient for typical data sets consisting of hundreds of thousands of PSMs. With our new scalable approach, we can now also analyze millions of PSMs in a matter of minutes on a commodity computer. Furthermore, with the increasing awareness for the need for reliable statistics on the protein level, we compared several easy-to-understand protein inference methods and implemented the best-performing method (grouping proteins by their corresponding sets of theoretical peptides and then considering only the best-scoring peptide for each protein) in the Percolator package. We used Percolator 3.0 to analyze the data from a recent study of the draft human proteome containing 25 million spectra (PM:24870542). The source code and Ubuntu, Windows, MacOS, and Fedora binary packages are available from http://percolator.ms/ under an Apache 2.0 license.
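The protein inference rule described in this abstract (group proteins by their sets of theoretical peptides, then score each group by its best-scoring peptide) can be sketched as follows. This is a simplified reading of the abstract, not the code in the Percolator package; the function name, inputs, and the convention that a higher score is better are assumptions.

```python
from collections import defaultdict


def infer_protein_groups(theoretical_peptides, peptide_scores):
    """Group proteins with identical theoretical peptide sets and score each group.

    theoretical_peptides: dict mapping protein id -> set of peptide sequences
        expected from an in-silico digest.
    peptide_scores: dict mapping observed peptide sequence -> score
        (higher is better in this sketch).
    Returns (protein_group, best_peptide_score) pairs, best group first;
    groups with no observed peptide are omitted.
    """
    groups = defaultdict(list)
    for protein, peptides in theoretical_peptides.items():
        groups[frozenset(peptides)].append(protein)

    results = []
    for peptides, proteins in groups.items():
        observed = [peptide_scores[p] for p in peptides if p in peptide_scores]
        if observed:
            results.append((tuple(sorted(proteins)), max(observed)))
    results.sort(key=lambda item: item[1], reverse=True)
    return results


# Toy digest and scores (invented for illustration).
toy_digest = {
    "P1": {"AAK", "CCR"},
    "P2": {"AAK", "CCR"},  # indistinguishable from P1, so grouped with it
    "P3": {"DDK"},
    "P4": {"EEK"},         # no observed peptide, so not reported
}
toy_scores = {"AAK": 2.5, "CCR": 4.0, "DDK": 3.1}
print(infer_protein_groups(toy_digest, toy_scores))
```

Using only the best peptide keeps the protein score from growing with protein length, at the cost of ignoring supporting evidence from additional peptides.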
A large scale Plasmodium vivax- Saimiri boliviensis trophozoite-schizont transition proteome
Lapp, Stacey A.; Barnwell, John W.; Galinski, Mary R.
2017-01-01
Plasmodium vivax is a complex protozoan parasite with over 6,500 genes and stage-specific differential expression. Much of the unique biology of this pathogen remains unknown, including how it modifies and restructures the host reticulocyte. Using a recently published P. vivax reference genome, we report the proteome from two biological replicates of infected Saimiri boliviensis host reticulocytes undergoing transition from the late trophozoite to early schizont stages. Using five database search engines, we identified a total of 2000 P. vivax and 3487 S. boliviensis proteins, making this the most comprehensive P. vivax proteome to date. PlasmoDB GO-term enrichment analysis of proteins identified at least twice by a search engine highlighted core metabolic processes and molecular functions such as glycolysis, translation and protein folding, cell components such as ribosomes, proteasomes and the Golgi apparatus, and a number of vesicle and trafficking related clusters. Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.8 enriched functional annotation clusters of S. boliviensis proteins highlighted vesicle and trafficking-related clusters, elements of the cytoskeleton, oxidative processes and response to oxidative stress, macromolecular complexes such as the proteasome and ribosome, metabolism, translation, and cell death. Host and parasite proteins potentially involved in cell adhesion were also identified. Over 25% of the P. vivax proteins have no functional annotation; this group includes 45 VIR members of the large PIR family. A number of host and pathogen proteins contained highly oxidized or nitrated residues, extending prior trophozoite-enriched stage observations from S. boliviensis infections, and supporting the possibility of oxidative stress in relation to the disease. This proteome significantly expands the size and complexity of the known P. 
vivax and Saimiri host iRBC proteomes, and provides in-depth data that will be valuable for ongoing research on this parasite’s biology and pathogenesis. PMID:28829774
Proteomics Analysis of the Causative Agent of Typhoid Fever
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ansong, Charles; Yoon, Hyunjin; Norbeck, Angela D.
2008-02-01
Typhoid fever is a potentially fatal disease caused by the bacterial pathogen Salmonella enterica serovar Typhi (S. typhi). S. typhi infection is a complex process that involves numerous bacterially-encoded virulence determinants, and these are thought to confer both stringent human host specificity and a high mortality rate. In the present study we used a liquid chromatography-mass spectrometry (LC-MS) based proteomics strategy to investigate the proteome of logarithmic, stationary phase, and low pH/low magnesium (MgM) S. typhi cultures. This represents the first large scale comprehensive characterization of the S. typhi proteome. Our analysis identified a total of 2066 S. typhi proteins. In an effort to identify putative S. typhi-specific virulence factors, we then compared our S. typhi results to those obtained in a previously published study of the S. typhimurium proteome under similar conditions (Adkins J.N. et al (2006) Mol Cell Prot). Comparative proteomic analysis of S. typhi (strain Ty2) and S. typhimurium (strains LT2 and 14028) revealed a subset of highly expressed proteins unique to S. typhi that were exclusively detected under conditions that mimic the infective state in macrophage cells. These proteins included CdtB, HlyE, and a conserved protein encoded by t1476. The differential expression of selected proteins was confirmed by Western blot analysis. Taken together with the current literature, our observations suggest that this subset of proteins may play a role in S. typhi pathogenesis and human host specificity. In addition, we observed products of the biotin (bio) operon displayed a higher abundance in the more virulent strains S. typhi-Ty2 and S. typhimurium-14028 compared to the virulence attenuated S. typhimurium strain LT2, suggesting bio proteins may contribute to Salmonella pathogenesis.
Rescuing discarded spectra: Full comprehensive analysis of a minimal proteome.
Lluch-Senar, Maria; Mancuso, Francesco M; Climente-González, Héctor; Peña-Paz, Marcia I; Sabido, Eduard; Serrano, Luis
2016-02-01
A common problem encountered when performing large-scale MS proteome analysis is the loss of information due to the high percentage of unassigned spectra. To determine the causes behind this loss we have analyzed the proteome of one of the smallest living bacteria that can be grown axenically, Mycoplasma pneumoniae (729 ORFs). The proteome of M. pneumoniae cells, grown in defined media, was analyzed by MS. An initial search with both Mascot and a species-specific NCBInr database with common contaminants (NCBImpn), resulted in around 79% of the acquired spectra not having an assignment. The percentage of non-assigned spectra was reduced to 27% after re-analysis of the data with the PEAKS software, thereby increasing the proteome coverage of M. pneumoniae from the initial 60% to over 76%. Nonetheless, 33,413 spectra with assigned amino acid sequences could not be mapped to any NCBInr database protein sequence. Approximately, 1% of these unassigned peptides corresponded to PTMs and 4% to M. pneumoniae protein variants (deamidation and translation inaccuracies). The most abundant peptide sequence variants (Phe-Tyr and Ala-Ser) could be explained by alterations in the editing capacity of the corresponding tRNA synthases. About another 1% of the peptides not associated to any protein had repetitions of the same aromatic/hydrophobic amino acid at the N-terminus, or had Arg/Lys at the C-terminus. Thus, in a model system, we have maximized the number of assigned spectra to 73% (51,453 out of the 70,040 initial acquired spectra). All MS data have been deposited in the ProteomeXchange with identifier PXD002779 (http://proteomecentral.proteomexchange.org/dataset/PXD002779). © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Barkla, Bronwyn J; Vera-Estrella, Rosario; Raymond, Carolyn
2016-05-10
Epidermal bladder cells (EBC) are large single-celled, specialized, and modified trichomes found on the aerial parts of the halophyte Mesembryanthemum crystallinum. Recent development of a simple but high throughput technique to extract the contents from these cells has provided an opportunity to conduct detailed single-cell-type analyses of their molecular characteristics at high resolution to gain insight into the role of these cells in the salt tolerance of the plant. In this study, we carry out large-scale complementary quantitative proteomic studies using both a label (DIGE) and label-free (GeLC-MS) approach to identify salt-responsive proteins in the EBC extract. Additionally we perform an ionomics analysis (ICP-MS) to follow changes in the amounts of 27 different elements. Using these methods, we were able to identify 54 proteins and nine elements that showed statistically significant changes in the EBC from salt-treated plants. GO enrichment analysis identified a large number of transport proteins but also proteins involved in photosynthesis, primary metabolism and Crassulacean acid metabolism (CAM). Validation of results by western blot, confocal microscopy and enzyme analysis helped to strengthen findings and further our understanding into the role of these specialized cells. As expected EBC accumulated large quantities of sodium, however, the most abundant element was chloride suggesting the sequestration of this ion into the EBC vacuole is just as important for salt tolerance. This single-cell type omics approach shows that epidermal bladder cells of M. crystallinum are metabolically active modified trichomes, with primary metabolism supporting cell growth, ion accumulation, compatible solute synthesis and CAM. Data are available via ProteomeXchange with identifier PXD004045.
Gu, Xun; Wang, Yufeng; Gu, Jianying
2002-06-01
The classical (two-round) hypothesis of vertebrate genome duplication proposes two successive whole-genome duplication(s) (polyploidizations) predating the origin of fishes, a view now being seriously challenged. As the debate largely concerns the relative merits of the 'big-bang mode' theory (large-scale duplication) and the 'continuous mode' theory (constant creation by small-scale duplications), we tested whether a significant proportion of paralogous genes in the contemporary human genome was indeed generated in the early stage of vertebrate evolution. After an extensive search of major databases, we dated 1,739 gene duplication events from the phylogenetic analysis of 749 vertebrate gene families. We found a pattern characterized by two waves (I, II) and an ancient component. Wave I represents a recent gene family expansion by tandem or segmental duplications, whereas wave II, a rapid paralogous gene increase in the early stage of vertebrate evolution, supports the idea of genome duplication(s) (the big-bang mode). Further analysis indicated that large- and small-scale gene duplications both make a significant contribution during the early stage of vertebrate evolution to build the current hierarchy of the human proteome.
A Matter of Time: Faster Percolator Analysis via Efficient SVM Learning for Large-Scale Proteomics.
Halloran, John T; Rocke, David M
2018-05-04
Percolator is an important tool for greatly improving the results of a database search and subsequent downstream analysis. Using support vector machines (SVMs), Percolator recalibrates peptide-spectrum matches based on the learned decision boundary between targets and decoys. To improve analysis time for large-scale data sets, we update Percolator's SVM learning engine through software and algorithmic optimizations rather than heuristic approaches that necessitate the careful study of their impact on learned parameters across different search settings and data sets. We show that by optimizing Percolator's original learning algorithm, L2-SVM-MFN, large-scale SVM learning requires only about a third of the original runtime. Furthermore, we show that by employing the widely used Trust Region Newton (TRON) algorithm instead of L2-SVM-MFN, large-scale Percolator SVM learning is reduced to only about a fifth of the original runtime. Importantly, these speedups only affect the speed at which Percolator converges to a global solution and do not alter recalibration performance. The upgraded versions of both L2-SVM-MFN and TRON are optimized within the Percolator codebase for multithreaded and single-thread use and are available under Apache license at bitbucket.org/jthalloran/percolator_upgrade.
Elam, W Austin; Schrank, Travis P; Campagnolo, Andrew J; Hilser, Vincent J
2013-04-01
Intrinsically disordered (ID) proteins function in the absence of a unique stable structure and appear to challenge the classic structure-function paradigm. The extent to which ID proteins take advantage of subtle conformational biases to perform functions, and whether signals for such mechanisms can be identified in proteome-wide studies, are not well understood. Of particular interest is the polyproline II (PII) conformation, suggested to be highly populated in unfolded proteins. We experimentally determine a complete calorimetric propensity scale for the PII conformation. Projection of the scale into representative eukaryotic proteomes reveals significant PII bias in regions coding for ID proteins. Importantly, enrichment of PII in ID proteins, or protein segments, is also captured by other PII scales, indicating that this enrichment is robustly encoded and universally detectable regardless of the method of PII propensity determination. Gene ontology (GO) terms obtained using our PII scale and other scales demonstrate a consensus for molecular functions performed by high PII proteins across the proteome. Perhaps the most striking result of the GO analysis is conserved enrichment (P < 10^-8) of phosphorylation sites in high PII regions found by all PII scales. Subsequent conformational analysis reveals a phosphorylation-dependent modulation of PII, suggestive of a conserved "tunability" within these regions. In summary, the application of an experimentally determined polyproline II (PII) propensity scale to proteome-wide sequence analysis and gene ontology reveals an enrichment of PII bias near disordered phosphorylation sites that is conserved throughout eukaryotes. Copyright © 2013 The Protein Society.
Lee, Kenneth K; Sardiu, Mihaela E; Swanson, Selene K; Gilmore, Joshua M; Torok, Michael; Grant, Patrick A; Florens, Laurence; Workman, Jerry L; Washburn, Michael P
2011-07-05
Despite the availability of several large-scale proteomics studies aiming to identify protein interactions on a global scale, little is known about how proteins interact and are organized within macromolecular complexes. Here, we describe a technique that consists of a combination of biochemistry approaches, quantitative proteomics and computational methods using wild-type and deletion strains to investigate the organization of proteins within macromolecular protein complexes. We applied this technique to determine the organization of two well-studied complexes, Spt-Ada-Gcn5 histone acetyltransferase (SAGA) and ADA, for which no comprehensive high-resolution structures exist. This approach revealed that SAGA/ADA is composed of five distinct functional modules, which can persist separately. Furthermore, we identified a novel subunit of the ADA complex, termed Ahc2, and characterized Sgf29 as an ADA family protein present in all Gcn5 histone acetyltransferase complexes. Finally, we propose a model for the architecture of the SAGA and ADA complexes, which predicts novel functional associations within the SAGA complex and provides mechanistic insights into phenotypical observations in SAGA mutants.
Lee, Kenneth K; Sardiu, Mihaela E; Swanson, Selene K; Gilmore, Joshua M; Torok, Michael; Grant, Patrick A; Florens, Laurence; Workman, Jerry L; Washburn, Michael P
2011-01-01
Despite the availability of several large-scale proteomics studies aiming to identify protein interactions on a global scale, little is known about how proteins interact and are organized within macromolecular complexes. Here, we describe a technique that consists of a combination of biochemistry approaches, quantitative proteomics and computational methods using wild-type and deletion strains to investigate the organization of proteins within macromolecular protein complexes. We applied this technique to determine the organization of two well-studied complexes, Spt–Ada–Gcn5 histone acetyltransferase (SAGA) and ADA, for which no comprehensive high-resolution structures exist. This approach revealed that SAGA/ADA is composed of five distinct functional modules, which can persist separately. Furthermore, we identified a novel subunit of the ADA complex, termed Ahc2, and characterized Sgf29 as an ADA family protein present in all Gcn5 histone acetyltransferase complexes. Finally, we propose a model for the architecture of the SAGA and ADA complexes, which predicts novel functional associations within the SAGA complex and provides mechanistic insights into phenotypical observations in SAGA mutants. PMID:21734642
Advances in Proteomics Data Analysis and Display Using an Accurate Mass and Time Tag Approach
Zimmer, Jennifer S.D.; Monroe, Matthew E.; Qian, Wei-Jun; Smith, Richard D.
2007-01-01
Proteomics has recently demonstrated utility in understanding cellular processes on the molecular level as a component of systems biology approaches and for identifying potential biomarkers of various disease states. The large amount of data generated by utilizing high efficiency (e.g., chromatographic) separations coupled to high mass accuracy mass spectrometry for high-throughput proteomics analyses presents challenges related to data processing, analysis, and display. This review focuses on recent advances in nanoLC-FTICR-MS-based proteomics approaches and the accompanying data processing tools that have been developed to display and interpret the large volumes of data being produced. PMID:16429408
Xu, Shou-Ling; Chalkley, Robert J; Maynard, Jason C; Wang, Wenfei; Ni, Weimin; Jiang, Xiaoyue; Shin, Kihye; Cheng, Ling; Savage, Dasha; Hühmer, Andreas F R; Burlingame, Alma L; Wang, Zhi-Yong
2017-02-21
Genetic studies have shown essential functions of O-linked N-acetylglucosamine (O-GlcNAc) modification in plants. However, the proteins and sites subject to this posttranslational modification are largely unknown. Here, we report a large-scale proteomic identification of O-GlcNAc-modified proteins and sites in the model plant Arabidopsis thaliana. Using lectin weak affinity chromatography to enrich modified peptides, followed by mass spectrometry, we identified 971 O-GlcNAc-modified peptides belonging to 262 proteins. The modified proteins are involved in cellular regulatory processes, including transcription, translation, epigenetic gene regulation, and signal transduction. Many proteins have functions in developmental and physiological processes specific to plants, such as hormone responses and flower development. Mass spectrometric analysis of phosphopeptides from the same samples showed that a large number of peptides could be modified by either O-GlcNAcylation or phosphorylation, but cooccurrence of the two modifications in the same peptide molecule was rare. Our study generates a snapshot of the O-GlcNAc modification landscape in plants, indicating functions in many cellular regulation pathways and providing a powerful resource for further dissecting these functions at the molecular level.
Turetschek, Reinhard; Lyon, David; Desalegn, Getinet; Kaul, Hans-Peter; Wienkoop, Stefanie
2016-01-01
The proteomic study of non-model organisms, such as many crop plants, is challenging due to the lack of comprehensive genome information. Changing environmental conditions require the study and selection of adapted cultivars. Mutations, inherent to cultivars, hamper protein identification and thus considerably complicate the qualitative and quantitative comparison in large-scale systems biology approaches. With this workflow, cultivar-specific mutations are detected from high-throughput comparative MS analyses by extracting sequence polymorphisms with de novo sequencing. Stringent criteria are suggested to filter for high-confidence mutations. Subsequently, these polymorphisms complement the initially used database, which is ready to use with any preferred database search algorithm. In our example, we thereby identified 26 specific mutations in two cultivars of Pisum sativum and achieved an increased number (17%) of peptide spectrum matches.
Zhang, Tong; Meng, Li; Kong, Wenwen; Yin, Zepeng; Wang, Yang; Schneider, Jacqueline D; Chen, Sixue
2018-03-20
Jasmonate ZIM-domain (JAZ) proteins are key transcriptional repressors regulating various biological processes. Although many studies have examined JAZ proteins by genetic and biochemical analyses, little is known about JAZ7-associated global protein networks and how JAZ7 contributes to bacterial pathogen defense. In this study, we aim to fill this knowledge gap by conducting unbiased large-scale quantitative proteomics using tandem mass tags (TMT). We compared the proteomes of a JAZ7 knock-out line, a JAZ7 overexpression line, and wild-type Arabidopsis plants in the presence and absence of Pseudomonas syringae DC3000 infection. Both pairwise comparison and multi-factor analysis of variance reveal that differential proteins are enriched in biological processes such as primary and secondary metabolism, redox regulation, and response to stress. The differential regulation in these pathways may account for the alterations in plant size, redox homeostasis and accumulation of glucosinolates. In addition, possible interplay between genotype and environment is suggested, as the abundance of seven proteins is influenced by the interaction of the two factors. Collectively, we demonstrate a role of JAZ7 in pathogen defense and provide a list of proteins that are uniquely responsive to genetic disruption, pathogen infection, or the interaction between genotypes and environmental factors. We report proteomic changes as a result of genetic perturbation of JAZ7, and the contribution of JAZ7 to plant immunity. Specifically, the similarity between the proteomes of a JAZ7 knockout mutant and the wild-type plants confirmed the functional redundancy of JAZs. In contrast, JAZ7 overexpression plants were much different, and proteomic analysis of the JAZ7 overexpression plants under Pst DC3000 infection revealed that JAZ7 may regulate plant immunity via ROS modulation, energy balance and glucosinolate biosynthesis.
Multiple variate analysis for this two-factor proteomics experiment suggests that protein abundance is determined by genotype, environment and the interaction between them. Copyright © 2018 Elsevier B.V. All rights reserved.
Thiele, Herbert; Glandorf, Jörg; Hufnagel, Peter
2010-05-27
With the large variety of Proteomics workflows, as well as the large variety of instruments and data-analysis software available, researchers today face major challenges validating and comparing their Proteomics data. Here we present a new generation of the ProteinScape bioinformatics platform, now enabling researchers to manage Proteomics data from the generation and data warehousing to a central data repository with a strong focus on the improved accuracy, reproducibility and comparability demanded by many researchers in the field. It addresses scientists' current needs in proteomics identification, quantification and validation. But producing large protein lists is not the end point in Proteomics, where one ultimately aims to answer specific questions about the biological condition or disease model of the analyzed sample. In this context, a new tool has been developed at the Spanish Centro Nacional de Biotecnologia Proteomics Facility termed PIKE (Protein information and Knowledge Extractor) that allows researchers to control, filter and access specific information from genomics and proteomic databases, to understand the role and relationships of the proteins identified in the experiments. Additionally, an EU funded project, ProDac, has coordinated systematic data collection in public standards-compliant repositories like PRIDE. This will cover all aspects from generating MS data in the laboratory, assembling the whole annotation information and storing it together with identifications in a standardised format.
Proteomics research in India: an update.
Reddy, Panga Jaipal; Atak, Apurva; Ghantasala, Saicharan; Kumar, Saurabh; Gupta, Shabarni; Prasad, T S Keshava; Zingde, Surekha M; Srivastava, Sanjeeva
2015-09-08
After a successful completion of the Human Genome Project, deciphering the mystery surrounding the human proteome posed a major challenge. Despite not being largely involved in the Human Genome Project, the Indian scientific community contributed towards proteomic research along with the global community. Currently, more than 76 research/academic institutes and nearly 145 research labs are involved in core proteomic research across India. The Indian researchers have been major contributors in drafting the "human proteome map" along with international efforts. In addition to this, virtual proteomics labs, proteomics courses and remote triggered proteomics labs have helped to overcome the limitations of proteomics education posed due to expensive lab infrastructure. The establishment of Proteomics Society, India (PSI) has created a platform for the Indian proteomic researchers to share ideas, research collaborations and conduct annual conferences and workshops. Indian proteomic research is really moving forward with the global proteomics community in a quest to solve the mysteries of proteomics. A draft map of the human proteome enhances the enthusiasm among intellectuals to promote proteomic research in India to the world. This article is part of a Special Issue entitled: Proteomics in India. Copyright © 2015 Elsevier B.V. All rights reserved.
A pursuit of lineage-specific and niche-specific proteome features in the world of archaea
2012-01-01
Background: Archaea evoke interest among researchers for two enigmatic characteristics: a combination of bacterial and eukaryotic components in their molecular architectures and an enormous diversity in their life-style and metabolic capabilities. Despite considerable research efforts, lineage-specific/niche-specific molecular features of the whole archaeal world are yet to be fully unveiled. The study offers the first large-scale in silico proteome analysis of all archaeal species of known genome sequences with a special emphasis on methanogenic and sulphur-metabolising archaea. Results: Overall amino acid usage in archaea is dominated by GC-bias. However, environmental factors such as oxygen requirement or thermal adaptation seem to play important roles in selection of residues with no GC-bias at the codon level. All methanogens, irrespective of their thermal/salt adaptation, show higher usage of Cys and have relatively acidic proteomes, while the proteomes of sulphur-metabolisers have higher aromaticity and more positive charges. Despite exhibiting a thermophilic life-style, korarchaeota possesses an acidic proteome. Among the distinct trends prevailing in COGs (Cluster of Orthologous Groups of proteins) distribution profiles, crenarchaeal organisms display higher intra-order variations in COGs repertoire, especially in the metabolic ones, as compared to euryarchaea. All methanogens are characterised by the presence of 22 exclusive COGs. Conclusions: Divergences in amino acid usage, aromaticity/charge profiles and COG repertoire among methanogens and sulphur-metabolisers, aerobic and anaerobic archaea or korarchaeota and nanoarchaeota, as elucidated in the present study, point towards the presence of distinct molecular strategies for niche specialization in the archaeal world. PMID:22691113
A pursuit of lineage-specific and niche-specific proteome features in the world of archaea.
Roy Chowdhury, Anindya; Dutta, Chitra
2012-06-12
Archaea evoke interest among researchers for two enigmatic characteristics: a combination of bacterial and eukaryotic components in their molecular architectures and an enormous diversity in their life-style and metabolic capabilities. Despite considerable research efforts, lineage-specific/niche-specific molecular features of the whole archaeal world are yet to be fully unveiled. The study offers the first large-scale in silico proteome analysis of all archaeal species of known genome sequences with a special emphasis on methanogenic and sulphur-metabolising archaea. Overall amino acid usage in archaea is dominated by GC-bias. However, environmental factors such as oxygen requirement or thermal adaptation seem to play important roles in selection of residues with no GC-bias at the codon level. All methanogens, irrespective of their thermal/salt adaptation, show higher usage of Cys and have relatively acidic proteomes, while the proteomes of sulphur-metabolisers have higher aromaticity and more positive charges. Despite exhibiting a thermophilic life-style, korarchaeota possesses an acidic proteome. Among the distinct trends prevailing in COGs (Cluster of Orthologous Groups of proteins) distribution profiles, crenarchaeal organisms display higher intra-order variations in COGs repertoire, especially in the metabolic ones, as compared to euryarchaea. All methanogens are characterised by the presence of 22 exclusive COGs. Divergences in amino acid usage, aromaticity/charge profiles and COG repertoire among methanogens and sulphur-metabolisers, aerobic and anaerobic archaea or korarchaeota and nanoarchaeota, as elucidated in the present study, point towards the presence of distinct molecular strategies for niche specialization in the archaeal world.
2010-01-01
snapshot of SM-induced toxicity. Over the past few years, innovations in systems biology and biotechnology have led to important advances in our under...perturbations. SILAC has been used to study tumor metastasis (3, 4), focal adhesion-associated proteins, growth factor signaling, and insulin regulation (5...stained with colloidal Coomassie blue. After it was destained, the gel lane was excised into six regions, and each region was cut into 1 mm cubes
Raiszadeh, Michelle M.; Ross, Mark M.; Russo, Paul S.; Schaepper, Mary Ann H.; Zhou, Weidong; Deng, Jianghong; Ng, Daniel; Dickson, April; Dickson, Cindy; Strom, Monica; Osorio, Carolina; Soeprono, Thomas; Wulfkuhle, Julia D.; Kabbani, Nadine; Petricoin, Emanuel F.; Liotta, Lance A.; Kirsch, Wolff M.
2012-01-01
Liquid chromatography tandem mass spectrometry (LC-MS/MS) and multiple reaction monitoring mass spectrometry (MRM-MS) proteomics analyses were performed on eccrine sweat of healthy controls, and the results were compared with those from individuals diagnosed with schizophrenia (SZ). This is the first large scale study of the sweat proteome. First, we performed LC-MS/MS on pooled SZ samples and pooled control samples for global proteomics analysis. Results revealed a high abundance of diverse proteins and peptides in eccrine sweat. Most of the proteins identified from sweat samples were found to be different than the most abundant proteins from serum, which indicates that eccrine sweat is not simply a plasma transudate, and may thereby be a source of unique disease-associated biomolecules. A second independent set of patient and control sweat samples were analyzed by LC-MS/MS and spectral counting to determine qualitative protein differential abundances between the control and disease groups. Differential abundances of selected proteins, initially determined by spectral counting, were verified by MRM-MS analyses. Seventeen proteins showed a differential abundance of approximately two-fold or greater between the SZ pooled sample and the control pooled sample. This study demonstrates the utility of LC-MS/MS and MRM-MS as a viable strategy for the discovery and verification of potential sweat protein disease biomarkers. PMID:22256890
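The two-fold screening step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the protein names, the counts, and the pseudocount of 1 (used only to avoid division by zero) are all hypothetical choices made for illustration.

```python
def fold_change(case_count, control_count, pseudocount=1.0):
    """Ratio of spectral counts, case over control.

    The pseudocount is an assumption of this sketch; it keeps the ratio
    finite when a protein has zero spectra in one group.
    """
    return (case_count + pseudocount) / (control_count + pseudocount)


def differential_proteins(counts, min_fold=2.0):
    """Return {protein: fold change} for proteins changing at least
    min_fold in either direction.

    counts maps a protein name to (case_count, control_count).
    """
    hits = {}
    for protein, (case, control) in counts.items():
        fc = fold_change(case, control)
        if fc >= min_fold or fc <= 1.0 / min_fold:
            hits[protein] = fc
    return hits


# Hypothetical spectral counts: (disease pool, control pool).
example_counts = {
    "protein_A": (9, 4),   # (9 + 1) / (4 + 1) = 2.0  -> kept
    "protein_B": (5, 5),   # 1.0                      -> dropped
    "protein_C": (1, 7),   # (1 + 1) / (7 + 1) = 0.25 -> kept
}
hits = differential_proteins(example_counts)
```

Candidates passing such a screen would still need targeted verification, which the abstract reports was done by MRM-MS.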
Computational clustering for viral reference proteomes
Chen, Chuming; Huang, Hongzhan; Mazumder, Raja; Natale, Darren A.; McGarvey, Peter B.; Zhang, Jian; Polson, Shawn W.; Wang, Yuqi; Wu, Cathy H.
2016-01-01
Motivation: The enormous number of redundant sequenced genomes has hindered efforts to analyze and functionally annotate proteins. As the taxonomy of viruses is not uniformly defined, viral proteomes pose special challenges in this regard. Grouping viruses based on the similarity of their proteins at proteome scale can normalize against potential taxonomic nomenclature anomalies. Results: We present Viral Reference Proteomes (Viral RPs), which are computed from complete virus proteomes within UniProtKB. Viral RPs based on 95, 75, 55, 35 and 15% co-membership in proteome similarity based clusters are provided. Comparison of our computational Viral RPs with UniProt’s curator-selected Reference Proteomes indicates that the two sets are consistent and complementary. Furthermore, each Viral RP represents a cluster of virus proteomes that was consistent with virus or host taxonomy. We provide BLASTP search and FTP download of Viral RP protein sequences, and a browser to facilitate the visualization of Viral RPs. Availability and implementation: http://proteininformationresource.org/rps/viruses/ Contact: chenc@udel.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153712
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fang, Yilin; Wilkins, Michael J.; Yabusaki, Steven B.
2012-12-12
Biomass and shotgun global proteomics data that reflected relative protein abundances from samples collected during the 2008 experiment at the U.S. Department of Energy Integrated Field-Scale Subsurface Research Challenge site in Rifle, Colorado, provided an unprecedented opportunity to validate a genome-scale metabolic model of Geobacter metallireducens and assess its performance with respect to prediction of metal reduction, biomass yield, and growth rate under dynamic field conditions. Reconstructed from annotated genomic sequence, biochemical, and physiological data, the constraint-based in silico model of G. metallireducens relates an annotated genome sequence to the physiological functions with 697 reactions controlled by 747 enzyme-coding genes. Proteomic analysis showed that 180 of the 637 G. metallireducens proteins detected during the 2008 experiment were associated with specific metabolic reactions in the in silico model. When the field-calibrated Fe(III) terminal electron acceptor process reaction in a reactive transport model for the field experiments was replaced with the genome-scale model, the model predicted that the largest metabolic fluxes through the in silico model reactions generally correspond to the highest abundances of proteins that catalyze those reactions. Central metabolism predicted by the model agrees well with protein abundance profiles inferred from proteomic analysis. Model discrepancies with the proteomic data, such as the relatively low fluxes through amino acid transport and metabolism, revealed pathways or flux constraints in the in silico model that could be updated to more accurately predict metabolic processes that occur in the subsurface environment.
Richard, François D; Kajava, Andrey V
2014-06-01
The dramatic growth of sequencing data evokes an urgent need to improve bioinformatics tools for large-scale proteome analysis. Over the last two decades, the foremost efforts of computer scientists were devoted to proteins with aperiodic sequences having globular 3D structures. However, a large portion of proteins contain periodic sequences representing arrays of repeats that are directly adjacent to each other (so called tandem repeats or TRs). These proteins frequently fold into elongated fibrous structures carrying different fundamental functions. Algorithms specific to the analysis of these regions are urgently required since the conventional approaches developed for globular domains have had limited success when applied to the TR regions. The protein TRs are frequently not perfect, containing a number of mutations, and some of them cannot be easily identified. To detect such "hidden" repeats several algorithms have been developed. However, the most sensitive among them are time-consuming and, therefore, inappropriate for large scale proteome analysis. To speed up the TR detection we developed a rapid filter that is based on the comparison of composition and order of short strings in the adjacent sequence motifs. Tests show that our filter discards up to 22.5% of proteins which are known to be without TRs while keeping almost all (99.2%) TR-containing sequences. Thus, we are able to decrease the size of the initial sequence dataset enriching it with TR-containing proteins which allows a faster subsequent TR detection by other methods. The program is available upon request. Copyright © 2014 Elsevier Inc. All rights reserved.
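The idea of a fast pre-filter that compares short strings in adjacent sequence segments can be sketched as follows. This is only an illustration under stated assumptions, not the authors' program (which the abstract says is available on request): the window length, the k-mer size, the overlap threshold, and all function names are hypothetical.

```python
def kmers(segment, k):
    """All overlapping substrings of length k, in order of occurrence."""
    return [segment[i:i + k] for i in range(len(segment) - k + 1)]


def adjacent_window_similarity(seq, window, k=2):
    """Best fraction of shared k-mers between any window of the sequence
    and the window that directly follows it.

    Because each k-mer is an ordered string, sharing k-mers reflects both
    the composition and the local residue order of the two segments.
    """
    best = 0.0
    for start in range(len(seq) - 2 * window + 1):
        left = set(kmers(seq[start:start + window], k))
        right = set(kmers(seq[start + window:start + 2 * window], k))
        if left and right:
            shared = len(left & right) / min(len(left), len(right))
            best = max(best, shared)
    return best


def passes_filter(seq, window=6, k=2, threshold=0.8):
    """Keep a sequence for slower, more sensitive repeat detection only if
    some pair of adjacent windows looks sufficiently similar."""
    return adjacent_window_similarity(seq, window, k) >= threshold


repeat_like = "PGAPGSPGAPGS"   # two adjacent copies of a 6-residue motif
aperiodic = "ACDEFGHIKLMN"     # no residue occurs twice
```

In this toy case `passes_filter(repeat_like)` keeps the sequence and `passes_filter(aperiodic)` discards it. A real filter would have to scan a range of window lengths and tolerate mutated repeats, which this sketch does not attempt.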
pyQms enables universal and accurate quantification of mass spectrometry data.
Leufken, Johannes; Niehues, Anna; Sarin, L Peter; Wessel, Florian; Hippler, Michael; Leidel, Sebastian A; Fufezan, Christian
2017-10-01
Quantitative mass spectrometry (MS) is a key technique in many research areas (1), including proteomics, metabolomics, glycomics, and lipidomics. Because all of the corresponding molecules can be described by chemical formulas, universal quantification tools are highly desirable. Here, we present pyQms, an open-source software for accurate quantification of all types of molecules measurable by MS. pyQms uses isotope pattern matching that offers an accurate quality assessment of all quantifications and the ability to directly incorporate mass spectrometer accuracy. pyQms is, due to its universal design, applicable to every research field, labeling strategy, and acquisition technique. This opens ultimate flexibility for researchers to design experiments employing innovative and hitherto unexplored labeling strategies. Importantly, pyQms performs very well to accurately quantify partially labeled proteomes in large scale and high throughput, the most challenging task for a quantification algorithm. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
Farhoud, Murtada H; Wessels, Hans J C T; Wevers, Ron A; van Engelen, Baziel G; van den Heuvel, Lambert P; Smeitink, Jan A
2005-01-01
In 2D-based comparative proteomics of scarce samples, such as limited patient material, established methods for prefractionation and subsequent use of different narrow range IPG strips to increase overall resolution are difficult to apply. Also, a high number of samples, a prerequisite for drawing meaningful conclusions when pathological and control samples are considered, will increase the associated amount of work almost exponentially. Here, we introduce a novel, effective, and economic method designed to obtain maximum 2D resolution while maintaining the high throughput necessary to perform large-scale comparative proteomics studies. The method is based on connecting different IPG strips serially head-to-tail so that a complete line of different IPG strips with sequential pH regions can be focused in the same experiment. We show that when 3 IPG strips (covering together the pH range of 3-11) are connected head-to-tail an optimal resolution is achieved along the whole pH range. Sample consumption, time required, and associated costs are reduced by almost 70%, and the workload is reduced significantly.
2017-01-01
The changes of protein expression that are monitored in proteomic experiments are a type of biological transformation that also involves changes in chemical composition. Accompanying the myriad molecular-level interactions that underlie any proteomic transformation, there is an overall thermodynamic potential that is sensitive to microenvironmental conditions, including local oxidation and hydration potential. Here, up- and down-expressed proteins identified in 71 comparative proteomics studies were analyzed using the average oxidation state of carbon (ZC) and water demand per residue ($\overline{n}_{\mathrm{H_2O}}$), calculated using elemental abundances and stoichiometric reactions to form proteins from basis species. Experimental lowering of oxygen availability (hypoxia) or water activity (hyperosmotic stress) generally results in decreased ZC or $\overline{n}_{\mathrm{H_2O}}$ of up-expressed compared to down-expressed proteins. This correspondence of chemical composition with experimental conditions provides evidence for attraction of the proteomes to a low-energy state. An opposite compositional change, toward higher average oxidation or hydration state, is found for proteomic transformations in colorectal and pancreatic cancer, and in two experiments for adipose-derived stem cells. 
Calculations of chemical affinity were used to estimate the thermodynamic potentials for proteomic transformations as a function of fugacity of O2 and activity of H2O, which serve as scales of oxidation and hydration potential. Diagrams summarizing the relative potential for formation of up- and down-expressed proteins have predicted equipotential lines that cluster around particular values of oxygen fugacity and water activity for similar datasets. The changes in chemical composition of proteomes are likely linked with reactions among other cellular molecules. A redox balance calculation indicates that an increase in the lipid to protein ratio in cancer cells by 20% over hypoxic cells would generate a large enough electron sink for oxidation of the cancer proteomes. The datasets and computer code used here are made available in a new R package, canprot. PMID:28603672
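To make the average oxidation state of carbon concrete, the sketch below computes ZC from an elemental formula. It assumes the usual formal oxidation numbers (H +1, N -3, O -2, S -2), which give ZC = (-h + 3n + 2o + 2s + charge) / c for a species with formula C_c H_h N_n O_o S_s. The function name is hypothetical and this is not code from the canprot package, whose implementation may differ.

```python
def average_carbon_oxidation_state(c, h, n, o, s, charge=0):
    """Average oxidation state of carbon for the formula C_c H_h N_n O_o S_s.

    Assumes formal oxidation numbers H = +1, N = -3, O = -2, S = -2, so the
    carbon atoms must together balance: c * ZC + h - 3n - 2o - 2s = charge.
    """
    if c <= 0:
        raise ValueError("the formula must contain at least one carbon atom")
    return (-h + 3 * n + 2 * o + 2 * s + charge) / c


# Free amino acids as simple worked examples (neutral molecules).
glycine = dict(c=2, h=5, n=1, o=2, s=0)   # C2H5NO2 -> (-5 + 3 + 4) / 2 = 1.0
alanine = dict(c=3, h=7, n=1, o=2, s=0)   # C3H7NO2 -> (-7 + 3 + 4) / 3 = 0.0

zc_glycine = average_carbon_oxidation_state(**glycine)
zc_alanine = average_carbon_oxidation_state(**alanine)
```

The limiting cases behave as expected: methane (CH4) gives -4 and carbon dioxide (CO2) gives +4, so a lower ZC indicates more reduced carbon. For a protein, the same formula would be applied to the summed elemental composition of its residues.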
2011-02-01
Thrombocytopenia, Grade 3 in 1 patient • Hypomagnesemia, Grade 3 in 1 patient • Hypokalemia, Grade 3 in 2 patients • Pneumonia, Grade 3 in 7 patients...urgently needed. While the molecular events involved in lung cancer pathogenesis are being unraveled by ongoing large scale genomics, proteomics, and...tumor initiation, progression and metastasis are an important first step leading to the development of new prognostic markers and targets for therapy
Thermosensitivity of growth is determined by chaperone-mediated proteome reallocation
Chen, Ke; Gao, Ye; Mih, Nathan; O’Brien, Edward J.; Yang, Laurence; Palsson, Bernhard O.
2017-01-01
Maintenance of a properly folded proteome is critical for bacterial survival at notably different growth temperatures. Understanding the molecular basis of thermoadaptation has progressed in two main directions, the sequence and structural basis of protein thermostability and the mechanistic principles of protein quality control assisted by chaperones. Yet we do not fully understand how structural integrity of the entire proteome is maintained under stress and how it affects cellular fitness. To address this challenge, we reconstruct a genome-scale protein-folding network for Escherichia coli and formulate a computational model, FoldME, that provides statistical descriptions of multiscale cellular response consistent with many datasets. FoldME simulations show (i) that the chaperones act as a system when they respond to unfolding stress rather than achieving efficient folding of any single component of the proteome, (ii) how the proteome is globally balanced between chaperones for folding and the complex machinery synthesizing the proteins in response to perturbation, (iii) how this balancing determines growth rate dependence on temperature and is achieved through nonspecific regulation, and (iv) how thermal instability of the individual protein affects the overall functional state of the proteome. Overall, these results expand our view of cellular regulation, from targeted specific control mechanisms to global regulation through a web of nonspecific competing interactions that modulate the optimal reallocation of cellular resources. The methodology developed in this study enables genome-scale integration of environment-dependent protein properties and a proteome-wide study of cellular stress responses. PMID:29073085
Trentmann, Oliver; Haferkamp, Ilka
2013-01-01
Vacuoles of plants fulfill various biologically important functions, like turgor generation and maintenance, detoxification, solute sequestration, or protein storage. Different types of plant vacuoles (lytic versus protein storage) are characterized by different functional properties apparently caused by a different composition/abundance and regulation of transport proteins in the surrounding membrane, the tonoplast. Proteome analyses allow the identification of vacuolar proteins and provide an informative basis for assigning observed transport processes to specific carriers or channels. This review summarizes techniques required for vacuolar proteome analyses, such as isolation of the large central vacuole or tonoplast membrane purification. Moreover, an overview of diverse published vacuolar proteome studies is provided. It becomes evident that qualitative proteomes from different plant species represent just the tip of the iceberg. During the past few years, mass spectrometry achieved immense improvement concerning its accuracy, sensitivity, and application. As a consequence, modern tonoplast proteome approaches are suited for detecting alterations in membrane protein abundance in response to changing environmental/physiological conditions and help to clarify the regulation of tonoplast transport processes. PMID:23459586
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yotsui, Izumi, E-mail: izumi.yotsui@riken.jp; Serada, Satoshi, E-mail: serada@nibiohn.go.jp; Naka, Tetsuji, E-mail: tnaka@nibiohn.go.jp
2016-03-18
Desiccation tolerance is an ancestral feature of land plants and is still retained in non-vascular plants such as bryophytes and some vascular plants. However, except for seeds and spores, this trait is absent in vegetative tissues of vascular plants. Although many studies have focused on understanding the molecular basis underlying desiccation tolerance using transcriptome and proteome approaches, the critical molecular differences between desiccation tolerant plants and non-desiccation plants are still not clear. The moss Physcomitrella patens cannot survive rapid desiccation under laboratory conditions, but if cells of the protonemata are treated by the phytohormone abscisic acid (ABA) prior to desiccation, it can survive 24 h exposure to desiccation and regrow after rehydration. The desiccation tolerance induced by ABA (AiDT) is specific to this hormone, but also depends on a plant transcription factor ABSCISIC ACID INSENSITIVE3 (ABI3). Here we report the comparative proteomic analysis of AiDT between wild type and ABI3 deleted mutant (Δabi3) of P. patens using iTRAQ (Isobaric Tags for Relative and Absolute Quantification). From a total of 1980 unique proteins that we identified, only 16 proteins are significantly altered in Δabi3 compared to wild type after desiccation following ABA treatment. Among this group, three of the four proteins that were severely affected in Δabi3 tissue were Arabidopsis orthologous genes, which were expressed in maturing seeds under the regulation of ABI3. These included a Group 1 late embryogenesis abundant (LEA) protein, a short-chain dehydrogenase, and a desiccation-related protein. Our results suggest that at least three of these proteins expressed in desiccation tolerant cells of both Arabidopsis and the moss are very likely to play important roles in acquisition of desiccation tolerance in land plants.
Furthermore, our results suggest that the regulatory machinery of ABA- and ABI3-mediated gene expression for desiccation tolerance might have evolved in ancestral land plants before the separation of bryophytes and vascular plants. - Highlights: • Large-scale proteomics highlighted proteins related to plant desiccation tolerance. • The proteins were regulated by both the phytohormone ABA and ABI3. • The proteins accumulated in desiccation tolerant cells of both Arabidopsis and moss. • Evolutionary origin of regulatory machinery for desiccation tolerance is proposed.
Fang, Yilin; Wilkins, Michael J; Yabusaki, Steven B; Lipton, Mary S; Long, Philip E
2012-12-01
Accurately predicting the interactions between microbial metabolism and the physical subsurface environment is necessary to enhance subsurface energy development, soil and groundwater cleanup, and carbon management. This study was an initial attempt to confirm the metabolic functional roles within an in silico model using environmental proteomic data collected during field experiments. Shotgun global proteomics data collected during a subsurface biostimulation experiment were used to validate a genome-scale metabolic model of Geobacter metallireducens, specifically the ability of the metabolic model to predict metal reduction, biomass yield, and growth rate under dynamic field conditions. The constraint-based in silico model of G. metallireducens relates an annotated genome sequence to the physiological functions with 697 reactions controlled by 747 enzyme-coding genes. Proteomic analysis showed that 180 of the 637 G. metallireducens proteins detected during the 2008 experiment were associated with specific metabolic reactions in the in silico model. When the field-calibrated Fe(III) terminal electron acceptor process reaction in a reactive transport model for the field experiments was replaced with the genome-scale model, the model predicted that the largest metabolic fluxes through the in silico model reactions generally correspond to the highest abundances of proteins that catalyze those reactions. Central metabolism predicted by the model agrees well with protein abundance profiles inferred from proteomic analysis. Model discrepancies with the proteomic data, such as the relatively low abundances of proteins associated with amino acid transport and metabolism, revealed pathways or flux constraints in the in silico model that could be updated to more accurately predict metabolic processes that occur in the subsurface environment.
Tipton, Jeremiah D; Tran, John C; Catherman, Adam D; Ahlf, Dorothy R; Durbin, Kenneth R; Lee, Ji Eun; Kellie, John F; Kelleher, Neil L; Hendrickson, Christopher L; Marshall, Alan G
2012-03-06
Current high-throughput top-down proteomic platforms provide routine identification of proteins less than 25 kDa with 4-D separations. This short communication reports the application of technological developments over the past few years that improve protein identification and characterization for masses greater than 25 kDa. Advances in separation science have allowed increased numbers of proteins to be identified, especially by nanoliquid chromatography (nLC) prior to mass spectrometry (MS) analysis. Further, a goal of high-throughput top-down proteomics is to extend the mass range for routine nLC MS analysis up to 80 kDa because gene sequence analysis predicts that ~70% of the human proteome is transcribed to be less than 80 kDa. Normally, large proteins greater than 50 kDa are identified and characterized by top-down proteomics through fraction collection and direct infusion at relatively low throughput. Further, other MS-based techniques provide top-down protein characterization, however at low resolution for intact mass measurement. Here, we present analysis of standard (up to 78 kDa) and whole cell lysate proteins by Fourier transform ion cyclotron resonance mass spectrometry (nLC electrospray ionization (ESI) FTICR MS). The separation platform reduced the complexity of the protein matrix so that, at 14.5 T, proteins from whole cell lysate up to 72 kDa are baseline mass resolved on a nano-LC chromatographic time scale. Further, the results document routine identification of proteins at improved throughput based on accurate mass measurement (less than 10 ppm mass error) of precursor and fragment ions for proteins up to 50 kDa.
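The "less than 10 ppm mass error" criterion quoted above is a relative error between a measured and a theoretical mass. A minimal sketch of that arithmetic follows; the function names and the example masses are illustrative assumptions and do not come from the paper.

```python
def ppm_error(observed_mass, theoretical_mass):
    """Signed mass error in parts per million, relative to the theoretical mass."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def within_tolerance(observed_mass, theoretical_mass, tol_ppm=10.0):
    """True if the absolute mass error is below tol_ppm (10 ppm by default)."""
    return abs(ppm_error(observed_mass, theoretical_mass)) < tol_ppm


# Hypothetical 50 kDa protein measured 0.25 Da too heavy (about 5 ppm).
print(within_tolerance(50000.25, 50000.0))
# The same protein measured 1 Da too heavy (about 20 ppm) fails the check.
print(within_tolerance(50001.0, 50000.0))
```

Because the tolerance is relative, the absolute mass window widens with protein size: 10 ppm corresponds to 0.5 Da at 50 kDa but only 0.1 Da at 10 kDa.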
Persi, Erez; Horn, David
2013-01-01
We present a novel analysis of compositional order (CO) based on the occurrence of Frequent amino-acid Triplets (FTs) that appear much more than random in protein sequences. The method captures all types of proteomic compositional order including single amino-acid runs, tandem repeats, periodic structure of motifs and otherwise low complexity amino-acid regions. We introduce new order measures, distinguishing between ‘regularity’, ‘periodicity’ and ‘vocabulary’, to quantify these phenomena and to facilitate the identification of evolutionary effects. Detailed analysis of representative species across the tree-of-life demonstrates that CO proteins exhibit numerous functional enrichments, including a wide repertoire of particular patterns of dependencies on regularity and periodicity. Comparison between human and mouse proteomes further reveals the interplay of CO with evolutionary trends, such as faster substitution rate in mouse leading to decrease of periodicity, while innovation along the human lineage leads to larger regularity. Large-scale analysis of 94 proteomes leads to systematic ordering of all major taxonomic groups according to FT-vocabulary size. This is measured by the count of Different Frequent Triplets (DFT) in proteomes. The latter provides a clear hierarchical delineation of vertebrates, invertebrates, plants, fungi and prokaryotes, with thermophiles showing the lowest level of FT-vocabulary. Among eukaryotes, this ordering correlates with phylogenetic proximity. Interestingly, in all kingdoms CO accumulation in the proteome has universal characteristics. We suggest that CO is a genomic-information correlate of both macroevolution and various protein functions. The results indicate a mechanism of genomic ‘innovation’ at the peptide level, involved in protein elongation, shaped in a universal manner by mutational and selective forces. PMID:24278003
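The basic counting step behind a triplet vocabulary can be sketched as follows. This is only an illustration: the abstract does not give the statistical criterion for calling a triplet "frequent", so a plain count threshold (`min_count`, a hypothetical parameter) stands in for it here, and the toy sequences are invented.

```python
from collections import Counter


def count_triplets(sequences):
    """Count every overlapping amino-acid triplet across a set of sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - 2):
            counts[seq[i:i + 3]] += 1
    return counts


def frequent_triplets(sequences, min_count):
    """Triplets seen at least min_count times (stand-in for the paper's criterion)."""
    return {t for t, k in count_triplets(sequences).items() if k >= min_count}


toy_proteome = ["AAAAQ", "QQQAAA"]
print(frequent_triplets(toy_proteome, min_count=3))  # {'AAA'}
```

The size of the returned set plays the role of the count of Different Frequent Triplets (DFT) that the abstract uses to order taxonomic groups.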
Murthy, Krishna R; Dammalli, Manjunath; Pinto, Sneha M; Murthy, Kalpana Babu; Nirujogi, Raja Sekhar; Madugundu, Anil K; Dey, Gourav; Subbannayya, Yashwanth; Mishra, Uttam Kumar; Nair, Bipin; Gowda, Harsha; Prasad, T S Keshava
2016-09-01
The annual economic burden of visual disorders in the United States was estimated at $139 billion. Ophthalmology is therefore one of the salient application fields of postgenomics biotechnologies such as proteomics in the pursuit of global precision medicine. Interestingly, the protein composition of the human iris tissue still remains largely unexplored. In this context, the uveal tract constitutes the vascular middle coat of the eye and is formed by the choroid, ciliary body, and iris. The iris forms the anterior-most part of the uvea. It is a thin muscular diaphragm with a central perforation called the pupil. Inflammation of the uvea is termed uveitis and causes reduced vision or blindness. However, the pathogenesis of the spectrum of diseases causing uveitis is still not very well understood. We investigated the proteome of the iris tissue harvested from healthy donor eyes that were enucleated within 6 h of death using high-resolution Fourier transform mass spectrometry. A total of 4959 nonredundant proteins were identified in the human iris, which included proteins involved in signaling, cell communication, metabolism, immune response, and transport. This study is the first attempt to comprehensively profile the global proteome of the human iris tissue and, thus, offers the potential to facilitate biomedical research into pathological diseases of the uvea such as Behcet's disease, Vogt-Koyanagi-Harada disease, and juvenile rheumatoid arthritis. Finally, we make a call to the broader visual health and ophthalmology community that proteomics offers a veritable prospect to obtain a systems scale, functional, and dynamic picture of the eye tissue in health and disease. This knowledge is ultimately pertinent for precision medicine diagnostics and therapeutics innovation to address the pressing needs of the 21st century visual health.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jacobs, Jon M.; Diamond, Deborah L.; Chan, Eric Y.
2005-06-01
The development of a reproducible model system for the study of Hepatitis C virus (HCV) infection has the potential to significantly enhance the study of virus-host interactions and provide future direction for modeling the pathogenesis of HCV. While there are studies describing global gene expression changes associated with HCV infection, changes in the proteome have not been characterized. We report the first large scale proteome analysis of the highly permissive Huh-7.5 cell line containing a full length HCV replicon. We detected > 4,400 proteins in this cell line, including HCV replicon proteins, using multidimensional liquid chromatographic (LC) separations coupled to mass spectrometry (MS). The set of Huh-7.5 proteins confidently identified is, to our knowledge, the most comprehensive yet reported for a human cell line. Consistent with the literature, a comparison of Huh-7.5 cells (+) and (-) the HCV replicon identified expression changes of proteins involved in lipid metabolism. We extended these analyses to liver biopsy material from HCV-infected patients where > 1,500 proteins were detected from 2 μg protein lysate using the Huh-7.5 protein database and the accurate mass and time (AMT) tag strategy. These findings demonstrate the utility of multidimensional proteome analysis of the HCV replicon model system for assisting the determination of proteins/pathways affected by HCV infection. Our ability to extend these analyses to the highly complex proteome of small liver biopsies with limiting protein yields offers the unique opportunity to begin evaluating the clinical significance of protein expression changes associated with HCV infection.
[ProteoCat: a tool for planning of proteomic experiments].
Skvortsov, V S; Alekseychuk, N N; Khudyakov, D V; Mikurova, A V; Rybina, A V; Novikova, S E; Tikhonova, O V
2015-01-01
ProteoCat is a computer program designed to help researchers plan large-scale proteomic experiments. The central part of the program is a hydrolysis simulation subprogram that supports four proteases (trypsin, lysine C, and the endoproteinases AspN and GluC). For the peptides obtained after virtual hydrolysis or loaded from a data file, a number of properties important in mass-spectrometric experiments can be calculated or predicted. The data can be analyzed or filtered to reduce the set of peptides. The program uses new and improved modifications of our methods for predicting pI and the probability of peptide detection; pI can also be predicted using a number of popular pKa scales proposed by other investigators. The algorithm for predicting peptide retention time is implemented similarly to the algorithm used in the program SSRCalc. ProteoCat can estimate the sequence coverage of proteins under defined limits on peptide detection, as well as the possibility of assembling peptide fragments with a user-defined size of "sticky" ends. The program has a graphical user interface, is written in Java, and is available at http://www.ibmc.msk.ru/LPCIT/ProteoCat.
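A virtual hydrolysis step of the kind described above can be sketched for trypsin as follows. This is not ProteoCat's code: it assumes the commonly used rule that trypsin cleaves after K or R unless the next residue is P, and the function name, the `missed_cleavages` parameter, and the example sequence are all hypothetical.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """In silico trypsin digestion of a one-letter amino-acid sequence.

    Assumed rule: cleave C-terminal to K or R, except when followed by P.
    Returns peptides with up to `missed_cleavages` uncleaved internal sites.
    """
    if not sequence:
        return []
    # Index just past each K/R that is not followed by P.
    sites = [i + 1 for i, aa in enumerate(sequence[:-1])
             if aa in "KR" and sequence[i + 1] != "P"]
    bounds = [0] + sites + [len(sequence)]
    peptides = []
    for start in range(len(bounds) - 1):
        last_end = min(start + 1 + missed_cleavages, len(bounds) - 1)
        for end in range(start + 1, last_end + 1):
            peptides.append(sequence[bounds[start]:bounds[end]])
    return peptides


print(tryptic_digest("MKRPLKAAR"))
# ['MK', 'RPLK', 'AAR']
print(tryptic_digest("MKRPLKAAR", missed_cleavages=1))
# ['MK', 'MKRPLK', 'RPLK', 'RPLKAAR', 'AAR']
```

The resulting peptide list is the natural input for the downstream steps the abstract mentions, such as filtering by predicted properties or estimating sequence coverage.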
Kennedy, Jacob J.; Abbatiello, Susan E.; Kim, Kyunggon; Yan, Ping; Whiteaker, Jeffrey R.; Lin, Chenwei; Kim, Jun Seok; Zhang, Yuzheng; Wang, Xianlong; Ivey, Richard G.; Zhao, Lei; Min, Hophil; Lee, Youngju; Yu, Myeong-Hee; Yang, Eun Gyeong; Lee, Cheolju; Wang, Pei; Rodriguez, Henry; Kim, Youngsoo; Carr, Steven A.; Paulovich, Amanda G.
2014-01-01
The successful application of MRM in biological specimens raises the exciting possibility that assays can be configured to measure all human proteins, resulting in an assay resource that would promote advances in biomedical research. We report the results of a pilot study designed to test the feasibility of a large-scale, international effort in MRM assay generation. We have configured, validated across three laboratories, and made publicly available as a resource to the community 645 novel MRM assays representing 319 proteins expressed in human breast cancer. Assays were multiplexed in groups of >150 peptides and deployed to quantify endogenous analyte in a panel of breast cancer-related cell lines. Median assay precision was 5.4%, with high inter-laboratory correlation (R2 >0.96). Peptide measurements in breast cancer cell lines were able to discriminate amongst molecular subtypes and identify genome-driven changes in the cancer proteome. These results establish the feasibility of a scaled, international effort. PMID:24317253
Ali, Mehreen; Khan, Suleiman A; Wennerberg, Krister; Aittokallio, Tero
2018-04-15
Proteomics profiling is increasingly being used for molecular stratification of cancer patients and cell-line panels. However, systematic assessment of the predictive power of large-scale proteomic technologies across various drug classes and cancer types is currently lacking. To that end, we carried out the first pan-cancer, multi-omics comparative analysis of the relative performance of two proteomic technologies, targeted reverse phase protein array (RPPA) and global mass spectrometry (MS), in terms of their accuracy for predicting the sensitivity of cancer cells to both cytotoxic chemotherapeutics and molecularly targeted anticancer compounds. Our results in two cell-line panels demonstrate how MS profiling improves drug response predictions beyond that of the RPPA or the other omics profiles when used alone. However, frequent missing MS data values complicate its use in predictive modeling and required additional filtering, such as focusing on completely measured or known oncoproteins, to obtain maximal predictive performance. Rather strikingly, the two proteomics profiles provided complementary predictive signal both for the cytotoxic and targeted compounds. Further, information about the cellular-abundance of primary target proteins was found critical for predicting the response of targeted compounds, although the non-target features also contributed significantly to the predictive power. The clinical relevance of the selected protein markers was confirmed in cancer patient data. These results provide novel insights into the relative performance and optimal use of the widely applied proteomic technologies, MS and RPPA, which should prove useful in translational applications, such as defining the best combination of omics technologies and marker panels for understanding and predicting drug sensitivities in cancer patients. Processed datasets, R as well as Matlab implementations of the methods are available at https://github.com/mehr-een/bemkl-rbps. 
Contact: mehreen.ali@helsinki.fi or tero.aittokallio@fimm.fi. Supplementary data are available at Bioinformatics online.
Grigoryan, Marine; Shamshurin, Dmitry; Spicer, Victor; Krokhin, Oleg V
2013-11-19
As an initial step in our efforts to unify the expression of peptide retention times in proteomic liquid chromatography-mass spectrometry (LC-MS) experiments, we aligned the chromatographic properties of a number of peptide retention standards against a collection of peptides commonly observed in proteomic experiments. The standard peptide mixtures and tryptic digests of samples of different origins were separated under the identical chromatographic condition most commonly employed in proteomics: 100 Å C18 sorbent with 0.1% formic acid as an ion-pairing modifier. Following our original approach (Krokhin, O. V.; Spicer, V. Anal. Chem. 2009, 81, 9522-9530) the retention characteristics of these standards and collection of tryptic peptides were mapped into hydrophobicity index (HI) or acetonitrile percentage units. This scale allows for direct visualization of the chromatographic outcome of LC-MS acquisitions, monitors the performance of the gradient LC system, and simplifies method development and interlaboratory data alignment. Wide adoption of this approach would significantly aid understanding the basic principles of gradient peptide RP-HPLC and solidify our collective efforts in acquiring confident peptide retention libraries, a key component in the development of targeted proteomic approaches.
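One way to picture the mapping of retention times onto hydrophobicity index (HI) units is a calibration line fitted to the retention standards. The sketch below assumes a linear relationship between retention time and HI (plausible for a linear gradient, but not stated in the abstract) and uses invented calibration values; the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def retention_time_to_hi(rt, slope, intercept):
    """Convert an observed retention time to HI (% acetonitrile) units."""
    return slope * rt + intercept


# Made-up standards: retention times in minutes and their assigned HI values.
standard_rt = [10.0, 20.0, 30.0, 40.0]
standard_hi = [5.0, 15.0, 25.0, 35.0]

slope, intercept = fit_line(standard_rt, standard_hi)
print(retention_time_to_hi(25.0, slope, intercept))
```

Refitting the line for each acquisition is what would let retention data from different gradients or laboratories be compared on a common HI scale.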
Advanced proteomic liquid chromatography
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xie, Fang; Smith, Richard D.; Shen, Yufeng
2012-10-26
Liquid chromatography coupled with mass spectrometry is the predominant platform used to analyze proteomics samples consisting of large numbers of proteins and their proteolytic products (e.g., truncated polypeptides) and spanning a wide range of relative concentrations. This review provides an overview of advanced capillary liquid chromatography techniques and methodologies that greatly improve separation resolving power and proteomics analysis coverage, sensitivity, and throughput.
An automated method for detecting alternatively spliced protein domains.
Coelho, Vitor; Sammeth, Michael
2018-06-01
Alternative splicing (AS) has been demonstrated to play a role in shaping eukaryotic gene diversity at the transcriptional level. However, the impact of AS on the proteome is still controversial. Studies that seek to explore the effect of AS at the proteomic level are hampered by technical difficulties in the cumbersome process of casting back and forth between genome, transcriptome and proteome space coordinates, and the naïve prediction of protein domains in the presence of AS suffers from many redundant sequence scans that emerge from constitutively spliced regions that are shared between alternative products of a gene. We developed the AstaFunk pipeline that computes for every generic transcriptome all domains that are altered by AS events in a systematic and efficient manner. In a nutshell, our method employs Viterbi dynamic programming, which guarantees to find all score-optimal hits of the domains under consideration, while complementary optimisations at different levels avoid redundant and other irrelevant computations. We evaluate AstaFunk qualitatively and quantitatively using RNAseq in well-studied genes with AS, and at large scale using entire transcriptomes. Our study confirms complementary reports that the effect of most AS events on the proteome seems to be rather limited, but our results also pinpoint several cases where AS could have a major impact on the function of a protein domain. The JAVA implementation of AstaFunk is available as an open source project on http://astafunk.sammeth.net. Contact: micha@sammeth.net. Supplementary data are available at Bioinformatics online.
Epigenetics and Proteomics Join Transcriptomics in the Quest for Tuberculosis Biomarkers
Esterhuyse, Maria M.; Weiner, January; Caron, Etienne; Loxton, Andre G.; Iannaccone, Marco; Wagman, Chandre; Saikali, Philippe; Stanley, Kim; Wolski, Witold E.; Mollenkopf, Hans-Joachim; Schick, Matthias; Aebersold, Ruedi; Linhart, Heinz; Walzl, Gerhard
2015-01-01
An estimated one-third of the world’s population is currently latently infected with Mycobacterium tuberculosis. Latent M. tuberculosis infection (LTBI) progresses into active tuberculosis (TB) disease in ~5 to 10% of infected individuals. Diagnostic and prognostic biomarkers to monitor disease progression are urgently needed to ensure better care for TB patients and to decrease the spread of TB. Biomarker development is primarily based on transcriptomics. Our understanding of biology combined with evolving technical advances in high-throughput techniques led us to investigate the possibility of additional platforms (epigenetics and proteomics) in the quest to (i) understand the biology of the TB host response and (ii) search for multiplatform biosignatures in TB. We engaged in a pilot study to interrogate the DNA methylome, transcriptome, and proteome in selected monocytes and granulocytes from TB patients and healthy LTBI participants. Our study provides first insights into the levels and sources of diversity in the epigenome and proteome among TB patients and LTBI controls, despite limitations due to small sample size. Functionally the differences between the infection phenotypes (LTBI versus active TB) observed in the different platforms were congruent, thereby suggesting regulation of function not only at the transcriptional level but also by DNA methylation and microRNA. Thus, our data argue for the development of a large-scale study of the DNA methylome, with particular attention to study design in accounting for variation based on gender, age, and cell type. PMID:26374119
Genome-scale prediction of proteins with long intrinsically disordered regions.
Peng, Zhenling; Mizianty, Marcin J; Kurgan, Lukasz
2014-01-01
Proteins with long disordered regions (LDRs), defined as having 30 or more consecutive disordered residues, are abundant in eukaryotes, and these regions are recognized as a distinct class of biologically functional domains. LDRs facilitate various cellular functions and are important for target selection in structural genomics. Motivated by the lack of methods that directly predict proteins with LDRs, we designed Super-fast predictor of proteins with Long Intrinsically DisordERed regions (SLIDER). SLIDER utilizes logistic regression that takes an empirically chosen set of numerical features, which consider selected physicochemical properties of amino acids, sequence complexity, and amino acid composition, as its inputs. Empirical tests show that SLIDER offers competitive predictive performance combined with low computational cost. It outperforms, by at least a modest margin, a comprehensive set of modern disorder predictors (that can indirectly predict LDRs) and is 16 times faster than the best currently available disorder predictor. Utilizing our time-efficient predictor, we characterized the abundance and functional roles of proteins with LDRs over 110 eukaryotic proteomes. Similar to related studies, we found that eukaryotes have many (on average 30.3%) proteins with LDRs, with the majority of proteomes having between 25 and 40%, where higher abundance is characteristic of proteomes that have larger proteins. Our first-of-its-kind large-scale functional analysis shows that these proteins are enriched in a number of cellular functions and processes including certain binding events, regulation of catalytic activities, cellular component organization, biogenesis, biological regulation, and some metabolic and developmental processes. A webserver that implements SLIDER is available at http://biomine.ece.ualberta.ca/SLIDER/. Copyright © 2013 Wiley Periodicals, Inc.
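The LDR definition given above (30 or more consecutive disordered residues) can be checked directly when per-residue disorder annotations are already available. The sketch below applies only that definition; it is not SLIDER, which predicts the label from sequence-derived features with logistic regression, and the function names and boolean input format are assumptions.

```python
from itertools import groupby


def longest_disordered_run(flags):
    """Length of the longest run of consecutive True (disordered) residues."""
    return max(
        (sum(1 for _ in group) for is_disordered, group in groupby(flags) if is_disordered),
        default=0,
    )


def has_long_disordered_region(flags, min_length=30):
    """True if the protein contains a disordered run of at least min_length residues."""
    return longest_disordered_run(flags) >= min_length


# Hypothetical annotations: True marks a disordered residue.
protein_a = [False] * 5 + [True] * 30 + [False] * 3
protein_b = [True] * 29 + [False] + [True] * 29
print(has_long_disordered_region(protein_a))  # True
print(has_long_disordered_region(protein_b))  # False
```

The second example shows why total disorder content is not enough: 58 disordered residues split by a single ordered one do not form an LDR under this definition.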
BioPlex Display: An Interactive Suite for Large-Scale AP-MS Protein-Protein Interaction Data.
Schweppe, Devin K; Huttlin, Edward L; Harper, J Wade; Gygi, Steven P
2018-01-05
The development of large-scale data sets requires a new means to display and disseminate research studies to large audiences. Knowledge of protein-protein interaction (PPI) networks has become a principal interest of many groups within the field of proteomics. At the confluence of technologies, such as cross-linking mass spectrometry, yeast two-hybrid, protein cofractionation, and affinity purification mass spectrometry (AP-MS), detection of PPIs can uncover novel biological inferences at high throughput. Thus new platforms to provide community access to large data sets are necessary. To this end, we have developed a web application that enables exploration and dissemination of the growing BioPlex interaction network. BioPlex is a large-scale interactome data set based on AP-MS of baits from the human ORFeome. The latest BioPlex data set release (BioPlex 2.0) contains 56 553 interactions from 5891 AP-MS experiments. To improve community access to this vast compendium of interactions, we developed BioPlex Display, which integrates individual protein querying, access to empirical data, and on-the-fly annotation of networks within an easy-to-use and mobile web application. BioPlex Display enables rapid acquisition of data from BioPlex and development of hypotheses based on protein interactions.
Resources for Functional Genomics Studies in Drosophila melanogaster
Mohr, Stephanie E.; Hu, Yanhui; Kim, Kevin; Housden, Benjamin E.; Perrimon, Norbert
2014-01-01
Drosophila melanogaster has become a system of choice for functional genomic studies. Many resources, including online databases and software tools, are now available to support design or identification of relevant fly stocks and reagents or analysis and mining of existing functional genomic, transcriptomic, proteomic, etc. datasets. These include large community collections of fly stocks and plasmid clones, “meta” information sites like FlyBase and FlyMine, and an increasing number of more specialized reagents, databases, and online tools. Here, we introduce key resources useful to plan large-scale functional genomics studies in Drosophila and to analyze, integrate, and mine the results of those studies in ways that facilitate identification of highest-confidence results and generation of new hypotheses. We also discuss ways in which existing resources can be used and might be improved and suggest a few areas of future development that would further support large- and small-scale studies in Drosophila and facilitate use of Drosophila information by the research community more generally. PMID:24653003
Spatial and temporal dynamics of the cardiac mitochondrial proteome.
Lau, Edward; Huang, Derrick; Cao, Quan; Dincer, T Umut; Black, Caitie M; Lin, Amanda J; Lee, Jessica M; Wang, Ding; Liem, David A; Lam, Maggie P Y; Ping, Peipei
2015-04-01
Mitochondrial proteins alter in their composition and quantity drastically through time and space in correspondence to changing energy demands and cellular signaling events. The integrity and permutations of this dynamism are increasingly recognized to impact the functions of the cardiac proteome in health and disease. This article provides an overview on recent advances in defining the spatial and temporal dynamics of mitochondrial proteins in the heart. Proteomics techniques to characterize dynamics on a proteome scale are reviewed and the physiological consequences of altered mitochondrial protein dynamics are discussed. Lastly, we offer our perspectives on the unmet challenges in translating mitochondrial dynamics markers into the clinic.
The Office of Cancer Clinical Proteomics Research at the National Cancer Institute, part of the United States National Institutes of Health, is spearheading the preparation and training of the proteogenomic research workforce on an international scale.
Elucidating the fungal stress response by proteomics.
Kroll, Kristin; Pähtz, Vera; Kniemeyer, Olaf
2014-01-31
Fungal species need to cope with stress, both in the natural environment and during interaction of human- or plant pathogenic fungi with their host. Many regulatory circuits governing the fungal stress response have already been discovered. However, there are still large gaps in the knowledge concerning the changes of the proteome during adaptation to environmental stress conditions. With the application of proteomic methods, particularly 2D-gel and gel-free, LC/MS-based methods, first insights into the composition and dynamic changes of the fungal stress proteome could be obtained. Here, we review the recent proteome data generated for filamentous fungi and yeasts. This article is part of a Special Issue entitled: Trends in Microbial Proteomics. Copyright © 2013 Elsevier B.V. All rights reserved.
Ji, Jun; Ling, Jeffrey; Jiang, Helen; Wen, Qiaojun; Whitin, John C; Tian, Lu; Cohen, Harvey J; Ling, Xuefeng B
2013-03-23
Mass spectrometry (MS) has evolved to become the primary high-throughput tool for proteomics-based biomarker discovery. Multiple challenges in protein MS data analysis remain: management of large-scale and complex data sets; MS peak identification and indexing; and high-dimensional peak differential analysis with concurrent statistical tests based on the false discovery rate (FDR). "Turnkey" solutions are needed for biomarker investigations to rapidly process MS data sets to identify statistically significant peaks for subsequent validation. Here we present an efficient and effective solution, which provides experimental biologists easy access to "cloud" computing capabilities to analyze MS data. The web portal can be accessed at http://transmed.stanford.edu/ssa/. The presented web application supports online uploading and analysis of large-scale MS data through a simple user interface. This bioinformatic tool will facilitate the discovery of potential protein biomarkers using MS.
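As an illustration of FDR-based peak differential analysis, the following minimal sketch applies the Benjamini-Hochberg procedure to a list of per-peak p-values. The abstract does not say which FDR procedure the web application uses, so the choice of Benjamini-Hochberg, the function name `benjamini_hochberg`, and the example p-values are all assumptions made for illustration only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values from a per-peak differential test (illustrative only).
peak_p_values = [0.001, 0.008, 0.039, 0.041, 0.60]
q_values = benjamini_hochberg(peak_p_values)

# Peaks retained at a 5% FDR threshold.
significant = [i for i, q in enumerate(q_values) if q <= 0.05]
print(significant)
```

In this toy input, the third and fourth peaks have raw p-values below 0.05 but are not retained after adjustment, which is the kind of correction a concurrent FDR step would provide when many peaks are tested at once.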
Wu, Xiaolin
2016-01-01
The onion (Allium cepa L.) is widely planted worldwide as a valuable vegetable crop. The scales of an onion bulb are a modified type of leaf. The one-layer-cell epidermis of onion scales is commonly used as a model experimental material in botany and molecular biology. The lower epidermis (LE) and upper epidermis (UE) of onion scales display obvious differences in microscopic structure, cell differentiation and pigment synthesis; however, associated proteomic differences are unclear. LE and UE can be easily sampled as single-layer-cell tissues for comparative proteomic analysis. In this study, a proteomic approach based on 2-DE and mass spectrometry (MS) was applied to compare LE and UE of fleshy scales from yellow and red onions. We identified 47 differential abundant protein spots (representing 31 unique proteins) between LE and UE in red and yellow onions. These proteins are mainly involved in pigment synthesis, stress response, and cell division. Particularly, the differentially accumulated chalcone-flavanone isomerase and flavone O-methyltransferase 1-like in LE may result in the differences in the onion scale color between red and yellow onions. Moreover, stress-related proteins abundantly accumulated in both LE and UE. In addition, the differential accumulation of UDP-arabinopyranose mutase 1-like protein and β-1,3-glucanase in the LE may be related to the different cell sizes between LE and UE of the two types of onion. The data derived from this study provides new insight into the differences in differentiation and developmental processes between onion epidermises. This study may also make a contribution to onion breeding, such as improving resistances and changing colors. PMID:28036352
Wu, Si; Ning, Fen; Wu, Xiaolin; Wang, Wei
2016-01-01
The onion (Allium cepa L.) is widely planted worldwide as a valuable vegetable crop. The scales of an onion bulb are a modified type of leaf. The one-layer-cell epidermis of onion scales is commonly used as a model experimental material in botany and molecular biology. The lower epidermis (LE) and upper epidermis (UE) of onion scales display obvious differences in microscopic structure, cell differentiation and pigment synthesis; however, associated proteomic differences are unclear. LE and UE can be easily sampled as single-layer-cell tissues for comparative proteomic analysis. In this study, a proteomic approach based on 2-DE and mass spectrometry (MS) was applied to compare LE and UE of fleshy scales from yellow and red onions. We identified 47 differential abundant protein spots (representing 31 unique proteins) between LE and UE in red and yellow onions. These proteins are mainly involved in pigment synthesis, stress response, and cell division. Particularly, the differentially accumulated chalcone-flavanone isomerase and flavone O-methyltransferase 1-like in LE may result in the differences in the onion scale color between red and yellow onions. Moreover, stress-related proteins abundantly accumulated in both LE and UE. In addition, the differential accumulation of UDP-arabinopyranose mutase 1-like protein and β-1,3-glucanase in the LE may be related to the different cell sizes between LE and UE of the two types of onion. The data derived from this study provides new insight into the differences in differentiation and developmental processes between onion epidermises. This study may also make a contribution to onion breeding, such as improving resistances and changing colors.
USDA-ARS's Scientific Manuscript database
In addition to microarray technology, which provides a robust method to study protein function in a rapid, economical, and proteome-wide fashion, plasmid-based functional proteomics is an important technology for rapidly obtaining large quantities of protein and determining protein function across a...
Advanced proteomic liquid chromatography
Xie, Fang; Smith, Richard D.; Shen, Yufeng
2012-01-01
Liquid chromatography coupled with mass spectrometry is the predominant platform used to analyze proteomics samples consisting of large numbers of proteins and their proteolytic products (e.g., truncated polypeptides) and spanning a wide range of relative concentrations. This review provides an overview of advanced capillary liquid chromatography techniques and methodologies that greatly improve separation resolving power and proteomics analysis coverage, sensitivity, and throughput. PMID:22840822
Why proteomics is not the new genomics and the future of mass spectrometry in cell biology.
Sidoli, Simone; Kulej, Katarzyna; Garcia, Benjamin A
2017-01-02
Mass spectrometry (MS) is an essential part of the cell biologist's proteomics toolkit, allowing analyses at molecular and system-wide scales. However, proteomics still lag behind genomics in popularity and ease of use. We discuss key differences between MS-based -omics and other booming -omics technologies and highlight what we view as the future of MS and its role in our increasingly deep understanding of cell biology. © 2017 Sidoli et al.
Zhao, Yunhe; Cui, Kaidi; Xu, Chunmei; Wang, Qiuhong; Wang, Yao; Zhang, Zhengqun; Liu, Feng; Mu, Wei
2016-11-24
Benzothiazole, a microbial secondary metabolite, has been demonstrated to possess fumigant activity against Sclerotinia sclerotiorum, Ditylenchus destructor and Bradysia odoriphaga. However, to facilitate the development of novel microbial pesticides, the mode of action of benzothiazole needs to be elucidated. Here, we employed iTRAQ-based quantitative proteomics analysis to investigate the effects of benzothiazole on the proteomic expression of B. odoriphaga. In response to benzothiazole, 92 of 863 identified proteins in B. odoriphaga exhibited altered levels of expression, among which 14 proteins were related to the action mechanism of benzothiazole, 11 proteins were involved in stress responses, and 67 proteins were associated with the adaptation of B. odoriphaga to benzothiazole. Further bioinformatics analysis indicated that the reduction in energy metabolism, inhibition of the detoxification process and interference with DNA and RNA synthesis were potentially associated with the mode of action of benzothiazole. The myosin heavy chain, succinyl-CoA synthetase and Ca2+-transporting ATPase proteins may be related to the stress response. Increased expression of proteins involved in carbohydrate metabolism, energy production and conversion pathways was responsible for the adaptive response of B. odoriphaga. The results of this study provide novel insight into the molecular mechanisms of benzothiazole at a large-scale translation level and will facilitate the elucidation of the mechanism of action of benzothiazole.
Lavallée-Adam, Mathieu; Rauniyar, Navin; McClatchy, Daniel B; Yates, John R
2014-12-05
The majority of large-scale proteomics quantification methods yield long lists of quantified proteins that are often difficult to interpret and poorly reproduced. Computational approaches are required to analyze such intricate quantitative proteomics data sets. We propose a statistical approach to computationally identify protein sets (e.g., Gene Ontology (GO) terms) that are significantly enriched with abundant proteins with reproducible quantification measurements across a set of replicates. To this end, we developed PSEA-Quant, a protein set enrichment analysis algorithm for label-free and label-based protein quantification data sets. It offers an alternative approach to classic GO analyses, models protein annotation biases, and allows the analysis of samples originating from a single condition, unlike analogous approaches such as GSEA and PSEA. We demonstrate that PSEA-Quant produces results complementary to GO analyses. We also show that PSEA-Quant provides valuable information about the biological processes involved in cystic fibrosis using label-free protein quantification of a cell line expressing a CFTR mutant. Finally, PSEA-Quant highlights the differences in the mechanisms taking place in the human, rat, and mouse brain frontal cortices based on tandem mass tag quantification. Our approach, which is available online, will thus improve the analysis of proteomics quantification data sets by providing meaningful biological insights.
2015-01-01
The majority of large-scale proteomics quantification methods yield long lists of quantified proteins that are often difficult to interpret and poorly reproduced. Computational approaches are required to analyze such intricate quantitative proteomics data sets. We propose a statistical approach to computationally identify protein sets (e.g., Gene Ontology (GO) terms) that are significantly enriched with abundant proteins with reproducible quantification measurements across a set of replicates. To this end, we developed PSEA-Quant, a protein set enrichment analysis algorithm for label-free and label-based protein quantification data sets. It offers an alternative approach to classic GO analyses, models protein annotation biases, and allows the analysis of samples originating from a single condition, unlike analogous approaches such as GSEA and PSEA. We demonstrate that PSEA-Quant produces results complementary to GO analyses. We also show that PSEA-Quant provides valuable information about the biological processes involved in cystic fibrosis using label-free protein quantification of a cell line expressing a CFTR mutant. Finally, PSEA-Quant highlights the differences in the mechanisms taking place in the human, rat, and mouse brain frontal cortices based on tandem mass tag quantification. Our approach, which is available online, will thus improve the analysis of proteomics quantification data sets by providing meaningful biological insights. PMID:25177766
A novel spectral library workflow to enhance protein identifications.
Li, Haomin; Zong, Nobel C; Liang, Xiangbo; Kim, Allen K; Choi, Jeong Ho; Deng, Ning; Zelaya, Ivette; Lam, Maggie; Duan, Huilong; Ping, Peipei
2013-04-09
The innovations in mass spectrometry-based investigations in proteome biology enable systematic characterization of molecular details in pathophysiological phenotypes. However, the process of delineating large-scale raw proteomic datasets into a biological context requires high-throughput data acquisition and processing. A spectral library search engine makes use of previously annotated experimental spectra as references for subsequent spectral analyses. This workflow delivers many advantages, including elevated analytical efficiency and specificity as well as reduced demands in computational capacity. In this study, we created a spectral matching engine to address challenges commonly associated with a library search workflow. Particularly, an improved sliding dot product algorithm, that is robust to systematic drifts of mass measurement in spectra, is introduced. Furthermore, a noise management protocol distinguishes spectra correlation attributed from noise and peptide fragments. It enables elevated separation between target spectral matches and false matches, thereby suppressing the possibility of propagating inaccurate peptide annotations from library spectra to query spectra. Moreover, preservation of original spectra also accommodates user contributions to further enhance the quality of the library. Collectively, this search engine supports reproducible data analyses using curated references, thereby broadening the accessibility of proteomics resources to biomedical investigators. This article is part of a Special Issue entitled: From protein structures to clinical applications. Copyright © 2013 Elsevier B.V. All rights reserved.
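To make the idea of a sliding dot product concrete, here is a minimal sketch of a baseline version: two spectra are binned into intensity vectors, and the normalized dot product is evaluated at several small bin offsets so that a systematic mass drift does not destroy the match. This is not the authors' improved algorithm, whose details the abstract does not give; the function names, the 1.0 m/z bin width, the shift range, and the example peaks are all assumptions.

```python
import math


def bin_spectrum(peaks, bin_width=1.0, n_bins=2000):
    """Convert (m/z, intensity) pairs into a fixed-length intensity vector."""
    vec = [0.0] * n_bins
    for mz, intensity in peaks:
        idx = int(mz / bin_width)
        if 0 <= idx < n_bins:
            vec[idx] += intensity
    return vec


def sliding_dot_product(query, library, max_shift=2):
    """Return (best normalized dot product, bin shift) over small offsets."""
    norm = math.sqrt(sum(q * q for q in query)) * math.sqrt(sum(l * l for l in library))
    if norm == 0.0:
        return 0.0, 0
    n = len(query)
    best_score, best_shift = -1.0, 0
    for shift in range(-max_shift, max_shift + 1):
        # Compare query bin i with library bin i + shift.
        dot = sum(query[i] * library[i + shift]
                  for i in range(n) if 0 <= i + shift < n)
        score = dot / norm
        if score > best_score:
            best_score, best_shift = score, shift
    return best_score, best_shift


# Hypothetical library spectrum and a query drifted by +1 m/z (illustrative only).
library_peaks = [(175.0, 50.0), (300.0, 100.0), (450.0, 80.0)]
query_peaks = [(176.0, 50.0), (301.0, 100.0), (451.0, 80.0)]

score, shift = sliding_dot_product(bin_spectrum(query_peaks), bin_spectrum(library_peaks))
print(round(score, 6), shift)
```

A plain dot product at zero offset would score this pair as a non-match, whereas scanning a few offsets recovers the alignment; real implementations would use much finer bins and a drift model suited to the instrument.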
2013-01-01
Armillaria mellea is a major plant pathogen. Yet, no large-scale “-omics” data are available to enable new studies, and limited experimental models are available to investigate basidiomycete pathogenicity. Here we reveal that the A. mellea genome comprises 58.35 Mb, contains 14473 gene models, of average length 1575 bp (4.72 introns/gene). Tandem mass spectrometry identified 921 mycelial (n = 629 unique) and secreted (n = 183 unique) proteins. Almost 100 mycelial proteins were either species-specific or previously unidentified at the protein level. A number of proteins (n = 111) was detected in both mycelia and culture supernatant extracts. Signal sequence occurrence was 4-fold greater for secreted (50.2%) compared to mycelial (12%) proteins. Analyses revealed a rich reservoir of carbohydrate degrading enzymes, laccases, and lignin peroxidases in the A. mellea proteome, reminiscent of both basidiomycete and ascomycete glycodegradative arsenals. We discovered that A. mellea exhibits a specific killing effect against Candida albicans during coculture. Proteomic investigation of this interaction revealed the unique expression of defensive and potentially offensive A. mellea proteins (n = 30). Overall, our data reveal new insights into the origin of basidiomycete virulence and we present a new model system for further studies aimed at deciphering fungal pathogenic mechanisms. PMID:23656496
Chen, Chen; Liu, Xiaohui; Zheng, Weimin; Zhang, Lei; Yao, Jun; Yang, Pengyuan
2014-04-04
To completely annotate the human genome, the task of identifying and characterizing proteins that currently lack mass spectrometry (MS) evidence is inevitable and urgent. In this study, as the first effort to screen missing proteins on a large scale, we developed an approach based on SDS-PAGE followed by liquid chromatography-multiple reaction monitoring (LC-MRM) for screening those missing proteins with only a single peptide hit in the previous liver proteome data set. Proteins extracted from normal human liver were separated by SDS-PAGE and digested in split gel slices, and the resulting digests were then subjected to LC-scheduled MRM analysis. The MRM assays were developed using synthesized crude peptides for the target peptides. In total, the expression of 57 target proteins was confirmed from 185 MRM assays in normal human liver tissues. Among the 57 confirmed one-hit wonders, 50 proteins belong to the minimally redundant set in the PeptideAtlas database, and 7 proteins previously had no MS-based information in various biological processes. We conclude that our SDS-PAGE-MRM workflow can be a powerful approach to screen missing or poorly characterized proteins in different samples and to provide their quantity if detected. The MRM raw data have been uploaded to ISB/SRM Atlas/PASSEL (PXD000648).
NASA Astrophysics Data System (ADS)
Pfammatter, Sibylle; Bonneil, Eric; McManus, Francis P.; Thibault, Pierre
2018-04-01
The small ubiquitin-like modifier (SUMO) is a member of the family of ubiquitin-like modifiers (UBLs) and is involved in important cellular processes, including DNA damage response, meiosis and cellular trafficking. The large-scale identification of SUMO peptides in a site-specific manner is challenging not only because of the low abundance and dynamic nature of this modification, but also due to the branched structure of the corresponding peptides that further complicate their identification using conventional search engines. Here, we exploited the unusual structure of SUMO peptides to facilitate their separation by high-field asymmetric waveform ion mobility spectrometry (FAIMS) and increase the coverage of SUMO proteome analysis. Upon trypsin digestion, branched peptides contain a SUMO remnant side chain and predominantly form triply protonated ions that facilitate their gas-phase separation using FAIMS. We evaluated the mobility characteristics of synthetic SUMO peptides and further demonstrated the application of FAIMS to profile the changes in protein SUMOylation of HEK293 cells following heat shock, a condition known to affect this modification. FAIMS typically provided a 10-fold improvement of detection limit of SUMO peptides, and enabled a 36% increase in SUMO proteome coverage compared to the same LC-MS/MS analyses performed without FAIMS.
Kremer, Lukas P M; Leufken, Johannes; Oyunchimeg, Purevdulam; Schulze, Stefan; Fufezan, Christian
2016-03-04
Proteomics data integration has become a broad field with a variety of programs offering innovative algorithms to analyze increasing amounts of data. Unfortunately, this software diversity leads to many problems as soon as the data is analyzed using more than one algorithm for the same task. Although it was shown that the combination of multiple peptide identification algorithms yields more robust results, it is only recently that unified approaches are emerging; however, workflows that, for example, aim to optimize search parameters or that employ cascaded style searches can only be made accessible if data analysis becomes not only unified but also and most importantly scriptable. Here we introduce Ursgal, a Python interface to many commonly used bottom-up proteomics tools and to additional auxiliary programs. Complex workflows can thus be composed using the Python scripting language using a few lines of code. Ursgal is easily extensible, and we have made several database search engines (X!Tandem, OMSSA, MS-GF+, Myrimatch, MS Amanda), statistical postprocessing algorithms (qvality, Percolator), and one algorithm that combines statistically postprocessed outputs from multiple search engines ("combined FDR") accessible as an interface in Python. Furthermore, we have implemented a new algorithm ("combined PEP") that combines multiple search engines employing elements of "combined FDR", PeptideShaker, and Bayes' theorem.
Covering complete proteomes with X-ray structures: A current snapshot
Mizianty, Marcin J.; Fan, Xiao; Yan, Jing; ...
2014-10-23
Structural genomics programs have developed and applied structure-determination pipelines to a wide range of protein targets, facilitating the visualization of macromolecular interactions and the understanding of their molecular and biochemical functions. The fundamental question of whether three-dimensional structures of all proteins and all functional annotations can be determined using X-ray crystallography is investigated. A first-of-its-kind large-scale analysis of crystallization propensity for all proteins encoded in 1953 fully sequenced genomes was performed. It is shown that current X-ray crystallographic knowhow combined with homology modeling can provide structures for 25% of modeling families (protein clusters for which structural models can be obtained through homology modeling), with at least one structural model produced for each Gene Ontology functional annotation. The coverage varies between superkingdoms, with 19% for eukaryotes, 35% for bacteria and 49% for archaea, and with those of viruses following the coverage values of their hosts. It is shown that the crystallization propensities of proteomes from the taxonomic superkingdoms are distinct. The use of knowledge-based target selection is shown to substantially increase the ability to produce X-ray structures. It is demonstrated that the human proteome has one of the highest attainable coverage values among eukaryotes, and GPCR membrane proteins suitable for X-ray structure determination were determined.
The role of internal duplication in the evolution of multi-domain proteins.
Nacher, J C; Hayashida, M; Akutsu, T
2010-08-01
Many proteins consist of several structural domains. These multi-domain proteins have likely been generated by selective genome growth dynamics during evolution to perform new functions as well as to create structures that fold on a biologically feasible time scale. Domain units frequently evolved through a variety of genetic shuffling mechanisms. Here we examine the protein domain statistics of more than 1000 organisms including eukaryotic, archaeal and bacterial species. The analysis extends earlier findings on asymmetric statistical laws for proteome to a wider variety of species. While proteins are composed of a wide range of domains, displaying a power-law decay, the computation of domain families for each protein reveals an exponential distribution, characterizing a protein universe composed of a thin number of unique families. Structural studies in proteomics have shown that domain repeats, or internal duplicated domains, represent a small but significant fraction of genome. In spite of its importance, this observation has been largely overlooked until recently. We model the evolutionary dynamics of proteome and demonstrate that these distinct distributions are in fact rooted in an internal duplication mechanism. This process generates the contemporary protein structural domain universe, determines its reduced thickness, and tames its growth. These findings have important implications, ranging from protein interaction network modeling to evolutionary studies based on fundamental mechanisms governing genome expansion.
Hajdú, István; Flachner, Beáta; Bognár, Melinda; Végh, Barbara M; Dobi, Krisztina; Lőrincz, Zsolt; Lázár, József; Cseh, Sándor; Takács, László; Kurucz, István
2014-08-01
Monoclonal antibody proteomics uses nascent libraries or cloned (Plasmascan™, QuantiPlasma™) libraries of mAbs that react with individual epitopes of proteins in the human plasma. At the initial phase of library creation, the cognate protein antigen and the epitope interacting with the antibodies are not known. Scouting for monoclonal antibodies (mAbs) with the best binding characteristics is of high importance for mAb-based biomarker assay development. However, in the absence of the identity of the cognate antigen, the task represents a challenge. We combined phage display and surface plasmon resonance (Biacore) experiments to test whether specific phages and the respective mimotope peptides obtained from large-scale studies are applicable to determine key features of antibodies for scouting. We show here that mAb-captured phage-mimotope heterogeneity, that is, the diversity of the selected peptide sequences, is inversely correlated with an important binding descriptor, the off-rate of the antibodies, and that this provides clues for driving the selection of useful mAbs for biomarker assay development. Carefully chosen synthetic mimotope peptides are suitable for specificity testing in competitive assays using the target proteome, in our case the human plasma. Copyright © 2014 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Wang, Yajuan; Yuan, Yanting; Liu, Jinwen; Su, Longxiang; Chang, De; Guo, Yinghua; Chen, Zhenhong; Fang, Xiangqun; Wang, Junfeng; Li, Tianzhi; Zhou, Lisha; Fang, Chengxiang; Yang, Ruifu; Liu, Changting
2014-04-01
The microgravity environment of spaceflight expeditions has been associated with altered microbial responses. This study explores the characterization of Serratia marcescens grown in a spaceflight environment at the phenotypic, transcriptomic and proteomic levels. From November 1, 2011 to November 17, 2011, a strain of S. marcescens was sent into space for 398 h on the Shenzhou VIII spacecraft, and ground simulation was performed as a control (LCT-SM213). After the flight, two mutant strains (LCT-SM166 and LCT-SM262) were selected for further analysis. Although no changes in the morphology, post-culture growth kinetics, hemolysis or antibiotic sensitivity were observed, the two mutant strains exhibited significant changes in their metabolic profiles after exposure to spaceflight. Enrichment analysis of the transcriptome showed that the differentially expressed genes of the two spaceflight strains and the ground control strain mainly included those involved in metabolism and degradation. The proteome revealed that changes at the protein level were also associated with metabolic functions, such as glycolysis/gluconeogenesis, pyruvate metabolism, arginine and proline metabolism and the degradation of valine, leucine and isoleucine. In summary, S. marcescens showed alterations primarily in genes and proteins that were associated with metabolism under spaceflight conditions, which gave us valuable clues for future research.
Peroxisome Biogenesis and Function
Kaur, Navneet; Reumann, Sigrun; Hu, Jianping
2009-01-01
Peroxisomes are small and single membrane-delimited organelles that execute numerous metabolic reactions and have pivotal roles in plant growth and development. In recent years, forward and reverse genetic studies along with biochemical and cell biological analyses in Arabidopsis have enabled researchers to identify many peroxisome proteins and elucidate their functions. This review focuses on the advances in our understanding of peroxisome biogenesis and metabolism, and further explores the contribution of large-scale analysis, such as in silico predictions and proteomics, in augmenting our knowledge of peroxisome function in Arabidopsis. PMID:22303249
Parasites, proteomes and systems: has Descartes' clock run out of time?
Wastling, J M; Armstrong, S D; Krishna, R; Xia, D
2012-08-01
Systems biology aims to integrate multiple biological data types such as genomics, transcriptomics and proteomics across different levels of structure and scale; it represents an emerging paradigm in the scientific process which challenges the reductionism that has dominated biomedical research for hundreds of years. Systems biology will nevertheless only be successful if the technologies on which it is based are able to deliver the required type and quality of data. In this review we discuss how well positioned is proteomics to deliver the data necessary to support meaningful systems modelling in parasite biology. We summarise the current state of identification proteomics in parasites, but argue that a new generation of quantitative proteomics data is now needed to underpin effective systems modelling. We discuss the challenges faced to acquire more complete knowledge of protein post-translational modifications, protein turnover and protein-protein interactions in parasites. Finally we highlight the central role of proteome-informatics in ensuring that proteomics data is readily accessible to the user-community and can be translated and integrated with other relevant data types.
Parasites, proteomes and systems: has Descartes’ clock run out of time?
WASTLING, J. M.; ARMSTRONG, S. D.; KRISHNA, R.; XIA, D.
2012-01-01
SUMMARY Systems biology aims to integrate multiple biological data types such as genomics, transcriptomics and proteomics across different levels of structure and scale; it represents an emerging paradigm in the scientific process which challenges the reductionism that has dominated biomedical research for hundreds of years. Systems biology will nevertheless only be successful if the technologies on which it is based are able to deliver the required type and quality of data. In this review we discuss how well positioned is proteomics to deliver the data necessary to support meaningful systems modelling in parasite biology. We summarise the current state of identification proteomics in parasites, but argue that a new generation of quantitative proteomics data is now needed to underpin effective systems modelling. We discuss the challenges faced to acquire more complete knowledge of protein post-translational modifications, protein turnover and protein-protein interactions in parasites. Finally we highlight the central role of proteome-informatics in ensuring that proteomics data is readily accessible to the user-community and can be translated and integrated with other relevant data types. PMID:22828391
Litichevskiy, Lev; Peckner, Ryan; Abelin, Jennifer G; Asiedu, Jacob K; Creech, Amanda L; Davis, John F; Davison, Desiree; Dunning, Caitlin M; Egertson, Jarrett D; Egri, Shawn; Gould, Joshua; Ko, Tak; Johnson, Sarah A; Lahr, David L; Lam, Daniel; Liu, Zihan; Lyons, Nicholas J; Lu, Xiaodong; MacLean, Brendan X; Mungenast, Alison E; Officer, Adam; Natoli, Ted E; Papanastasiou, Malvina; Patel, Jinal; Sharma, Vagisha; Toder, Courtney; Tubelli, Andrew A; Young, Jennie Z; Carr, Steven A; Golub, Todd R; Subramanian, Aravind; MacCoss, Michael J; Tsai, Li-Huei; Jaffe, Jacob D
2018-04-25
Although the value of proteomics has been demonstrated, cost and scale are typically prohibitive, and gene expression profiling remains dominant for characterizing cellular responses to perturbations. However, high-throughput sentinel assays provide an opportunity for proteomics to contribute at a meaningful scale. We present a systematic library resource (90 drugs × 6 cell lines) of proteomic signatures that measure changes in the reduced-representation phosphoproteome (P100) and changes in epigenetic marks on histones (GCP). A majority of these drugs elicited reproducible signatures, but notable cell line- and assay-specific differences were observed. Using the "connectivity" framework, we compared signatures across cell types and integrated data across assays, including a transcriptional assay (L1000). Consistent connectivity among cell types revealed cellular responses that transcended lineage, and consistent connectivity among assays revealed unexpected associations between drugs. We further leveraged the resource against public data to formulate hypotheses for treatment of multiple myeloma and acute lymphocytic leukemia. This resource is publicly available at https://clue.io/proteomics. Copyright © 2018 The Author(s). Published by Elsevier Inc. All rights reserved.
Liu, Tingwu; Jiang, Xinwu; Shi, Wuliang; Chen, Juan; Pei, Zhenming; Zheng, Hailei
2011-05-01
Acid rain is a worldwide environmental issue that has seriously destroyed forest ecosystems. As a highly effective and broad-spectrum plant resistance-inducing agent, β-aminobutyric acid could elevate the tolerance of Arabidopsis when subjected to simulated acid rain. Using comparative proteomic strategies, we analyzed 203 significantly varied proteins of which 175 proteins were identified responding to β-aminobutyric acid in the absence and presence of simulated acid rain. They could be divided into ten groups according to their biological functions. Among them, the majority was cell rescue, development and defense-related proteins, followed by transcription, protein synthesis, folding, modification and destination-associated proteins. Our conclusion is β-aminobutyric acid can lead to a large-scale primary metabolism change and simultaneously activate antioxidant system and salicylic acid, jasmonic acid, abscisic acid signaling pathways. In addition, β-aminobutyric acid can reinforce physical barriers to defend simulated acid rain stress. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Bai, Xiaocui; Song, Hao; Lavoie, Michel; Zhu, Kun; Su, Yiyuan; Ye, Hanqi; Chen, Si; Fu, Zhengwei; Qian, Haifeng
2016-01-01
Microalgae biosynthesize large amounts of lipids and show high potential for renewable biodiesel production. However, the production cost of microalgae-derived biodiesel hampers large-scale biodiesel commercialization, and new strategies for increasing lipid production efficiency from algae are urgently needed. Here we subjected the marine alga Phaeodactylum tricornutum to a 4-day dark stress, a condition that increased the total lipid cell quotas 2.3-fold, and studied the cellular mechanisms leading to lipid accumulation using a combination of physiological, proteomic (iTRAQ) and genomic (qRT-PCR) approaches. Our results show that the expression of proteins in the biochemical pathways of glycolysis and the synthesis of fatty acids was induced in the dark, potentially using excess carbon and nitrogen produced from protein breakdown. Treatment of algae in the dark, which increased algal lipid cell quotas at low cost, combined with optimal growth treatment could help optimize biodiesel production. PMID:27147218
Chen, Jin-Qiu; Wakefield, Lalage M; Goldstein, David J
2015-06-06
There is an emerging demand for the use of molecular profiling to facilitate biomarker identification and development, and to stratify patients for more efficient treatment decisions with reduced adverse effects. In the past decade, great strides have been made to advance genomic, transcriptomic and proteomic approaches to address these demands. While there has been much progress with these large scale approaches, profiling at the protein level still faces challenges due to limitations in clinical sample size, poor reproducibility, unreliable quantitation, and lack of assay robustness. A novel automated capillary nano-immunoassay (CNIA) technology has been developed. This technology offers precise and accurate measurement of proteins and their post-translational modifications using either charge-based or size-based separation formats. The system not only uses ultralow nanogram levels of protein but also allows multi-analyte analysis using a parallel single-analyte format for increased sensitivity and specificity. The high sensitivity and excellent reproducibility of this technology make it particularly powerful for analysis of clinical samples. Furthermore, the system can distinguish and detect specific protein post-translational modifications that conventional Western blot and other immunoassays cannot easily capture. This review will summarize and evaluate the latest progress to optimize the CNIA system for comprehensive, quantitative protein and signaling event characterization. It will also discuss how the technology has been successfully applied in both discovery research and clinical studies, for signaling pathway dissection, proteomic biomarker assessment, targeted treatment evaluation and quantitative proteomic analysis. Lastly, a comparison of this novel system with other conventional immuno-assay platforms is performed.
Activity-based protein profiling: from enzyme chemistry to proteomic chemistry.
Cravatt, Benjamin F; Wright, Aaron T; Kozarich, John W
2008-01-01
Genome sequencing projects have provided researchers with a complete inventory of the predicted proteins produced by eukaryotic and prokaryotic organisms. Assignment of functions to these proteins represents one of the principal challenges for the field of proteomics. Activity-based protein profiling (ABPP) has emerged as a powerful chemical proteomic strategy to characterize enzyme function directly in native biological systems on a global scale. Here, we review the basic technology of ABPP, the enzyme classes addressable by this method, and the biological discoveries attributable to its application.
de Jong, Luitzen; de Koning, Edward A; Roseboom, Winfried; Buncherd, Hansuk; Wanner, Martin J; Dapic, Irena; Jansen, Petra J; van Maarseveen, Jan H; Corthals, Garry L; Lewis, Peter J; Hamoen, Leendert W; de Koster, Chris G
2017-07-07
Identification of dynamic protein-protein interactions at the peptide level on a proteomic scale is a challenging approach that is still in its infancy. We have developed a system to cross-link cells directly in culture with the special lysine cross-linker bis(succinimidyl)-3-azidomethyl-glutarate (BAMG). We used the Gram-positive model bacterium Bacillus subtilis as an exemplar system. Within 5 min extensive intracellular cross-linking was detected, while intracellular cross-linking in a Gram-negative species, Escherichia coli, was still undetectable after 30 min, in agreement with the low permeability in this organism for lipophilic compounds like BAMG. We were able to identify 82 unique interprotein cross-linked peptides with <1% false discovery rate by mass spectrometry and genome-wide database searching. Nearly 60% of the interprotein cross-links occur in assemblies involved in transcription and translation. Several of these interactions are new, and we identified a binding site between the δ and β' subunit of RNA polymerase close to the downstream DNA channel, providing a clue into how δ might regulate promoter selectivity and promote RNA polymerase recycling. Our methodology opens new avenues to investigate the functional dynamic organization of complex protein assemblies involved in bacterial growth. Data are available via ProteomeXchange with identifier PXD006287.
Sleddering, Maria A; Markvoort, Albert J; Dharuri, Harish K; Jeyakar, Skhandhan; Snel, Marieke; Juhasz, Peter; Lynch, Moira; Hines, Wade; Li, Xiaohong; Jazet, Ingrid M; Adourian, Aram; Hilbers, Peter A J; Smit, Johannes W A; Van Dijk, Ko Willems
2014-01-01
Very low calorie diets (VLCD) with and without exercise programs lead to major metabolic improvements in obese type 2 diabetes patients. The mechanisms underlying these improvements have so far not been elucidated fully. To further investigate the mechanisms of a VLCD with or without exercise and to uncover possible biomarkers associated with these interventions, blood samples were collected from 27 obese type 2 diabetes patients before and after a 16-week VLCD (Modifast ∼450 kcal/day). Thirteen of these patients followed an exercise program in addition to the VLCD. Plasma was obtained from 27 lean and 27 obese controls as well. Proteomic analysis was performed using mass spectrometry (MS) and targeted multiple reaction monitoring (MRM) and a large scale isobaric tags for relative and absolute quantitation (iTRAQ) approach. After the 16-week VLCD, there was a significant decrease in body weight and HbA1c in all patients, without differences between the two intervention groups. Targeted MRM analysis revealed differences in several proteins, which could be divided into diabetes-associated (fibrinogen, transthyretin), obesity-associated (complement C3), and diet-associated markers (apolipoproteins, especially apolipoprotein A-IV). To further investigate the effects of exercise, large scale iTRAQ analysis was performed. However, no proteins were found showing an exercise effect. Thus, in this study, specific proteins were found to be differentially expressed in type 2 diabetes patients versus controls and before and after a VLCD. These proteins are potential disease state and intervention specific biomarkers. Controlled-Trials.com ISRCTN76920690.
Dharuri, Harish K.; Jeyakar, Skhandhan; Snel, Marieke; Juhasz, Peter; Lynch, Moira; Hines, Wade; Li, Xiaohong; Jazet, Ingrid M.; Adourian, Aram; Hilbers, Peter A. J.; Smit, Johannes W. A.; Van Dijk, Ko Willems
2014-01-01
Very low calorie diets (VLCD) with and without exercise programs lead to major metabolic improvements in obese type 2 diabetes patients. The mechanisms underlying these improvements have so far not been elucidated fully. To further investigate the mechanisms of a VLCD with or without exercise and to uncover possible biomarkers associated with these interventions, blood samples were collected from 27 obese type 2 diabetes patients before and after a 16-week VLCD (Modifast ∼450 kcal/day). Thirteen of these patients followed an exercise program in addition to the VLCD. Plasma was obtained from 27 lean and 27 obese controls as well. Proteomic analysis was performed using mass spectrometry (MS) and targeted multiple reaction monitoring (MRM) and a large scale isobaric tags for relative and absolute quantitation (iTRAQ) approach. After the 16-week VLCD, there was a significant decrease in body weight and HbA1c in all patients, without differences between the two intervention groups. Targeted MRM analysis revealed differences in several proteins, which could be divided into diabetes-associated (fibrinogen, transthyretin), obesity-associated (complement C3), and diet-associated markers (apolipoproteins, especially apolipoprotein A-IV). To further investigate the effects of exercise, large scale iTRAQ analysis was performed. However, no proteins were found showing an exercise effect. Thus, in this study, specific proteins were found to be differentially expressed in type 2 diabetes patients versus controls and before and after a VLCD. These proteins are potential disease state and intervention specific biomarkers. Trial Registration: Controlled-Trials.com ISRCTN76920690. PMID:25415563
Mapping HLA-A2, -A3 and -B7 supertype-restricted T-cell epitopes in the ebolavirus proteome.
Lim, Wan Ching; Khan, Asif M
2018-01-19
Ebolavirus (EBOV) is responsible for one of the most fatal diseases encountered by mankind. Cellular T-cell responses have been implicated to be important in providing protection against the virus. Antigenic variation can result in viral escape from immune recognition. Mapping targets of immune responses among the sequence of viral proteins is, thus, an important first step towards understanding the immune responses to viral variants and can aid in the identification of vaccine targets. Herein, we performed a large-scale, proteome-wide mapping and diversity analyses of putative HLA supertype-restricted T-cell epitopes of Zaire ebolavirus (ZEBOV), the most pathogenic species among the EBOV family. All publicly available ZEBOV sequences (14,098) for each of the nine viral proteins were retrieved, removed of irrelevant and duplicate sequences, and aligned. The overall proteome diversity of the non-redundant sequences was studied by use of Shannon's entropy. The sequences were predicted, by use of the NetCTLpan server, for HLA-A2, -A3, and -B7 supertype-restricted epitopes, which are relevant to African and other ethnicities and provide for large (~86%) population coverage. The predicted epitopes were mapped to the alignment of each protein for analyses of antigenic sequence diversity and relevance to structure and function. The putative epitopes were validated by comparison with experimentally confirmed epitopes. ZEBOV proteome was generally conserved, with an average entropy of 0.16. The 185 HLA supertype-restricted T-cell epitopes predicted (82 (A2), 37 (A3) and 66 (B7)) mapped to 125 alignment positions and covered ~24% of the proteome length. Many of the epitopes showed a propensity to co-localize at select positions of the alignment. Thirty (30) of the mapped positions were completely conserved and may be attractive for vaccine design. The remaining (95) positions had one or more epitopes, with or without non-epitope variants. A significant number (24) of the putative epitopes matched reported experimentally validated HLA ligands/T-cell epitopes of A2, A3 and/or B7 supertype representative allele restrictions. The epitopes generally corresponded to functional motifs/domains and there was no correlation to localization on the protein 3D structure. These data and the epitope map provide important insights into the interaction between EBOV and the host immune system.
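The diversity measure named in the abstract above, Shannon's entropy, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: it assumes entropy is computed per alignment column over single residues, with a gap counted as one more symbol, and the function names and toy sequences are hypothetical. The published analysis may have used a different window (for example, overlapping peptides).

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the symbols observed in one alignment column."""
    counts = Counter(column)
    total = len(column)
    # H = -sum(p * log2(p)) over the observed symbols
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropy(sequences):
    """Per-position entropy for a list of equal-length aligned sequences."""
    length = len(sequences[0])
    return [column_entropy([seq[i] for seq in sequences]) for i in range(length)]


# Toy alignment (illustrative only, not ZEBOV data)
toy = ["MKT", "MRT", "MKT", "MRS"]
per_position = alignment_entropy(toy)
mean_entropy = sum(per_position) / len(per_position)
```

A completely conserved column gives an entropy of 0, so a low average value such as the one reported in the abstract indicates that most positions are dominated by a single residue.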
Riffle, Michael; Eng, Jimmy K.
2010-01-01
The field of proteomics, particularly the application of mass spectrometry analysis to protein samples, is well-established and growing rapidly. Proteomics studies generate large volumes of raw experimental data and inferred biological results. To facilitate the dissemination of these data, centralized data repositories have been developed that make the data and results accessible to proteomics researchers and biologists alike. This review of proteomics data repositories focuses exclusively on freely-available, centralized data resources that disseminate or store experimental mass spectrometry data and results. The resources chosen reflect a current “snapshot” of the state of resources available with an emphasis placed on resources that may be of particular interest to yeast researchers. Resources are described in terms of their intended purpose and the features and functionality provided to users. PMID:19795424
Abdallah, Cosette; Valot, Benoit; Guillier, Christelle; Mounier, Arnaud; Balliau, Thierry; Zivy, Michel; van Tuinen, Diederik; Renaut, Jenny; Wipf, Daniel; Dumas-Gaudot, Eliane; Recorbet, Ghislaine
2014-08-28
Arbuscular mycorrhizal (AM) symbiosis that associates roots of most land plants with soil-borne fungi (Glomeromycota), is characterized by reciprocal nutritional benefits. Fungal colonization of plant roots induces massive changes in cortical cells where the fungus differentiates an arbuscule, which drives proliferation of the plasma membrane. Despite the recognized importance of membrane proteins in sustaining AM symbiosis, the root microsomal proteome elicited upon mycorrhiza still remains to be explored. In this study, we first examined the qualitative composition of the root membrane proteome of Medicago truncatula after microsome enrichment and subsequent in depth analysis by GeLC-MS/MS. The results obtained highlighted the identification of 1226 root membrane protein candidates whose cellular and functional classifications predispose plastids and protein synthesis as prevalent organelle and function, respectively. Changes at the protein abundance level between the membrane proteomes of mycorrhizal and nonmycorrhizal roots were further monitored by spectral counting, which retrieved a total of 96 proteins that displayed a differential accumulation upon AM symbiosis. Besides the canonical markers of the periarbuscular membrane, new candidates supporting the importance of membrane trafficking events during mycorrhiza establishment/functioning were identified, including flotillin-like proteins. The data have been deposited to the ProteomeXchange with identifier PXD000875. During arbuscular mycorrhizal symbiosis, one of the most widespread mutualistic associations in nature, the endomembrane system of plant roots is believed to undergo qualitative and quantitative changes in order to sustain both the accommodation process of the AM fungus within cortical cells and the exchange of nutrients between symbionts. Large-scale GeLC-MS/MS proteomic analysis of the membrane fractions from mycorrhizal and nonmycorrhizal roots of M. truncatula coupled to spectral counting retrieved around one hundred proteins that displayed changes in abundance upon mycorrhizal establishment. The symbiosis-related membrane proteins that were identified mostly function in signaling/membrane trafficking and nutrient uptake regulation. Besides extending the coverage of the root membrane proteome of M. truncatula, new candidates involved in the symbiotic program emerged from the current study, which pointed out a dynamic reorganization of microsomal proteins during the accommodation of AM fungi within cortical cells. Copyright © 2014 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, Laurence; Yurkovich, James T.; Lloyd, Colton J.
Integrating omics data to refine or make context-specific models is an active field of constraint-based modeling. Proteomics data now cover over 95% of the Escherichia coli proteome by mass. Genome-scale models of Metabolism and macromolecular Expression (ME) compute proteome allocation linked to metabolism and fitness. Using proteomics data, we formulated allocation constraints for key proteome sectors in the ME model. The resulting calibrated model effectively computed the “generalist” (wild-type) E. coli proteome and phenotype across diverse growth environments. Across 15 growth conditions, prediction errors for growth rate and metabolic fluxes were 69% and 14% lower, respectively. The sector-constrained ME model thus represents a generalist ME model reflecting both growth rate maximization and “hedging” against uncertain environments and stresses, as indicated by significant enrichment of these sectors for the general stress response sigma factor σS. Finally, the sector constraints represent a general formalism for integrating omics data from any experimental condition into constraint-based ME models. The constraints can be fine-grained (individual proteins) or coarse-grained (functionally-related protein groups) as demonstrated here. Furthermore, this flexible formalism provides an accessible approach for narrowing the gap between the complexity captured by omics data and governing principles of proteome allocation described by systems-level models.
Ivarsson, Ylva; Arnold, Roland; McLaughlin, Megan; Nim, Satra; Joshi, Rakesh; Ray, Debashish; Liu, Bernard; Teyra, Joan; Pawson, Tony; Moffat, Jason; Li, Shawn Shun-Cheng; Sidhu, Sachdev S; Kim, Philip M
2014-02-18
The human proteome contains a plethora of short linear motifs (SLiMs) that serve as binding interfaces for modular protein domains. Such interactions are crucial for signaling and other cellular processes, but are difficult to detect because of their low to moderate affinities. Here we developed a dedicated approach, proteomic peptide-phage display (ProP-PD), to identify domain-SLiM interactions. Specifically, we generated phage libraries containing all human and viral C-terminal peptides using custom oligonucleotide microarrays. With these libraries we screened the nine PSD-95/Dlg/ZO-1 (PDZ) domains of human Densin-180, Erbin, Scribble, and Disks large homolog 1 for peptide ligands. We identified several known and putative interactions potentially relevant to cellular signaling pathways and confirmed interactions between full-length Scribble and the target proteins β-PIX, plakophilin-4, and guanylate cyclase soluble subunit α-2 using colocalization and coimmunoprecipitation experiments. The affinities of recombinant Scribble PDZ domains and the synthetic peptides representing the C termini of these proteins were in the 1- to 40-μM range. Furthermore, we identified several well-established host-virus protein-protein interactions, and confirmed that PDZ domains of Scribble interact with the C terminus of Tax-1 of human T-cell leukemia virus with micromolar affinity. Previously unknown putative viral protein ligands for the PDZ domains of Scribble and Erbin were also identified. Thus, we demonstrate that our ProP-PD libraries are useful tools for probing PDZ domain interactions. The method can be extended to interrogate all potential eukaryotic, bacterial, and viral SLiMs and we suggest it will be a highly valuable approach for studying cellular and pathogen-host protein-protein interactions.
Yang, Xiaoli; Li, Hongtao; Zhang, Chengdong; Lin, Zhidi; Zhang, Xinhua; Zhang, Youjie; Yu, Yanbao; Liu, Kun; Li, Muyan; Zhang, Yuening; Lv, Wenxin; Xie, Yuanliang; Lu, Zheng; Wu, Chunlei; Teng, Ruobing; Lu, Shaoming; He, Min; Mo, Zengnan
2015-10-01
Prostatitis is one of the most common urological problems afflicting adult men. The etiology and pathogenesis of nonbacterial prostatitis, which accounts for 90-95% of cases, is largely unknown. As serum proteins often indicate the overall pathologic status of patients, we hypothesized that protein biomarkers of prostatitis might be identified by comparing the serum proteomes of patients with and without nonbacterial prostatitis. All untreated samples were collected from subjects attending the Fangchenggang Area Male Health and Examination Survey (FAMHES). We profiled pooled serum samples from four carefully selected groups of patients (n = 10/group) representing the various categories of nonbacterial prostatitis (IIIa, IIIb, and IV) and matched healthy controls using a mass spectrometry-based 4-plex iTRAQ proteomic approach. More than 160 samples were validated by ELISA. Overall, 69 proteins were identified. Among them, 42, 52, and 37 proteins were identified with differential expression in Category IIIa, IIIb, and IV prostatitis, respectively. The 19 common proteins were related to immunity and defense, ion binding, transport, and proteolysis. Two zinc-binding proteins, superoxide dismutase 3 (SOD3), and carbonic anhydrase I (CA1), were significantly higher in all types of prostatitis than in the control. A receiver operating characteristic curve estimated sensitivities of 50.4 and 68.1% and specificities of 92.1 and 83.8% for CA1 and SOD3, respectively, in detecting nonbacterial prostatitis. The serum CA1 concentration was inversely correlated to the zinc concentration in expressed-prostatic secretions. Our findings suggest that SOD3 and CA1 are potential diagnostic markers of nonbacterial prostatitis, although further large-scale studies are required. The molecular profiles of nonbacterial prostatitis pathogenesis may lay a foundation for discovery of new therapies. © 2015 Wiley Periodicals, Inc.
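The sensitivity and specificity figures quoted in the abstract above come from a receiver operating characteristic analysis. The following minimal Python sketch shows how one such pair is obtained at a single cutoff. It assumes that a higher serum concentration indicates disease, which matches the direction reported for CA1 and SOD3; the function name, the cutoff, and the numbers are hypothetical and are not data from the study.

```python
def sensitivity_specificity(values, is_case, threshold):
    """Return (sensitivity, specificity) when values >= threshold are called positive."""
    tp = fn = tn = fp = 0
    for value, case in zip(values, is_case):
        positive = value >= threshold
        if case and positive:
            tp += 1          # case correctly flagged
        elif case:
            fn += 1          # case missed
        elif positive:
            fp += 1          # control wrongly flagged
        else:
            tn += 1          # control correctly cleared
    return tp / (tp + fn), tn / (tn + fp)


# Illustrative marker concentrations (arbitrary units), not study data
concentrations = [1.0, 2.0, 3.0, 4.0]
labels = [False, False, True, True]   # True = patient, False = control
sens, spec = sensitivity_specificity(concentrations, labels, threshold=4.0)
```

Sweeping the threshold over all observed values and plotting sensitivity against 1 minus specificity traces out the ROC curve; a stricter cutoff usually trades sensitivity for specificity.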
Expert system for computer-assisted annotation of MS/MS spectra.
Neuhauser, Nadin; Michalski, Annette; Cox, Jürgen; Mann, Matthias
2012-11-01
An important step in mass spectrometry (MS)-based proteomics is the identification of peptides by their fragment spectra. Regardless of the identification score achieved, almost all tandem-MS (MS/MS) spectra contain remaining peaks that are not assigned by the search engine. These peaks may be explainable by human experts but the scale of modern proteomics experiments makes this impractical. In computer science, Expert Systems are a mature technology to implement a list of rules generated by interviews with practitioners. We here develop such an Expert System, making use of literature knowledge as well as a large body of high mass accuracy and pure fragmentation spectra. Interestingly, we find that even with high mass accuracy data, rule sets can quickly become too complex, leading to over-annotation. Therefore we establish a rigorous false discovery rate, calculated by random insertion of peaks from a large collection of other MS/MS spectra, and use it to develop an optimized knowledge base. This rule set correctly annotates almost all peaks of medium or high abundance. For high resolution HCD data, median intensity coverage of fragment peaks in MS/MS spectra increases from 58% by search engine annotation alone to 86%. The resulting annotation performance surpasses a human expert, especially on complex spectra such as those of larger phosphorylated peptides. Our system is also applicable to high resolution collision-induced dissociation data. It is available both as a part of MaxQuant and via a webserver that only requires an MS/MS spectrum and the corresponding peptide sequence, and which outputs publication quality, annotated MS/MS spectra (www.biochem.mpg.de/mann/tools/). It provides expert knowledge to beginners in the field of MS-based proteomics and helps advanced users to focus on unusual and possibly novel types of fragment ions.
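The "median intensity coverage" reported in the abstract above can be made concrete with a small Python sketch. It rests on an assumption about the definition: coverage of one spectrum is taken as the summed intensity of annotated peaks divided by the summed intensity of all peaks, and the median is then taken across spectra. The data layout, function names, and values are hypothetical and do not reflect the MaxQuant implementation.

```python
from statistics import median


def intensity_coverage(peaks):
    """Fraction of total peak intensity that carries an annotation.

    `peaks` is a list of (intensity, is_annotated) pairs for one spectrum.
    """
    total = sum(intensity for intensity, _ in peaks)
    annotated = sum(intensity for intensity, flag in peaks if flag)
    return annotated / total


def median_intensity_coverage(spectra):
    """Median of the per-spectrum intensity coverage over a collection of spectra."""
    return median(intensity_coverage(peaks) for peaks in spectra)


# Three toy spectra (illustrative intensities only)
spectra = [
    [(100.0, True), (100.0, False)],   # coverage 0.5
    [(50.0, True), (50.0, True)],      # coverage 1.0
    [(10.0, False), (30.0, True)],     # coverage 0.75
]
result = median_intensity_coverage(spectra)
```

Weighting by intensity rather than counting peaks means that a few unexplained low-abundance peaks reduce the coverage only slightly.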
Expert System for Computer-assisted Annotation of MS/MS Spectra
Neuhauser, Nadin; Michalski, Annette; Cox, Jürgen; Mann, Matthias
2012-01-01
An important step in mass spectrometry (MS)-based proteomics is the identification of peptides by their fragment spectra. Regardless of the identification score achieved, almost all tandem-MS (MS/MS) spectra contain remaining peaks that are not assigned by the search engine. These peaks may be explainable by human experts but the scale of modern proteomics experiments makes this impractical. In computer science, Expert Systems are a mature technology to implement a list of rules generated by interviews with practitioners. We here develop such an Expert System, making use of literature knowledge as well as a large body of high mass accuracy and pure fragmentation spectra. Interestingly, we find that even with high mass accuracy data, rule sets can quickly become too complex, leading to over-annotation. Therefore we establish a rigorous false discovery rate, calculated by random insertion of peaks from a large collection of other MS/MS spectra, and use it to develop an optimized knowledge base. This rule set correctly annotates almost all peaks of medium or high abundance. For high resolution HCD data, median intensity coverage of fragment peaks in MS/MS spectra increases from 58% by search engine annotation alone to 86%. The resulting annotation performance surpasses a human expert, especially on complex spectra such as those of larger phosphorylated peptides. Our system is also applicable to high resolution collision-induced dissociation data. It is available both as a part of MaxQuant and via a webserver that only requires an MS/MS spectrum and the corresponding peptide sequence, and which outputs publication quality, annotated MS/MS spectra (www.biochem.mpg.de/mann/tools/). It provides expert knowledge to beginners in the field of MS-based proteomics and helps advanced users to focus on unusual and possibly novel types of fragment ions. PMID:22888147
Woo, Sunghee; Cha, Seong Won; Na, Seungjin; ...
2014-11-17
Cancer is driven by the acquisition of somatic DNA lesions. Distinguishing the early driver mutations from subsequent passenger mutations is key to molecular sub-typing of cancers, and the discovery of novel biomarkers. The availability of genomics technologies (mainly whole-genome and exome sequencing, and transcript sampling via RNA-seq, collectively referred to as NGS) has fueled recent studies on somatic mutation discovery. However, the vision is challenged by the complexity, redundancy, and errors in genomic data, and the difficulty of investigating the proteome using only genomic approaches. Recently, combinations of proteomic and genomic technologies have been increasingly employed. However, the complexity and redundancy of NGS data remain a challenge for proteogenomics, and various trade-offs must be made to allow for the searches to take place. This paper provides a discussion of two such trade-offs, relating to large database search and FDR calculations, and their implications for cancer proteogenomics. Moreover, it extends and develops the idea of a unified genomic variant database that can be searched by any mass spectrometry sample. A total of 879 BAM files downloaded from TCGA repository were used to create a 4.34 GB unified FASTA database which contained 2,787,062 novel splice junctions, 38,464 deletions, 1105 insertions, and 182,302 substitutions. Proteomic data from a single ovarian carcinoma sample (439,858 spectra) was searched against the database. By applying the most conservative FDR measure, we have identified 524 novel peptides and 65,578 known peptides at 1% FDR threshold. The novel peptides include interesting examples of doubly mutated peptides, frame-shifts, and non-sample-recruited mutations, which emphasize the strength of our approach.
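The abstract above reports peptide identifications at a 1% FDR threshold without saying how that FDR was estimated. As a hedged illustration only, the Python sketch below uses the common target-decoy estimate, FDR ≈ decoy hits / target hits at or above a score cutoff; this is an assumption and may differ from the "most conservative" measure the authors applied. The function names, scores, and data layout are hypothetical.

```python
def fdr_at_threshold(matches, threshold):
    """Estimate FDR as decoys / targets among matches scoring >= threshold.

    `matches` is a list of (score, is_decoy) pairs, one per peptide-spectrum match.
    """
    targets = sum(1 for score, is_decoy in matches if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


def lowest_threshold_for_fdr(matches, max_fdr=0.01):
    """Lowest observed score cutoff whose estimated FDR stays within max_fdr.

    Returns None if no cutoff qualifies. Every cutoff is checked because the
    estimate is not guaranteed to change monotonically with the cutoff.
    """
    best = None
    for cutoff in sorted({score for score, _ in matches}, reverse=True):
        if fdr_at_threshold(matches, cutoff) <= max_fdr:
            best = cutoff
    return best


# Toy matches (illustrative scores only): one decoy among four hits
matches = [(10.0, False), (9.0, False), (8.0, True), (7.0, False)]
cutoff = lowest_threshold_for_fdr(matches, max_fdr=0.01)
```

In proteogenomic searches the novel and known peptides are often thresholded separately, because the much larger variant search space yields more random high-scoring matches among the novel candidates.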
Analysis of the pumpkin phloem proteome provides insights into angiosperm sieve tube function.
Lin, Ming-Kuem; Lee, Young-Jin; Lough, Tony J; Phinney, Brett S; Lucas, William J
2009-02-01
Increasing evidence suggests that proteins present in the angiosperm sieve tube system play an important role in the long distance signaling system of plants. To identify the nature of these putatively non-cell-autonomous proteins, we adopted a large scale proteomics approach to analyze pumpkin phloem exudates. Phloem proteins were fractionated by fast protein liquid chromatography using both anion and cation exchange columns and then either in-solution or in-gel digested following further separation by SDS-PAGE. A total of 345 LC-MS/MS data sets were analyzed using a combination of Mascot and X!Tandem against the NCBI non-redundant green plant database and an extensive Cucurbita maxima expressed sequence tag database. In this analysis, 1,209 different consensi were obtained of which 1,121 could be annotated from GenBank and BLAST search analyses against three plant species, Arabidopsis thaliana, rice (Oryza sativa), and poplar (Populus trichocarpa). Gene ontology (GO) enrichment analyses identified sets of phloem proteins that function in RNA binding, mRNA translation, ubiquitin-mediated proteolysis, and macromolecular and vesicle trafficking. Our findings indicate that protein synthesis and turnover, processes that were thought to be absent in enucleate sieve elements, likely occur within the angiosperm phloem translocation stream. In addition, our GO analysis identified a set of phloem proteins that are associated with the GO term "embryonic development ending in seed dormancy"; this finding raises the intriguing question as to whether the phloem may exert some level of control over seed development. The universal significance of the phloem proteome was highlighted by conservation of the phloem proteome in species as diverse as monocots (rice), eudicots (Arabidopsis and pumpkin), and trees (poplar). These results are discussed from the perspective of the role played by the phloem proteome as an integral component of the whole plant communication system.
Wimmer, Helge; Gundacker, Nina C; Griss, Johannes; Haudek, Verena J; Stättner, Stefan; Mohr, Thomas; Zwickl, Hannes; Paulitschke, Verena; Baron, David M; Trittner, Wolfgang; Kubicek, Markus; Bayer, Editha; Slany, Astrid; Gerner, Christopher
2009-06-01
Interpretation of proteome data with a focus on biomarker discovery largely relies on comparative proteome analyses. Here, we introduce a database-assisted interpretation strategy based on proteome profiles of primary cells. Both 2-D-PAGE and shotgun proteomics are applied. We obtain high data concordance with these two different techniques. When applying mass analysis of tryptic spot digests from 2-D gels of cytoplasmic fractions, we typically identify several hundred proteins. Using the same protein fractions, we usually identify more than a thousand proteins by shotgun proteomics. The data consistency obtained when comparing these independent data sets exceeds 99% of the proteins identified in the 2-D gels. Many characteristic differences in protein expression of different cells can thus be independently confirmed. Our self-designed SQL database (CPL/MUW - database of the Clinical Proteomics Laboratories at the Medical University of Vienna accessible via www.meduniwien.ac.at/proteomics/database) facilitates (i) quality management of protein identification data, which are based on MS, (ii) the detection of cell type-specific proteins and (iii) the detection of molecular signatures of specific functional cell states. Here, we demonstrate how the interpretation of proteome profiles obtained from human liver tissue and hepatocellular carcinoma tissue is assisted by the CPL/MUW database. Therefore, we suggest that the use of reference experiments supported by a tailored database may substantially facilitate data interpretation of proteome profiling experiments.
A scalable strategy for high-throughput GFP tagging of endogenous human proteins.
Leonetti, Manuel D; Sekine, Sayaka; Kamiyama, Daichi; Weissman, Jonathan S; Huang, Bo
2016-06-21
A central challenge of the postgenomic era is to comprehensively characterize the cellular role of the ∼20,000 proteins encoded in the human genome. To systematically study protein function in a native cellular background, libraries of human cell lines expressing proteins tagged with a functional sequence at their endogenous loci would be very valuable. Here, using electroporation of Cas9 nuclease/single-guide RNA ribonucleoproteins and taking advantage of a split-GFP system, we describe a scalable method for the robust, scarless, and specific tagging of endogenous human genes with GFP. Our approach requires no molecular cloning and allows a large number of cell lines to be processed in parallel. We demonstrate the scalability of our method by targeting 48 human genes and show that the resulting GFP fluorescence correlates with protein expression levels. We next present how our protocols can be easily adapted for the tagging of a given target with GFP repeats, critically enabling the study of low-abundance proteins. Finally, we show that our GFP tagging approach allows the biochemical isolation of native protein complexes for proteomic studies. Taken together, our results pave the way for the large-scale generation of endogenously tagged human cell lines for the proteome-wide analysis of protein localization and interaction networks in a native cellular context.
New Markers for Predicting Fertility of the Male Gametes in the Post Genomic Age.
Dipresa, Savina; De Toni, Luca; Foresta, Carlo; Garolla, Andrea
2018-04-18
A number of tests have been proposed to assess male fertility potential, ranging from routine light-microscopic evaluation of semen samples to screening tests for DNA integrity that look for sperm chromatin abnormalities. Spermatozoa are extremely differentiated cells; they have critical functions for embryo development and heredity, in addition to delivering a haploid paternal genome to the oocyte. Towards this goal certain requirements must always be met. The ability of spermatozoa to perform their reproductive function is acquired during spermatogenesis, a highly specialized process that depends on multiple factors affecting male fertility. In the past 30 years, large-scale analyses of transcriptomic and genome expression in mammals have generated a large amount of information on the numerous biomolecules involved in spermatogenesis and male germ cell reproductive function. The sperm proteome represents the protein content that spermatozoa need to survive and work correctly, and modifications of the sperm proteome play a role in determining functional changes that lead to a decrease in the reproductive competence of affected spermatozoa. The post-genomic approach comprises different methodologies that combine testicular transcriptome studies, protein compositional analysis and metabolomic findings on human spermatozoa. Copyright © Bentham Science Publishers.
Quantitative proteomics in Giardia duodenalis-Achievements and challenges.
Emery, Samantha J; Lacey, Ernest; Haynes, Paul A
2016-08-01
Giardia duodenalis (syn. G. lamblia and G. intestinalis) is a protozoan parasite of vertebrates and a major contributor to the global burden of diarrheal diseases and gastroenteritis. The publication of multiple genome sequences in the G. duodenalis species complex has provided important insights into parasite biology, and made post-genomic technologies, including proteomics, significantly more accessible. The aims of proteomics are to identify and quantify proteins present in a cell, and assign functions to them within the context of dynamic biological systems. In Giardia, proteomics in the post-genomic era has transitioned from reliance on gel-based systems to utilisation of a diverse array of techniques based on bottom-up LC-MS/MS technologies. Together, these have generated crucial foundations for subcellular proteomes, elucidated intra- and inter-assemblage isolate variation, and identified pathways and markers in differentiation, host-parasite interactions and drug resistance. However, in Giardia, proteomics remains an emerging field, with considerable shortcomings evident from the published research. These include a bias towards assemblage A, a lack of emphasis on quantitative analytical techniques, and limited information on post-translational protein modifications. Additionally, there are multiple areas of research for which proteomic data is not available to add value to published transcriptomic data. The challenge of amalgamating data in the systems biology paradigm necessitates the further generation of large, high-quality quantitative datasets to accurately model parasite biology. This review surveys the current proteomic research available for Giardia and evaluates their technical and quantitative approaches, while contextualising their biological insights into parasite pathology, isolate variation and eukaryotic evolution. 
Finally, we propose areas of priority for the generation of future proteomic data to explore fundamental questions in Giardia, including the analysis of post-translational modifications, and the design of MS-based assays for validation of differentially expressed proteins in large datasets. Copyright © 2016 Elsevier B.V. All rights reserved.
A Unique Model Platform for C4 Plant Systems and Synthetic Biology
2015-12-10
International Conference in Bioinformatics, Sydney, Australia, July 31 - August 2, 2014. Nielsen LK (2015) Genome scale metabolic and regulatory...the comparison of transcriptome proteome and central metabolome in mature and immature tissue. Preliminary data were obtained suggesting successful...guide the comparison of transcriptome, proteome and central metabolome in mature and immature tissue. Preliminary data were obtained suggesting
Workflow based framework for life science informatics.
Tiwari, Abhishek; Sekhar, Arvind K T
2007-10-01
Workflow technology is a generic mechanism to integrate diverse types of available resources (databases, servers, software applications and different services) which facilitate knowledge exchange within traditionally divergent fields such as molecular biology, clinical research, computational science, physics, chemistry and statistics. Researchers can easily incorporate and access diverse, distributed tools and data to develop their own research protocols for scientific analysis. Application of workflow technology has been reported in areas like drug discovery, genomics, large-scale gene expression analysis, proteomics, and system biology. In this article, we have discussed the existing workflow systems and the trends in applications of workflow based systems.
Gautam, Vibhav; Sarkar, Ananda K
2015-04-01
Laser assisted microdissection (LAM) is an advanced technology used to perform tissue or cell-specific expression profiling of genes and proteins, owing to its ability to isolate the desired tissue or cell type from a heterogeneous population. Due to the specificity and high efficiency acquired during its pioneering use in medical science, the LAM technique has quickly been adopted in many areas of biological research. Today, it has become a potent tool to address a wide range of questions in diverse fields of plant biology. Beginning with comparative transcriptome analysis of different tissues such as reproductive parts, meristems, lateral organs, roots etc., LAM has also been extensively used in plant-pathogen interaction studies, proteomics, and metabolomics. In combination with next generation sequencing and proteomics analysis, LAM has opened up promising opportunities in the area of large scale functional studies in plants. Ever since the advent of this technique, significant improvements have been achieved in terms of its instrumentation and method, which have made LAM a more efficient tool applicable in wider research areas. Here, we discuss the advancement of the LAM technique with special emphasis on its methodology and highlight its scope in modern research areas of plant biology. Although we put emphasis on the use of LAM in transcriptome studies, where it is mostly used, we also discuss its recent application and scope in proteome and metabolome studies.
Rikkerink, Erik H A
2018-03-08
Organisms face stress from multiple sources simultaneously and require mechanisms to respond to these scenarios if they are to survive in the long term. This overview focuses on a series of key points that illustrate how disorder and post-translational changes can combine to play a critical role in orchestrating the response of organisms to the stress of a changing environment. Increasingly, protein complexes are thought of as dynamic multi-component molecular machines able to adapt through compositional, conformational and/or post-translational modifications to control their largely metabolic outputs. These metabolites then feed into cellular physiological homeostasis or the production of secondary metabolites with novel anti-microbial properties. The control of adaptations to stress operates at multiple levels including the proteome and the dynamic nature of proteomic changes suggests a parallel with the equally dynamic epigenetic changes at the level of nucleic acids. Given their properties, I propose that some disordered protein platforms specifically enable organisms to sense and react rapidly as the first line of response to change. Using examples from the highly dynamic host-pathogen and host-stress response, I illustrate by example how disordered proteins are key to fulfilling the need for multiple levels of integration of response at different time scales to create robust control points.
de Santana Costa, Marília Gabriela; Mazzafera, Paulo; Balbuena, Tiago Santana
2017-05-01
Eucalyptus grandis and Eucalyptus globulus are among the most widely cultivated trees, differing in lignin composition and plantation areas, as E. grandis is mostly cultivated in tropical regions while E. globulus is preferred in temperate areas. As temperature is a key modulator in plant metabolism, a large-scale proteome analysis was carried out to investigate changes in the antioxidant system and the lignification metabolism in plantlets grown at different temperatures. Our strategy allowed the identification of 3111 stem proteins. A total of 103 antioxidant proteins were detected in the stems of both species. Hierarchical clustering revealed that alterations in the antioxidant proteins are more prominent when Eucalyptus seedlings were exposed to high temperature and that the superoxide isoforms coded by the gene Eucgr.B03930 are the most abundant antioxidant enzymes induced by thermal stimulus. Regarding the lignin biosynthesis, our proteomics approach resulted in the identification of 13 of the 17 core proteins involved in this metabolism, corroborating with gene predictions and the proposed lignin toolbox. Quantitative analyses revealed significant differences in 8 protein isoforms, including the ferulate 5-hydroxylase isoform F5H1, a key enzyme in catalyzing the synthesis of sinapyl alcohol, and the cinnamyl alcohol dehydrogenase isoform CAD2, the last enzyme in monolignol biosynthesis. Data are available via ProteomeXchange with identifier PXD005743. Copyright © 2017 Elsevier Ltd. All rights reserved.
Okada, Hirokazu; Ebhardt, H Alexander; Vonesch, Sibylle Chantal; Aebersold, Ruedi; Hafen, Ernst
2016-09-01
The manner by which genetic diversity within a population generates individual phenotypes is a fundamental question of biology. To advance the understanding of the genotype-phenotype relationships towards the level of biochemical processes, we perform a proteome-wide association study (PWAS) of a complex quantitative phenotype. We quantify the variation of wing imaginal disc proteomes in Drosophila genetic reference panel (DGRP) lines using SWATH mass spectrometry. In spite of the very large genetic variation (1/36 bp) between the lines, proteome variability is surprisingly small, indicating strong molecular resilience of protein expression patterns. Proteins associated with adult wing size form tight co-variation clusters that are enriched in fundamental biochemical processes. Wing size correlates with some basic metabolic functions, positively with glucose metabolism but negatively with mitochondrial respiration and not with ribosome biogenesis. Our study highlights the power of PWAS to filter functional variants from the large genetic variability in natural populations.
NASA Astrophysics Data System (ADS)
Li, Huilin; Nguyen, Hong Hanh; Ogorzalek Loo, Rachel R.; Campuzano, Iain D. G.; Loo, Joseph A.
2018-02-01
Mass spectrometry (MS) has become a crucial technique for the analysis of protein complexes. Native MS has traditionally examined protein subunit arrangements, while proteomics MS has focused on sequence identification. These two techniques are usually performed separately without taking advantage of the synergies between them. Here we describe the development of an integrated native MS and top-down proteomics method using Fourier-transform ion cyclotron resonance (FTICR) to analyse macromolecular protein complexes in a single experiment. We address previous concerns of employing FTICR MS to measure large macromolecular complexes by demonstrating the detection of complexes up to 1.8 MDa, and we demonstrate the efficacy of this technique for direct acquirement of sequence to higher-order structural information with several large complexes. We then summarize the unique functionalities of different activation/dissociation techniques. The platform expands the ability of MS to integrate proteomics and structural biology to provide insights into protein structure, function and regulation.
Infrared Multiphoton Dissociation for Quantitative Shotgun Proteomics
Ledvina, Aaron R.; Lee, M. Violet; McAlister, Graeme C.; Westphall, Michael S.; Coon, Joshua J.
2012-01-01
We modified a dual-cell linear ion trap mass spectrometer to perform infrared multiphoton dissociation (IRMPD) in the low pressure trap of a dual-cell quadrupole linear ion trap (dual cell QLT) and perform large-scale IRMPD analyses of complex peptide mixtures. Upon optimization of activation parameters (precursor q-value, irradiation time, and photon flux), IRMPD subtly, but significantly outperforms resonant excitation CAD for peptides identified at a 1% false-discovery rate (FDR) from a yeast tryptic digest (95% confidence, p = 0.019). We further demonstrate that IRMPD is compatible with the analysis of isobaric-tagged peptides. Using fixed QLT RF amplitude allows for the consistent retention of reporter ions, but necessitates the use of variable IRMPD irradiation times, dependent upon precursor mass-to-charge (m/z). We show that IRMPD activation parameters can be tuned to allow for effective peptide identification and quantitation simultaneously. We thus conclude that IRMPD performed in a dual-cell ion trap is an effective option for the large-scale analysis of both unmodified and isobaric-tagged peptides. PMID:22480380
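The comparison above counts peptides accepted at a 1% false-discovery rate. The abstract does not say how the FDR was estimated, so the sketch below assumes the widely used target-decoy approach, in which spectra are searched against real (target) and reversed or shuffled (decoy) sequences and the FDR at a score cutoff is estimated as decoys / targets. The function name and the input layout are hypothetical, and tied scores are not treated specially.

```python
def filter_at_fdr(psms, max_fdr=0.01):
    """Keep target matches down to the lowest-ranked cutoff whose estimated FDR is <= max_fdr.

    psms: iterable of (score, is_decoy) pairs, where a higher score means a better match.
    The FDR at each rank is estimated as (decoys so far) / (targets so far).
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    best_cut = 0  # number of top-ranked matches to keep
    for rank, (_score, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest rank at which the estimate is still acceptable.
        if targets and decoys / targets <= max_fdr:
            best_cut = rank
    # Decoys above the cutoff are dropped from the reported list.
    return [p for p in ranked[:best_cut] if not p[1]]
```

Running the same filter on the IRMPD and CAD result sets and comparing the lengths of the returned lists would give the kind of head-to-head count described in the abstract, under the stated assumption about how FDR is estimated.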
Kedia, Komal; Nichols, Caitlin A; Thulin, Craig D; Graves, Steven W
2015-11-01
Tissue proteomics has relied heavily on two-dimensional gel electrophoresis for protein separation and quantification, followed by single protein isolation, trypsin digestion, and mass spectrometric protein identification. Such methods are predominantly used for study of high-abundance, full-length proteins. Tissue peptidomics has recently been developed but is still used to study the most highly abundant species, often resulting in observation and identification of only dozens of peptides. Tissue lipidomics is likewise new, and reported studies are limited. We have developed an "omics" approach that enables over 7,000 low-molecular-weight, low-abundance species to be surveyed and have applied this to human placental tissue. Because the placenta is believed to be involved in complications of pregnancy, its proteomic evaluation is of substantial interest. In previous research on the placental proteome, abundant, high-molecular-weight proteins have been studied. Application of large-scale, global proteomics or peptidomics to the placenta has been limited, and would be challenging owing to the anatomic complexity and broad concentration range of proteins in this tissue. In our approach, involving protein depletion, capillary liquid chromatography, and tandem mass spectrometry, we attempted to identify molecular differences between two regions of the same placenta with only slightly different cellular composition. Our analysis revealed 16 species with statistically significant differences between the two regions. Tandem mass spectrometry enabled successful sequencing, or otherwise enabled chemical characterization, of twelve of these. The successful discovery and identification of regional differences in the expression of low-abundance, low-molecular-weight biomolecules reveals the potential of our approach.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shi, Tujin; Fillmore, Thomas L.; Gao, Yuqian
2013-10-01
Long-gradient separations coupled to tandem MS were recently demonstrated to provide a deep proteome coverage for global proteomics; however, such long-gradient separations have not been explored for targeted proteomics. Herein, we investigate the potential performance of the long-gradient separations coupled with selected reaction monitoring (LG-SRM) for targeted protein quantification. Direct comparison of LG-SRM (5 h gradient) and conventional LC-SRM (45 min gradient) showed that the long-gradient separations significantly reduced background interference levels and provided an 8- to 100-fold improvement in LOQ for target proteins in human female serum. Based on at least one surrogate peptide per protein, an LOQ of 10 ng/mL was achieved for the two spiked proteins in non-depleted human serum. The LG-SRM detection of seven out of eight endogenous plasma proteins expressed at ng/mL or sub-ng/mL levels in clinical patient sera was also demonstrated. A correlation coefficient of >0.99 was observed for the results of LG-SRM and ELISA measurements for prostate-specific antigen (PSA) in selected patient sera. Further enhancement of LG-SRM sensitivity was achieved by applying front-end IgY14 immunoaffinity depletion. Besides improved sensitivity, LG-SRM offers at least 3 times higher multiplexing capacity than conventional LC-SRM due to ~3-fold increase in average peak widths for a 300-min gradient compared to a 45-min gradient. Therefore, LG-SRM holds great potential for bridging the gap between global and targeted proteomics due to its advantages in both sensitivity and multiplexing capacity.
Top-down Proteomics in Health and Disease: Challenges and Opportunities
Gregorich, Zachery R.; Ge, Ying
2014-01-01
Proteomics is essential for deciphering how molecules interact as a system and for understanding the functions of cellular systems in human disease; however, the unique characteristics of the human proteome, which include a high dynamic range of protein expression and extreme complexity due to a plethora of post-translational modifications (PTMs) and sequence variations, make such analyses challenging. An emerging “top-down” mass spectrometry (MS)-based proteomics approach, which provides a “bird’s eye” view of all proteoforms, has unique advantages for the assessment of PTMs and sequence variations. Recently, a number of studies have showcased the potential of top-down proteomics for unraveling of disease mechanisms and discovery of new biomarkers. Nevertheless, the top-down approach still faces significant challenges in terms of protein solubility, separation, and the detection of large intact proteins, as well as the under-developed data analysis tools. Consequently, new technological developments are urgently needed to advance the field of top-down proteomics. Herein, we intend to provide an overview of the recent applications of top-down proteomics in biomedical research. Moreover, we will outline the challenges and opportunities facing top-down proteomics strategies aimed at understanding and diagnosing human diseases. PMID:24723472
Helsens, Kenny; Colaert, Niklaas; Barsnes, Harald; Muth, Thilo; Flikka, Kristian; Staes, An; Timmerman, Evy; Wortelkamp, Steffi; Sickmann, Albert; Vandekerckhove, Joël; Gevaert, Kris; Martens, Lennart
2010-03-01
MS-based proteomics produces large amounts of mass spectra that require processing, identification and possibly quantification before interpretation can be undertaken. High-throughput studies require automation of these various steps, and management of the data in association with the results obtained. We here present ms_lims (http://genesis.UGent.be/ms_lims), a freely available, open-source system based on a central database to automate data management and processing in MS-driven proteomics analyses.
Wörheide, Gert; Jackson, Daniel John
2015-01-01
The ability to construct a mineralized skeleton was a major innovation for the Metazoa during their evolution in the late Precambrian/early Cambrian. Porifera (sponges) hold an informative position for efforts aimed at unraveling the origins of this ability because they are widely regarded to be the earliest branching metazoans, and are among the first multi-cellular animals to display the ability to biomineralize in the fossil record. Very few biomineralization associated proteins have been identified in sponges so far, with no transcriptome or proteome scale surveys yet available. In order to understand what genetic repertoire may have been present in the last common ancestor of the Metazoa (LCAM), and that may have contributed to the evolution of the ability to biocalcify, we have studied the skeletal proteome of the coralline demosponge Vaceletia sp. and compare this to other metazoan biomineralizing proteomes. We bring some spatial resolution to this analysis by dividing Vaceletia’s aragonitic calcium carbonate skeleton into “head” and “stalk” regions. With our approach we were able to identify 40 proteins from both the head and stalk regions, with many of these sharing some similarity to previously identified gene products from other organisms. Among these proteins are known biomineralization compounds, such as carbonic anhydrase, spherulin, extracellular matrix proteins and very acidic proteins. This report provides the first proteome scale analysis of a calcified poriferan skeletal proteome, and its composition clearly demonstrates that the LCAM contributed several key enzymes and matrix proteins to its descendants that supported the metazoan ability to biocalcify. However, lineage specific evolution is also likely to have contributed significantly to the ability of disparate metazoan lineages to biocalcify. PMID:26536128
Proteomics of the Human Placenta: Promises and Realities
Robinson, J.M.; Ackerman, W.E.; Kniss, D.A.; Takizawa, T.; Vandré, D.D.
2015-01-01
Proteomics is an area of study that sets as its ultimate goal the global analysis of all of the proteins expressed in a biological system of interest. However, technical limitations currently hamper proteome-wide analyses of complex systems. In a more practical sense, a desired outcome of proteomics research is the translation of large protein data sets into formats that provide meaningful information regarding clinical conditions (e.g., biomarkers to serve as diagnostic and/or prognostic indicators of disease). Herein, we discuss placental proteomics by describing existing studies and pointing out their strengths and weaknesses. In so doing, we strive to inform investigators interested in this area of research about the current gap between hyperbolic promises and realities. Additionally, we discuss the utility of proteomics in discovery-based research, particularly as regards the capacity to unearth novel insights into placental biology. Importantly, when considering understudied systems such as the human placenta and diseases associated with abnormalities in placental function, proteomics can serve as a robust ‘shortcut’ to obtaining information unlikely to be garnered using traditional approaches. PMID:18222537
Abascal, Federico; Ezkurdia, Iakes; Rodriguez-Rivas, Juan; Rodriguez, Jose Manuel; del Pozo, Angela; Vázquez, Jesús; Valencia, Alfonso; Tress, Michael L.
2015-01-01
Alternative splicing of messenger RNA can generate a wide variety of mature RNA transcripts, and these transcripts may produce protein isoforms with diverse cellular functions. While there is much supporting evidence for the expression of alternative transcripts, the same is not true for the alternatively spliced protein products. Large-scale mass spectroscopy experiments have identified evidence of alternative splicing at the protein level, but with conflicting results. Here we carried out a rigorous analysis of the peptide evidence from eight large-scale proteomics experiments to assess the scale of alternative splicing that is detectable by high-resolution mass spectroscopy. We find fewer splice events than would be expected: we identified peptides for almost 64% of human protein coding genes, but detected just 282 splice events. This data suggests that most genes have a single dominant isoform at the protein level. Many of the alternative isoforms that we could identify were only subtly different from the main splice isoform. Very few of the splice events identified at the protein level disrupted functional domains, in stark contrast to the two thirds of splice events annotated in the human genome that would lead to the loss or damage of functional domains. The most striking result was that more than 20% of the splice isoforms we identified were generated by substituting one homologous exon for another. This is significantly more than would be expected from the frequency of these events in the genome. These homologous exon substitution events were remarkably conserved—all the homologous exons we identified evolved over 460 million years ago—and eight of the fourteen tissue-specific splice isoforms we identified were generated from homologous exons. The combination of proteomics evidence, ancient origin and tissue-specific splicing indicates that isoforms generated from homologous exons may have important cellular roles. PMID:26061177
Mishra, Bud; Daruwala, Raoul-Sam; Zhou, Yi; Ugel, Nadia; Policriti, Alberto; Antoniotti, Marco; Paxia, Salvatore; Rejali, Marc; Rudra, Archisman; Cherepinsky, Vera; Silver, Naomi; Casey, William; Piazza, Carla; Simeoni, Marta; Barbano, Paolo; Spivak, Marina; Feng, Jiawu; Gill, Ofer; Venkatesh, Mysore; Cheng, Fang; Sun, Bing; Ioniata, Iuliana; Anantharaman, Thomas; Hubbard, E Jane Albert; Pnueli, Amir; Harel, David; Chandru, Vijay; Hariharan, Ramesh; Wigler, Michael; Park, Frank; Lin, Shih-Chieh; Lazebnik, Yuri; Winkler, Franz; Cantor, Charles R; Carbone, Alessandra; Gromov, Mikhael
2003-01-01
We collaborate in a research program aimed at creating a rigorous framework, experimental infrastructure, and computational environment for understanding, experimenting with, manipulating, and modifying a diverse set of fundamental biological processes at multiple scales and spatio-temporal modes. The novelty of our research is based on an approach that (i) requires coevolution of experimental science and theoretical techniques and (ii) exploits a certain universality in biology guided by a parsimonious model of evolutionary mechanisms operating at the genomic level and manifesting at the proteomic, transcriptomic, phylogenic, and other higher levels. Our current program in "systems biology" endeavors to marry large-scale biological experiments with the tools to ponder and reason about large, complex, and subtle natural systems. To achieve this ambitious goal, ideas and concepts are combined from many different fields: biological experimentation, applied mathematical modeling, computational reasoning schemes, and large-scale numerical and symbolic simulations. From a biological viewpoint, the basic issues are many: (i) understanding common and shared structural motifs among biological processes; (ii) modeling biological noise due to interactions among a small number of key molecules or loss of synchrony; (iii) explaining the robustness of these systems in spite of such noise; and (iv) cataloging multistatic behavior and adaptation exhibited by many biological processes.
Large-scale gene function analysis with the PANTHER classification system.
Mi, Huaiyu; Muruganujan, Anushya; Casagrande, John T; Thomas, Paul D
2013-08-01
The PANTHER (protein annotation through evolutionary relationship) classification system (http://www.pantherdb.org/) is a comprehensive system that combines gene function, ontology, pathways and statistical analysis tools that enable biologists to analyze large-scale, genome-wide data from sequencing, proteomics or gene expression experiments. The system is built with 82 complete genomes organized into gene families and subfamilies, and their evolutionary relationships are captured in phylogenetic trees, multiple sequence alignments and statistical models (hidden Markov models or HMMs). Genes are classified according to their function in several different ways: families and subfamilies are annotated with ontology terms (Gene Ontology (GO) and PANTHER protein class), and sequences are assigned to PANTHER pathways. The PANTHER website includes a suite of tools that enable users to browse and query gene functions, and to analyze large-scale experimental data with a number of statistical tests. It is widely used by bench scientists, bioinformaticians, computer scientists and systems biologists. In the 2013 release of PANTHER (v.8.0), in addition to an update of the data content, we redesigned the website interface to improve both user experience and the system's analytical capability. This protocol provides a detailed description of how to analyze genome-wide experimental data with the PANTHER classification system.
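The abstract mentions analyzing large-scale data "with a number of statistical tests" but does not name them. As an illustration of the general idea behind gene-list enrichment analysis, the sketch below computes a one-sided hypergeometric overrepresentation p-value using only the Python standard library. It is an assumed, generic example and is not presented as PANTHER's own method or API; the function and parameter names are hypothetical.

```python
from math import comb


def overrepresentation_p(population, annotated, sample, hits):
    """P(X >= hits) for X ~ Hypergeometric(population, annotated, sample).

    population: number of genes in the reference list
    annotated:  reference genes carrying the category of interest
    sample:     number of genes in the uploaded list
    hits:       uploaded genes carrying the category
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability of observing `hits` or more annotated genes by chance.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total
```

A small p-value suggests that the category appears in the uploaded list more often than expected by chance; in practice such values are usually corrected for the number of categories tested.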
A proteomic approach to obesity and type 2 diabetes
López-Villar, Elena; Martos-Moreno, Gabriel Á; Chowen, Julie A; Okada, Shigeru; Kopchick, John J; Argente, Jesús
2015-01-01
The incidence of obesity and type 2 diabetes has increased dramatically, resulting in an increased interest in their biomedical relevance. However, the mechanisms that trigger the development of type 2 diabetes in obese patients remain largely unknown. Scientific, clinical and pharmaceutical communities are dedicating vast resources to unravel this issue by applying different omics tools. During the last decade, advances in proteomic approaches and the Human Proteome Organization have opened a new door that may help to identify patients at risk and to improve current therapies. Here, we briefly review some of the advances in our understanding of type 2 diabetes that have occurred through the application of proteomics. We also review, in detail, the current improvements in proteomic methodologies and new strategies that could be employed to further advance our understanding of this pathology. By applying these new proteomic advances, novel therapeutic and/or diagnostic protein targets will be discovered in the obesity/type 2 diabetes area. PMID:25960181
Bianco, Linda; Perrotta, Gaetano
2015-01-01
Filamentous fungi possess the extraordinary ability to digest complex biomasses and mineralize numerous xenobiotics, as a consequence of their aptitude for sensing the environment and regulating their intra- and extracellular proteins, producing drastic changes in proteome and secretome composition. Recent advances in proteomic technologies offer an exciting opportunity to reveal the fluctuations of fungal proteins and enzymes responsible for their metabolic adaptation to a large variety of environmental conditions. Here, an overview of the most commonly used proteomic strategies will be provided; this paper will range from sample preparation to gel-free and gel-based proteomics, discussing pros and cons of each mentioned state-of-the-art technique. The main focus will be kept on filamentous fungi. Due to the biotechnological relevance of lignocellulose-degrading fungi, special attention will finally be given to their extracellular proteome, or secretome. Secreted proteins and enzymes will be discussed in relation to their involvement in bio-based processes, such as biomass deconstruction and mycoremediation. PMID:25775160
Bianco, Linda; Perrotta, Gaetano
2015-03-12
Filamentous fungi possess the extraordinary ability to digest complex biomasses and mineralize numerous xenobiotics, as a consequence of their aptitude for sensing the environment and regulating their intra- and extracellular proteins, producing drastic changes in proteome and secretome composition. Recent advances in proteomic technologies offer an exciting opportunity to reveal the fluctuations of fungal proteins and enzymes responsible for their metabolic adaptation to a large variety of environmental conditions. Here, an overview of the most commonly used proteomic strategies will be provided; this paper will range from sample preparation to gel-free and gel-based proteomics, discussing pros and cons of each mentioned state-of-the-art technique. The main focus will be kept on filamentous fungi. Due to the biotechnological relevance of lignocellulose-degrading fungi, special attention will finally be given to their extracellular proteome, or secretome. Secreted proteins and enzymes will be discussed in relation to their involvement in bio-based processes, such as biomass deconstruction and mycoremediation.
Shaping biological knowledge: applications in proteomics.
Lisacek, F; Chichester, C; Gonnet, P; Jaillet, O; Kappus, S; Nikitin, F; Roland, P; Rossier, G; Truong, L; Appel, R
2004-01-01
The central dogma of molecular biology has provided a meaningful principle for data integration in the field of genomics. In this context, integration reflects the known transitions from a chromosome to a protein sequence: transcription, intron splicing, exon assembly and translation. There is no such clear principle for integrating proteomics data, since the laws governing protein folding and interactivity are not quite understood. In our effort to bring together independent pieces of information relative to proteins in a biologically meaningful way, we assess the bias of bioinformatics resources and consequent approximations in the framework of small-scale studies. We analyse proteomics data while following both a data-driven (focus on proteins smaller than 10 kDa) and a hypothesis-driven (focus on whole bacterial proteomes) approach. These applications are potentially the source of specialized complements to classical biological ontologies.
Tao, Dingyin; Zhang, Lihua; Shan, Yichu; Liang, Zhen; Zhang, Yukui
2011-01-01
High-performance liquid chromatography-electrospray ionization tandem mass spectrometry (HPLC-ESI-MS-MS) is regarded as one of the most powerful techniques for separation and identification of proteins. Recently, much effort has been made to improve the separation capacity, detection sensitivity, and analysis throughput of micro- and nano-HPLC, by increasing column length, reducing column internal diameter, and using integrated techniques. Development of HPLC columns has also been rapid, as a result of the use of submicrometer packing materials and monolithic columns. All these innovations result in clearly improved performance of micro- and nano-HPLC for proteome research.
A New Scheme to Characterize and Identify Protein Ubiquitination Sites.
Nguyen, Van-Nui; Huang, Kai-Yao; Huang, Chien-Hsun; Lai, K Robert; Lee, Tzong-Yi
2017-01-01
Protein ubiquitination, involving the conjugation of ubiquitin on lysine residue, serves as an important modulator of many cellular functions in eukaryotes. Recent advancements in proteomic technology have stimulated increasing interest in identifying ubiquitination sites. However, most computational tools for predicting ubiquitination sites are focused on small-scale data. With an increasing number of experimentally verified ubiquitination sites, we were motivated to design a predictive model for identifying lysine ubiquitination sites for large-scale proteome dataset. This work assessed not only single features, such as amino acid composition (AAC), amino acid pair composition (AAPC) and evolutionary information, but also the effectiveness of incorporating two or more features into a hybrid approach to model construction. The support vector machine (SVM) was applied to generate the prediction models for ubiquitination site identification. Evaluation by five-fold cross-validation showed that the SVM models learned from the combination of hybrid features delivered a better prediction performance. Additionally, a motif discovery tool, MDDLogo, was adopted to characterize the potential substrate motifs of ubiquitination sites. The SVM models integrating the MDDLogo-identified substrate motifs could yield an average accuracy of 68.70 percent. Furthermore, the independent testing result showed that the MDDLogo-clustered SVM models could provide a promising accuracy (78.50 percent) and perform better than other prediction tools. Two cases have demonstrated the effective prediction of ubiquitination sites with corresponding substrate motifs.
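The amino acid composition (AAC) feature named in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the window flank of 10 residues, the `-` padding character, and the function names `lysine_windows` and `aac_features` are assumptions made only for illustration. The resulting 20-value vectors are the kind of input a support vector machine could be trained on.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def lysine_windows(sequence, flank=10):
    """Yield (1-based position, window) for every lysine in the sequence.

    The flank size is a hypothetical choice; windows that run past the
    sequence ends are padded with '-'.
    """
    padded = "-" * flank + sequence + "-" * flank
    for i, residue in enumerate(sequence):
        if residue == "K":
            yield i + 1, padded[i : i + 2 * flank + 1]


def aac_features(window):
    """Return the frequency of each of the 20 amino acids in the window.

    Padding characters are ignored when counting.
    """
    counts = Counter(ch for ch in window if ch in AMINO_ACIDS)
    total = sum(counts.values())
    return [counts[aa] / total if total else 0.0 for aa in AMINO_ACIDS]


if __name__ == "__main__":
    for position, window in lysine_windows("MKTAYK", flank=2):
        print(position, window, aac_features(window))
```

Each candidate lysine yields one fixed-length vector regardless of where it sits in the protein, which is what allows windows from many proteins to be pooled into one training set.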
Colangelo, Christopher M.; Ivosev, Gordana; Chung, Lisa; Abbott, Thomas; Shifman, Mark; Sakaue, Fumika; Cox, David; Kitchen, Rob R.; Burton, Lyle; Tate, Stephen A; Gulcicek, Erol; Bonner, Ron; Rinehart, Jesse; Nairn, Angus C.; Williams, Kenneth R.
2015-01-01
We present a comprehensive workflow for large scale (>1000 transitions/run) label-free LC-MRM proteome assays. Innovations include automated MRM transition selection, intelligent retention time scheduling (xMRM) that improves Signal/Noise by >2-fold, and automatic peak modeling. Improvements to data analysis include a novel Q/C metric, Normalized Group Area Ratio (NGAR), MLR normalization, weighted regression analysis, and data dissemination through the Yale Protein Expression Database. As a proof of principle we developed a robust 90 minute LC-MRM assay for Mouse/Rat Post-Synaptic Density (PSD) fractions which resulted in the routine quantification of 337 peptides from 112 proteins based on 15 observations per protein. Parallel analyses with stable isotope dilution peptide standards (SIS), demonstrate very high correlation in retention time (1.0) and protein fold change (0.94) between the label-free and SIS analyses. Overall, our first method achieved a technical CV of 11.4% with >97.5% of the 1697 transitions being quantified without user intervention, resulting in a highly efficient, robust, and single injection LC-MRM assay. PMID:25476245
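The technical CV reported in the abstract above is a coefficient of variation. As a hedged illustration of the arithmetic only, the sketch below computes a percent CV from replicate peak areas; the function name `technical_cv` and the example areas are hypothetical, and the abstract does not state how the authors aggregated CVs across transitions.

```python
from statistics import mean, stdev


def technical_cv(replicate_areas):
    """Percent coefficient of variation: sample standard deviation / mean * 100."""
    if len(replicate_areas) < 2:
        raise ValueError("need at least two replicate measurements")
    average = mean(replicate_areas)
    if average == 0:
        raise ValueError("mean peak area is zero, CV is undefined")
    return stdev(replicate_areas) / average * 100.0


if __name__ == "__main__":
    # Hypothetical peak areas for one transition across three injections.
    print(technical_cv([90.0, 100.0, 110.0]))
```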
Vascular biology: cellular and molecular profiling.
Baird, Alison E; Wright, Violet L
2006-02-01
Our understanding of the mechanisms underlying cerebrovascular atherosclerosis has improved in recent years, but significant gaps remain. New insights into the vascular biological processes that result in ischemic stroke may come from cellular and molecular profiling studies of the peripheral blood. In recent cellular profiling studies, increased levels of a proinflammatory T-cell subset (CD4(+)CD28(-)) have been associated with stroke recurrence and death. Expansion of this T-cell subset may occur after ischemic stroke and be a pathogenic mechanism leading to recurrent stroke and death. Increases in certain phenotypes of endothelial cell microparticles have been found in stroke patients relative to controls, possibly indicating a state of increased vascular risk. Molecular profiling approaches include gene expression profiling and proteomic methods that permit large-scale analyses of the transcriptome and the proteome, respectively. Ultimately panels of genes and proteins may be identified that are predictive of stroke risk. Cellular and molecular profiling studies of the peripheral blood and of atherosclerotic plaques may also pave the way for the development of therapeutic agents for primary and secondary stroke prevention.
Nucleic Acids for Ultra-Sensitive Protein Detection
Janssen, Kris P. F.; Knez, Karel; Spasic, Dragana; Lammertyn, Jeroen
2013-01-01
Major advancements in molecular biology and clinical diagnostics cannot be brought about strictly through the use of genomics-based methods. Improved methods for protein detection and proteomic screening are an absolute necessity to complement the wealth of information offered by novel, high-throughput sequencing technologies. Only then will it be possible to advance insights into clinical processes and to characterize the importance of specific protein biomarkers for disease detection or the realization of “personalized medicine”. Currently, however, large-scale proteomic information is still not as easily obtained as its genomic counterpart, mainly because traditional antibody-based technologies struggle to meet the stringent sensitivity and throughput requirements, whereas mass spectrometry-based methods may be burdened by the significant costs involved. However, recent years have seen the development of new biodetection strategies linking nucleic acids with existing antibody technology or replacing antibodies with oligonucleotide recognition elements altogether. These advancements have unlocked many new strategies to lower detection limits and dramatically increase throughput of protein detection assays. In this review, an overview of these new strategies will be given. PMID:23337338
Advances in crop proteomics: PTMs of proteins under abiotic stress.
Wu, Xiaolin; Gong, Fangping; Cao, Di; Hu, Xiuli; Wang, Wei
2016-03-01
Under natural conditions, crop plants are frequently subjected to various abiotic environmental stresses such as drought and heat waves, which may become more prevalent in the coming decades. Plant acclimation and tolerance to an abiotic stress are always associated with significant changes in PTMs of specific proteins. PTMs are important for regulating protein function, subcellular localization and protein activity and stability. Studies of plant responses to abiotic stress at the PTM level are essential to the process of plant phenotyping for crop improvement. The ability to identify and quantify PTMs on a large scale will contribute to a detailed protein functional characterization that will improve our understanding of the processes of crop plant stress acclimation and stress tolerance acquisition. Hundreds of PTMs have been reported, but it is impossible to review all of the possible protein modifications. In this review, we briefly summarize several main types of PTMs regarding their characteristics and detection methods, review the advances in PTM research in crop proteomics, and highlight the importance of specific PTMs in crop response to abiotic stress. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Fröhlich, Thomas; Kemter, Elisabeth; Flenkenthaler, Florian; Klymiuk, Nikolai; Otte, Kathrin A; Blutke, Andreas; Krause, Sabine; Walter, Maggie C; Wanke, Rüdiger; Wolf, Eckhard; Arnold, Georg J
2016-09-16
Duchenne muscular dystrophy (DMD) is caused by genetic deficiency of dystrophin and characterized by massive structural and functional changes of skeletal muscle tissue, leading to terminal muscle failure. We recently generated a novel genetically engineered pig model reflecting pathological hallmarks of human DMD better than the widely used mdx mouse. To get insight into the hierarchy of molecular derangements during DMD progression, we performed a proteome analysis of biceps femoris muscle samples from 2-day-old and 3-month-old DMD and wild-type (WT) pigs. The extent of proteome changes in DMD vs. WT muscle increased markedly with age, reflecting progression of the pathological changes. In 3-month-old DMD muscle, proteins related to muscle repair such as vimentin, nestin, desmin and tenascin C were found to be increased, whereas a large number of respiratory chain proteins were decreased in abundance in DMD muscle, indicating serious disturbances in aerobic energy production and a reduction of functional muscle tissue. The combination of proteome data for fiber type specific myosin heavy chain proteins and immunohistochemistry showed preferential degeneration of fast-twitch fiber types in DMD muscle. The stage-specific proteome changes detected in this large animal model of clinically severe muscular dystrophy provide novel molecular readouts for future treatment trials.
Precision medicine for psychopharmacology: a general introduction.
Shin, Cheolmin; Han, Changsu; Pae, Chi-Un; Patkar, Ashwin A
2016-07-01
Precision medicine is an emerging medical model that can provide accurate diagnoses and tailored therapeutic strategies for patients based on data pertaining to genes, microbiomes, environment, family history and lifestyle. Here, we provide basic information about precision medicine and newly introduced concepts, such as the precision medicine ecosystem and big data processing, and omics technologies including pharmacogenomics, pharmacometabolomics, pharmacoproteomics, pharmacoepigenomics, connectomics and exposomics. The authors review the current state of omics in psychiatry and the future direction of psychopharmacology as it moves towards precision medicine. Expert commentary: Advances in precision medicine have been facilitated by achievements in multiple fields, including large-scale biological databases, powerful methods for characterizing patients (such as genomics, proteomics, metabolomics, diverse cellular assays, and even social networks and mobile health technologies), and computer-based tools for analyzing large amounts of data.
Nyman, Tuula A; Lorey, Martina B; Cypryk, Wojciech; Matikainen, Sampsa
2017-05-01
The immune system is our defense system against microbial infections and tissue injury, and understanding how it works in detail is essential for developing drugs for different diseases. Mass spectrometry-based proteomics can provide in-depth information on the molecular mechanisms involved in immune responses. Areas covered: Summarized are the key immunology findings obtained with MS-based proteomics in the past five years, with a focus on inflammasome activation, global protein secretion, mucosal immunology, immunopeptidome and T cells. Special focus is on extracellular vesicle-mediated protein secretion and its role in immune responses. Expert commentary: Proteomics is an essential part of modern omics-scale immunology research. To date, MS-based proteomics has been used in immunology to study protein expression levels, their subcellular localization, secretion, post-translational modifications, and interactions in immune cells upon activation by different stimuli. These studies have made major contributions to understanding the molecular mechanisms involved in innate and adaptive immune responses. New developments in proteomics constantly offer novel possibilities for exploring the immune system. Examples of these techniques include mass cytometry and different MS-based imaging approaches which can be widely used in immunology.
Ku, Taeyun; Swaney, Justin; Park, Jeong-Yoon; Albanese, Alexandre; Murray, Evan; Cho, Jae Hun; Park, Young-Gyun; Mangena, Vamsi; Chen, Jiapei; Chung, Kwanghun
2016-09-01
The biology of multicellular organisms is coordinated across multiple size scales, from the subnanoscale of molecules to the macroscale, tissue-wide interconnectivity of cell populations. Here we introduce a method for super-resolution imaging of the multiscale organization of intact tissues. The method, called magnified analysis of the proteome (MAP), linearly expands entire organs fourfold while preserving their overall architecture and three-dimensional proteome organization. MAP is based on the observation that preventing crosslinking within and between endogenous proteins during hydrogel-tissue hybridization allows for natural expansion upon protein denaturation and dissociation. The expanded tissue preserves its protein content, its fine subcellular details, and its organ-scale intercellular connectivity. We use off-the-shelf antibodies for multiple rounds of immunolabeling and imaging of a tissue's magnified proteome, and our experiments demonstrate a success rate of 82% (100/122 antibodies tested). We show that specimen size can be reversibly modulated to image both inter-regional connections and fine synaptic architectures in the mouse brain.
Unexpected features of the dark proteome.
Perdigão, Nelson; Heinrich, Julian; Stolte, Christian; Sabir, Kenneth S; Buckley, Michael J; Tabor, Bruce; Signal, Beth; Gloss, Brian S; Hammang, Christopher J; Rost, Burkhard; Schafferhans, Andrea; O'Donoghue, Seán I
2015-12-29
We surveyed the "dark" proteome, that is, regions of proteins never observed by experimental structure determination and inaccessible to homology modeling. For 546,000 Swiss-Prot proteins, we found that 44-54% of the proteome in eukaryotes and viruses was dark, compared with only ∼14% in archaea and bacteria. Surprisingly, most of the dark proteome could not be accounted for by conventional explanations, such as intrinsic disorder or transmembrane regions. Nearly half of the dark proteome comprised dark proteins, in which the entire sequence lacked similarity to any known structure. Dark proteins fulfill a wide variety of functions, but a subset showed distinct and largely unexpected features, such as association with secretion, specific tissues, the endoplasmic reticulum, disulfide bonding, and proteolytic cleavage. Dark proteins also had short sequence length, low evolutionary reuse, and few known interactions with other proteins. These results suggest new research directions in structural and computational biology.
EvoluCode: Evolutionary Barcodes as a Unifying Framework for Multilevel Evolutionary Data.
Linard, Benjamin; Nguyen, Ngoc Hoan; Prosdocimi, Francisco; Poch, Olivier; Thompson, Julie D
2012-01-01
Evolutionary systems biology aims to uncover the general trends and principles governing the evolution of biological networks. An essential part of this process is the reconstruction and analysis of the evolutionary histories of these complex, dynamic networks. Unfortunately, the methodologies for representing and exploiting such complex evolutionary histories in large scale studies are currently limited. Here, we propose a new formalism, called EvoluCode (Evolutionary barCode), which allows the integration of different evolutionary parameters (eg, sequence conservation, orthology, synteny …) in a unifying format and facilitates the multilevel analysis and visualization of complex evolutionary histories at the genome scale. The advantages of the approach are demonstrated by constructing barcodes representing the evolution of the complete human proteome. Two large-scale studies are then described: (i) the mapping and visualization of the barcodes on the human chromosomes and (ii) automatic clustering of the barcodes to highlight protein subsets sharing similar evolutionary histories and their functional analysis. The methodologies developed here open the way to the efficient application of other data mining and knowledge extraction techniques in evolutionary systems biology studies. A database containing all EvoluCode data is available at: http://lbgi.igbmc.fr/barcodes.
USDA-ARS's Scientific Manuscript database
Despite the current use of chemical fungicides, Penicillium expansum is still one of the most devastating pathogens of pome fruit. In particular, P. expansum enters tissues through wounds, causing large economic losses worldwide. To obtain new rational and environmentally friendly control alternative...
Mass spectrometry for biomarker development
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu, Chaochao; Liu, Tao; Baker, Erin Shammel
2015-06-19
Biomarkers potentially play a crucial role in early disease diagnosis, prognosis and targeted therapy. In the past decade, mass spectrometry based proteomics has become increasingly important in biomarker development due to large advances in technology and associated methods. This chapter mainly focuses on the application of broad (e.g. shotgun) proteomics in biomarker discovery and the utility of targeted proteomics in biomarker verification and validation. A range of mass spectrometry methodologies are discussed emphasizing their efficacy in the different stages in biomarker development, with a particular emphasis on blood biomarker development.
Hegedűs, Tamás; Chaubey, Pururawa Mayank; Várady, György; Szabó, Edit; Sarankó, Hajnalka; Hofstetter, Lia; Roschitzki, Bernd; Sarkadi, Balázs
2015-01-01
Based on recent results, the determination of the easily accessible red blood cell (RBC) membrane proteins may provide new diagnostic possibilities for assessing mutations, polymorphisms or regulatory alterations in diseases. However, the analysis of the current mass spectrometry-based proteomics datasets and other major databases indicates inconsistencies: the results show large scattering and only a limited overlap for the identified RBC membrane proteins. Here, we applied membrane-specific proteomics studies in human RBC, compared these results with the data in the literature, and generated a comprehensive and expandable database using all available data sources. The integrated web database now refers to proteomic, genetic and medical databases as well, and contains an unexpectedly large number of validated membrane proteins previously thought to be specific for other tissues and/or related to major human diseases. Since the determination of protein expression in RBC provides a method to indicate pathological alterations, our database should facilitate the development of RBC membrane biomarker platforms and provide a unique resource to aid related further research and diagnostics. Database URL: http://rbcc.hegelab.org PMID:26078478
The amino acid's backup bone - storage solutions for proteomics facilities.
Meckel, Hagen; Stephan, Christian; Bunse, Christian; Krafzik, Michael; Reher, Christopher; Kohl, Michael; Meyer, Helmut Erich; Eisenacher, Martin
2014-01-01
Proteomics methods, especially high-throughput mass spectrometry analysis have been continually developed and improved over the years. The analysis of complex biological samples produces large volumes of raw data. Data storage and recovery management pose substantial challenges to biomedical or proteomic facilities regarding backup and archiving concepts as well as hardware requirements. In this article we describe differences between the terms backup and archive with regard to manual and automatic approaches. We also introduce different storage concepts and technologies from transportable media to professional solutions such as redundant array of independent disks (RAID) systems, network attached storages (NAS) and storage area network (SAN). Moreover, we present a software solution, which we developed for the purpose of long-term preservation of large mass spectrometry raw data files on an object storage device (OSD) archiving system. Finally, advantages, disadvantages, and experiences from routine operations of the presented concepts and technologies are evaluated and discussed. This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan. Copyright © 2013. Published by Elsevier B.V.
Proteomic analysis of ligamentum flavum from patients with lumbar spinal stenosis.
Kamita, Masahiro; Mori, Taiki; Sakai, Yoshihito; Ito, Sadayuki; Gomi, Masahiro; Miyamoto, Yuko; Harada, Atsushi; Niida, Shumpei; Yamada, Tesshi; Watanabe, Ken; Ono, Masaya
2015-05-01
Lumbar spinal stenosis (LSS) is a syndromic degenerative spinal disease and is characterized by spinal canal narrowing with subsequent neural compression causing gait disturbances. Although LSS is a major age-related musculoskeletal disease that causes large decreases in the daily living activities of the elderly, its molecular pathology has not been investigated using proteomics. Thus, we used several proteomic technologies to analyze the ligamentum flavum (LF) of individuals with LSS. Using comprehensive proteomics with strong cation exchange fractionation, we detected 1288 proteins in these LF samples. A GO analysis of the comprehensive proteome revealed that more than 30% of the identified proteins were extracellular. Next, we used 2D image converted analysis of LC/MS to compare LF obtained from individuals with LSS to that obtained from individuals with disc herniation (nondegenerative control). We detected 64 781 MS peaks and identified 1675 differentially expressed peptides derived from 286 proteins. We verified four differentially expressed proteins (fibronectin, serine protease HTRA1, tenascin, and asporin) by quantitative proteomics using SRM/MRM. The present proteomic study is the first to identify proteins from degenerated and hypertrophied LF in LSS, which will help in studying LSS. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Liu, Kehui; Zhang, Jiyang; Fu, Bin; Xie, Hongwei; Wang, Yingchun; Qian, Xiaohong
2014-07-01
Precise protein quantification is essential in comparative proteomics. Currently, quantification bias is inevitable when using a proteotypic peptide-based quantitative proteomics strategy because of differences in peptide measurability. To improve quantification accuracy, we proposed an "empirical rule for linearly correlated peptide selection (ERLPS)" in quantitative proteomics in our previous work. However, a systematic evaluation of the general application of ERLPS in quantitative proteomics under diverse experimental conditions needs to be conducted. In this study, the practical workflow of ERLPS was explicitly illustrated; different experimental variables, such as MS systems, sample complexities, sample preparations, elution gradients, matrix effects, loading amounts, and other factors, were comprehensively investigated to evaluate the applicability, reproducibility, and transferability of ERLPS. The results demonstrated that ERLPS was highly reproducible and transferable within appropriate loading amounts, and that linearly correlated response peptides should be selected for each specific experiment. ERLPS was applied to proteome samples from yeast to mouse and human, and to quantitative methods from label-free to O18/O16-labeled and SILAC analysis, and enabled accurate measurements for all proteotypic peptide-based quantitative proteomics over a large dynamic range. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
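The idea of keeping only peptides whose response is linearly correlated with the loaded amount can be sketched as below. This is an assumption-laden illustration, not the published ERLPS rule: the abstract does not give the selection criterion, so the Pearson correlation test, the threshold of 0.99, and the names `pearson_r` and `select_linear_peptides` are all hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def select_linear_peptides(loading_amounts, intensities_by_peptide, min_r=0.99):
    """Keep peptides whose intensity tracks the loading amount linearly.

    min_r is a hypothetical cutoff chosen for illustration only.
    """
    return [
        peptide
        for peptide, intensities in intensities_by_peptide.items()
        if pearson_r(loading_amounts, intensities) >= min_r
    ]


if __name__ == "__main__":
    amounts = [1.0, 2.0, 4.0, 8.0]  # hypothetical loading amounts
    responses = {
        "pepA": [1.0, 2.0, 4.0, 8.0],  # scales with load
        "pepB": [5.0, 5.1, 4.9, 5.0],  # flat, uninformative response
    }
    print(select_linear_peptides(amounts, responses))
```

A dilution series like this would be run per experiment, consistent with the abstract's point that linearly responding peptides should be selected for each specific experiment.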
Beyond the proteome: Mass Spectrometry Special Interest Group (MS-SIG) at ISMB/ECCB 2013
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ryu, Soyoung; Payne, Samuel H.; Schaab, Christoph
2014-07-02
Mass spectrometry special interest group (MS-SIG) aims to bring together experts from the global research community to discuss highlights and challenges in the field of mass spectrometry (MS)-based proteomics and computational biology. The rapid technological developments in MS-based proteomics have enabled the generation of a large amount of meaningful information on hundreds to thousands of proteins simultaneously from a biological sample; however, the complexity of the MS data requires sophisticated computational algorithms and software for data analysis and interpretation. This year’s MS-SIG meeting theme was ‘Beyond the Proteome’ with major focuses on improving protein identification/quantification and using proteomics data to solve interesting problems in systems biology and clinical research.
Computational Prediction of Protein-Protein Interactions
Ehrenberger, Tobias; Cantley, Lewis C.; Yaffe, Michael B.
2015-01-01
The prediction of protein-protein interactions and kinase-specific phosphorylation sites on individual proteins is critical for correctly placing proteins within signaling pathways and networks. The importance of this type of annotation continues to increase with the continued explosion of genomic and proteomic data, particularly with emerging data categorizing posttranslational modifications on a large scale. A variety of computational tools are available for this purpose. In this chapter, we review the general methodologies for these types of computational predictions and present a detailed user-focused tutorial of one such method and computational tool, Scansite, which is freely available to the entire scientific community over the Internet. PMID:25859943
What does physics have to do with cancer?
Michor, Franziska; Liphardt, Jan; Ferrari, Mauro; Widom, Jonathan
2013-01-01
Large-scale cancer genomics, proteomics and RNA-sequencing efforts are currently mapping in fine detail the genetic and biochemical alterations that occur in cancer. However, it is becoming clear that it is difficult to integrate and interpret these data and to translate them into treatments. This difficulty is compounded by the recognition that cancer cells evolve, and that initiation, progression and metastasis are influenced by a wide variety of factors. To help tackle this challenge, the US National Cancer Institute Physical Sciences-Oncology Centers initiative is bringing together physicists, cancer biologists, chemists, mathematicians and engineers. How are we beginning to address cancer from the perspective of the physical sciences? PMID:21850037
Mildew-Omics: How Global Analyses Aid the Understanding of Life and Evolution of Powdery Mildews.
Bindschedler, Laurence V; Panstruga, Ralph; Spanu, Pietro D
2016-01-01
The common powdery mildew plant diseases are caused by ascomycete fungi of the order Erysiphales. Their characteristic life style as obligate biotrophs renders functional analyses in these species challenging, mainly because of experimental constraints to genetic manipulation. Global large-scale ("-omics") approaches are thus particularly valuable and insightful for the characterisation of the life and evolution of powdery mildews. Here we review the knowledge obtained so far from genomic, transcriptomic and proteomic studies in these fungi. We consider current limitations and challenges regarding these surveys and provide an outlook on desired future investigations on the basis of the various -omics technologies.
Proteome-level interplay between folding and aggregation propensities of proteins.
Tartaglia, Gian Gaetano; Vendruscolo, Michele
2010-10-08
With the advent of proteomics, there is an increasing need for tools for predicting the properties of large numbers of proteins by using the information provided by their amino acid sequences, even in the absence of knowledge of their structures. One of the most important types of predictions concerns whether proteins will fold or aggregate. Here, we study the competition between these two processes by analyzing the relationship between the folding and aggregation propensity profiles for the human and Escherichia coli proteomes. These profiles are calculated, respectively, using the CamFold method, which we introduce in this work, and the Zyggregator method. Our results indicate that the kinetic behavior of proteins is, to a large extent, determined by the interplay between regions of low folding and high aggregation propensities. Copyright © 2010. Published by Elsevier Ltd.
Current algorithmic solutions for peptide-based proteomics data generation and identification.
Hoopmann, Michael R; Moritz, Robert L
2013-02-01
Peptide-based proteomic data sets are ever increasing in size and complexity. These data sets provide computational challenges when attempting to quickly analyze spectra and obtain correct protein identifications. Database search and de novo algorithms must consider high-resolution MS/MS spectra and alternative fragmentation methods. Protein inference is a tricky problem when analyzing large data sets of degenerate peptide identifications. Combining multiple algorithms for improved peptide identification puts significant strain on computational systems when investigating large data sets. This review highlights some of the recent developments in peptide and protein identification algorithms for analyzing shotgun mass spectrometry data when encountering the aforementioned hurdles. Also explored are the roles that analytical pipelines, public spectral libraries, and cloud computing play in the evolution of peptide-based proteomics. Copyright © 2012 Elsevier Ltd. All rights reserved.
The Proteome Folding Project: Proteome-scale prediction of structure and function
Drew, Kevin; Winters, Patrick; Butterfoss, Glenn L.; Berstis, Viktors; Uplinger, Keith; Armstrong, Jonathan; Riffle, Michael; Schweighofer, Erik; Bovermann, Bill; Goodlett, David R.; Davis, Trisha N.; Shasha, Dennis; Malmström, Lars; Bonneau, Richard
2011-01-01
The incompleteness of proteome structure and function annotation is a critical problem for biologists and, in particular, severely limits interpretation of high-throughput and next-generation experiments. We have developed a proteome annotation pipeline based on structure prediction, where function and structure annotations are generated using an integration of sequence comparison, fold recognition, and grid-computing-enabled de novo structure prediction. We predict protein domain boundaries and three-dimensional (3D) structures for protein domains from 94 genomes (including human, Arabidopsis, rice, mouse, fly, yeast, Escherichia coli, and worm). De novo structure predictions were distributed on a grid of more than 1.5 million CPUs worldwide (World Community Grid). We generated significant numbers of new confident fold annotations (9% of domains that are otherwise unannotated in these genomes). We demonstrate that predicted structures can be combined with annotations from the Gene Ontology database to predict new and more specific molecular functions. PMID:21824995
Redox Proteomics of Protein-bound Methionine Oxidation
Ghesquière, Bart; Jonckheere, Veronique; Colaert, Niklaas; Van Durme, Joost; Timmerman, Evy; Goethals, Marc; Schymkowitz, Joost; Rousseau, Frederic; Vandekerckhove, Joël; Gevaert, Kris
2011-01-01
We here present a new method to measure the degree of protein-bound methionine sulfoxide formation at a proteome-wide scale. In human Jurkat cells that were stressed with hydrogen peroxide, over 2000 oxidation-sensitive methionines in more than 1600 different proteins were mapped and their extent of oxidation was quantified. Meta-analysis of the sequences surrounding the oxidized methionine residues revealed a high preference for neighboring polar residues. Using synthetic methionine sulfoxide containing peptides designed according to the observed sequence preferences in the oxidized Jurkat proteome, we discovered that the substrate specificity of the cellular methionine sulfoxide reductases is a major determinant for the steady-state of methionine oxidation. This was supported by a structural modeling of the MsrA catalytic center. Finally, we applied our method onto a serum proteome from a mouse sepsis model and identified 35 in vivo methionine oxidation events in 27 different proteins. PMID:21406390
Ibáñez-Vea, María; Huang, Honggang; Martínez de Morentin, Xabier; Pérez, Estela; Gato, Maria; Zuazo, Miren; Arasanz, Hugo; Fernández-Irigoyen, Joaquin; Santamaría, Enrique; Fernandez-Hinojal, Gonzalo; Larsen, Martin R; Escors, David; Kochan, Grazyna
2018-03-02
Protein S-nitrosylation is a cysteine post-translational modification mediated by nitric oxide. An increasing number of studies highlight S-nitrosylation as an important regulator of signaling involved in numerous cellular processes. Despite significant progress in the development of redox proteomic methods, identification and quantification of endogenous S-nitrosylation using high-throughput mass-spectrometry-based methods is a technical challenge because this modification is highly labile. To overcome this drawback, most methods induce S-nitrosylation chemically in proteins using nitrosylating compounds before analysis, with the risk of introducing nonphysiological S-nitrosylation. Here we present a novel method to efficiently identify endogenous S-nitrosopeptides in the macrophage total proteome. Our approach is based on the labeling of S-nitrosopeptides reduced by ascorbate with a cysteine specific phosphonate adaptable tag (CysPAT), followed by titanium dioxide (TiO2) chromatography enrichment prior to nLC-MS/MS analysis. To test our procedure, we performed a large-scale analysis of this low-abundant modification in a murine macrophage cell line. We identified 569 endogenous S-nitrosylated proteins compared with 795 following exogenous chemically induced S-nitrosylation. Importantly, we discovered 579 novel S-nitrosylation sites. The large number of identified endogenous S-nitrosylated peptides allowed the definition of two S-nitrosylation consensus sites, highlighting protein translation and redox processes as key S-nitrosylation targets in macrophages.
Protein biomarker validation via proximity ligation assays.
Blokzijl, A; Nong, R; Darmanis, S; Hertz, E; Landegren, U; Kamali-Moghaddam, M
2014-05-01
The ability to detect minute amounts of specific proteins or protein modifications in blood as biomarkers for a plethora of human pathological conditions holds great promise for future medicine. Despite a large number of plausible candidate protein biomarkers published annually, the translation to clinical use is impeded by factors such as the required size of the initial studies, and limitations of the technologies used. The proximity ligation assay (PLA) is a versatile molecular tool that has the potential to address some obstacles, both in validation of biomarkers previously discovered using other techniques, and for future routine clinical diagnostic needs. The enhanced specificity of PLA extends the opportunities for large-scale, high-performance analyses of proteins. Besides advantages in the form of minimal sample consumption and an extended dynamic range, the PLA technique allows flexible assay reconfiguration. The technology can be adapted for detecting protein complexes, proximity between proteins in extracellular vesicles or in circulating tumor cells, and to address multiple post-translational modifications in the same protein molecule. We discuss herein requirements for biomarker validation, and how PLA may play an increasing role in this regard. We describe some recent developments of the technology, including proximity extension assays, the use of recombinant affinity reagents suitable for use in proximity assays, and the potential for single cell proteomics. This article is part of a Special Issue entitled: Biomarkers: A Proteomic Challenge. © 2013.
Cholewa, Brian D; Pellitteri-Hahn, Molly C; Scarlett, Cameron O; Ahmad, Nihal
2014-11-07
Polo-like kinase 1 (Plk1) is a serine/threonine kinase that plays a key role during the cell cycle by regulating mitotic entry, progression, and exit. Plk1 is overexpressed in a variety of human cancers and is essential to sustained oncogenic proliferation, thus making Plk1 an attractive therapeutic target. However, the clinical efficacy of Plk1 inhibition has not emulated the preclinical success, stressing an urgent need for a better understanding of Plk1 signaling. This study addresses that need by utilizing a quantitative proteomics strategy to compare the proteome of BRAF(V600E) mutant melanoma cells following treatment with the Plk1-specific inhibitor BI 6727. Employing label-free nano-LC-MS/MS technology on a Q-exactive followed by SIEVE processing, we identified more than 20 proteins of interest, many of which have not been previously associated with Plk1 signaling. Here we report the down-regulation of multiple metabolic proteins with an associated decrease in cellular metabolism, as assessed by lactate and NAD levels. Furthermore, we have also identified the down-regulation of multiple proteasomal subunits, resulting in a significant decrease in 20S proteasome activity. Additionally, we have identified a novel association between Plk1 and p53 through heterogeneous ribonucleoprotein C1/C2 (hnRNPC), thus providing valuable insight into Plk1's role in cancer cell survival.
Boja, Emily S; Fehniger, Thomas E; Baker, Mark S; Marko-Varga, György; Rodriguez, Henry
2014-12-05
Protein biomarker discovery and validation in the current omics era are vital for healthcare professionals to improve diagnosis, detect cancers at an early stage, identify the likelihood of cancer recurrence, stratify stages with differential survival outcomes, and monitor therapeutic responses. The success of such biomarkers would have a huge impact on how we improve the diagnosis and treatment of patients and alleviate the financial burden of healthcare systems. In the past, the genomics community (mostly through large-scale, deep genomic sequencing technologies) has been steadily improving our understanding of the molecular basis of disease, with a number of biomarker panels already authorized by the U.S. Food and Drug Administration (FDA) for clinical use (e.g., MammaPrint, two recently cleared devices using next-generation sequencing platforms to detect DNA changes in the cystic fibrosis transmembrane conductance regulator (CFTR) gene). Clinical proteomics, on the other hand, despite its ability to delineate the functional units of a cell that more likely drive the phenotypic differences of a disease (i.e., proteins and protein-protein interaction networks and signaling pathways underlying the disease), "staggers" to make a significant impact, with an average of only ~1.5 protein biomarkers per year approved by the FDA over the past 15-20 years. This statistic itself raises the concern that major roadblocks have been impeding an efficient transition of protein marker candidates in biomarker development despite major technological advances in proteomics in recent years.
Al Feteisi, Hajar; Achour, Brahim; Rostami-Hodjegan, Amin; Barber, Jill
2015-01-01
Drug-metabolizing enzymes and transporters play an important role in drug absorption, distribution, metabolism and excretion and, consequently, they influence drug efficacy and toxicity. Quantification of drug-metabolizing enzymes and transporters in various tissues is therefore essential for comprehensive elucidation of drug absorption, distribution, metabolism and excretion. Recent advances in liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) have improved the quantification of pharmacologically relevant proteins. This report presents an overview of mass spectrometry-based methods currently used for the quantification of drug-metabolizing enzymes and drug transporters, mainly focusing on applications and cost associated with various quantitative strategies based on stable isotope-labeled standards (absolute quantification peptide standards, quantification concatemers, protein standards for absolute quantification) and label-free analysis. In mass spectrometry, there is no simple relationship between signal intensity and analyte concentration. Proteomic strategies are therefore complex and several factors need to be considered when selecting the most appropriate method for an intended application, including the number of proteins and samples. Quantitative strategies require appropriate mass spectrometry platforms, yet choice is often limited by the availability of appropriate instrumentation. Quantitative proteomics research requires specialist practical skills and there is a pressing need to dedicate more effort and investment to training personnel in this area. Large-scale multicenter collaborations are also needed to standardize quantitative strategies in order to improve physiologically based pharmacokinetic models.
Majeran, Wojciech; Friso, Giulia; Ponnala, Lalit; Connolly, Brian; Huang, Mingshu; Reidel, Edwin; Zhang, Cankui; Asakura, Yukari; Bhuiyan, Nazmul H; Sun, Qi; Turgeon, Robert; van Wijk, Klaas J
2010-11-01
C(4) grasses, such as maize (Zea mays), have high photosynthetic efficiency through combined biochemical and structural adaptations. C(4) photosynthesis is established along the developmental axis of the leaf blade, leading from an undifferentiated leaf base just above the ligule into highly specialized mesophyll cells (MCs) and bundle sheath cells (BSCs) at the tip. To resolve the kinetics of maize leaf development and C(4) differentiation and to obtain a systems-level understanding of maize leaf formation, the accumulation profiles of proteomes of the leaf and the isolated BSCs with their vascular bundle along the developmental gradient were determined using large-scale mass spectrometry. This was complemented by extensive qualitative and quantitative microscopy analysis of structural features (e.g., Kranz anatomy, plasmodesmata, cell wall, and organelles). More than 4300 proteins were identified and functionally annotated. Developmental protein accumulation profiles and hierarchical cluster analysis then determined the kinetics of organelle biogenesis, formation of cellular structures, metabolism, and coexpression patterns. Two main expression clusters were observed, each divided in subclusters, suggesting that a limited number of developmental regulatory networks organize concerted protein accumulation along the leaf gradient. The coexpression with BSC and MC markers provided strong candidates for further analysis of C(4) specialization, in particular transporters and biogenesis factors. Based on the integrated information, we describe five developmental transitions that provide a conceptual and practical template for further analysis. An online protein expression viewer is provided through the Plant Proteome Database.
Xu, Ruilian; Tang, Jun; Deng, Quantong; He, Wan; Sun, Xiujie; Xia, Ligang; Cheng, Zhiqiang; He, Lisheng; You, Shuyuan; Hu, Jintao; Fu, Yuxiang; Zhu, Jian; Chen, Yixin; Gao, Weina; He, An; Guo, Zhengyu; Lin, Lin; Li, Hua; Hu, Chaofeng; Tian, Ruijun
2018-05-01
Increasing attention has been focused on cell type proteome profiling for understanding the heterogeneous multicellular microenvironment in tissue samples. However, current cell type proteome profiling methods need large amounts of starting materials which preclude their application to clinical tumor specimens with limited access. Here, by seamlessly combining laser capture microdissection and integrated proteomics sample preparation technology SISPROT, specific cell types in tumor samples could be precisely dissected with single cell resolution and processed for high-sensitivity proteome profiling. Sample loss and contamination due to the multiple transfer steps are significantly reduced by the full integration and noncontact design. H&E staining dyes which are necessary for cell type investigation could be selectively removed by the unique two-stage design of the spintip device. This easy-to-use proteome profiling technology achieved high sensitivity with the identification of more than 500 proteins from only 0.1 mm² and 10 μm thickness colon cancer tissue section. The first cell type proteome profiling of four cell types from one colon tumor and surrounding normal tissue, including cancer cells, enterocytes, lymphocytes, and smooth muscle cells, was obtained. 5271, 4691, 4876, and 2140 protein groups were identified, respectively, from tissue section of only 5 mm² and 10 μm thickness. Furthermore, spatially resolved proteome distribution profiles of enterocytes, lymphocytes, and smooth muscle cells on the same tissue slices and across four consecutive sections with micrometer distance were successfully achieved. This fully integrated proteomics technology, termed LCM-SISPROT, is therefore promising for spatial-resolution cell type proteome profiling of tumor microenvironment with a minute amount of clinical starting materials.
Rigbolt, Kristoffer T G; Vanselow, Jens T; Blagoev, Blagoy
2011-08-01
Recent technological advances have made it possible to identify and quantify thousands of proteins in a single proteomics experiment. As a result of these developments, the analysis of data has become the bottleneck of proteomics experiments. To provide the proteomics community with a user-friendly platform for comprehensive analysis, inspection and visualization of quantitative proteomics data we developed the Graphical Proteomics Data Explorer (GProX). The program requires no special bioinformatics training, as all functions of GProX are accessible within its graphical user-friendly interface which will be intuitive to most users. Basic features facilitate the uncomplicated management and organization of large data sets and complex experimental setups as well as the inspection and graphical plotting of quantitative data. These are complemented by readily available high-level analysis options such as database querying, clustering based on abundance ratios, feature enrichment tests for, e.g., GO terms, and pathway analysis tools. A number of plotting options for visualization of quantitative proteomics data are available and most analysis functions in GProX create customizable high quality graphical displays in both vector and bitmap formats. The generic import requirements allow data originating from essentially all mass spectrometry platforms, quantitation strategies and software to be analyzed in the program. GProX represents a powerful approach to proteomics data analysis providing proteomics experimenters with a toolbox for bioinformatics analysis of quantitative proteomics data. The program is released as open-source and can be freely downloaded from the project webpage at http://gprox.sourceforge.net.
Rigbolt, Kristoffer T. G.; Vanselow, Jens T.; Blagoev, Blagoy
2011-01-01
Recent technological advances have made it possible to identify and quantify thousands of proteins in a single proteomics experiment. As a result of these developments, the analysis of data has become the bottleneck of proteomics experiments. To provide the proteomics community with a user-friendly platform for comprehensive analysis, inspection and visualization of quantitative proteomics data we developed the Graphical Proteomics Data Explorer (GProX). The program requires no special bioinformatics training, as all functions of GProX are accessible within its graphical user-friendly interface which will be intuitive to most users. Basic features facilitate the uncomplicated management and organization of large data sets and complex experimental setups as well as the inspection and graphical plotting of quantitative data. These are complemented by readily available high-level analysis options such as database querying, clustering based on abundance ratios, feature enrichment tests for, e.g., GO terms, and pathway analysis tools. A number of plotting options for visualization of quantitative proteomics data are available and most analysis functions in GProX create customizable high quality graphical displays in both vector and bitmap formats. The generic import requirements allow data originating from essentially all mass spectrometry platforms, quantitation strategies and software to be analyzed in the program. GProX represents a powerful approach to proteomics data analysis providing proteomics experimenters with a toolbox for bioinformatics analysis of quantitative proteomics data. The program is released as open-source and can be freely downloaded from the project webpage at http://gprox.sourceforge.net. PMID:21602510
Wierer, Michael; Prestel, Matthias; Schiller, Herbert B; Yan, Guangyao; Schaab, Christoph; Azghandi, Sepiede; Werner, Julia; Kessler, Thorsten; Malik, Rainer; Murgia, Marta; Aherrahrou, Zouhair; Schunkert, Heribert; Dichgans, Martin; Mann, Matthias
2018-02-01
Atherosclerosis leads to vascular lesions that involve major rearrangements of the vascular proteome, especially of the extracellular matrix (ECM). Using single aortas from ApoE knock out mice, we quantified formation of plaques by single-run, high-resolution mass spectrometry (MS)-based proteomics. To probe localization on a proteome-wide scale we employed quantitative detergent solubility profiling. This compartment- and time-resolved resource of atherogenesis comprised 5117 proteins, 182 of which changed their expression status in response to vessel maturation and atherosclerotic plaque development. In the insoluble ECM proteome, 65 proteins significantly changed, including relevant collagens, matrix metalloproteinases and macrophage derived proteins. Among novel factors in atherosclerosis, we identified matrilin-2, the collagen IV crosslinking enzyme peroxidasin as well as the poorly characterized MAM-domain containing 2 (Mamdc2) protein as being up-regulated in the ECM during atherogenesis. Intriguingly, three subunits of the osteoclast specific V-ATPase complex were strongly increased in mature plaques with an enrichment in macrophages thus implying an active de-mineralization function. © 2018 by The American Society for Biochemistry and Molecular Biology, Inc.
Wierer, Michael; Prestel, Matthias; Schiller, Herbert B.; Yan, Guangyao; Schaab, Christoph; Azghandi, Sepiede; Werner, Julia; Kessler, Thorsten; Malik, Rainer; Murgia, Marta; Aherrahrou, Zouhair; Schunkert, Heribert; Dichgans, Martin; Mann, Matthias
2018-01-01
Atherosclerosis leads to vascular lesions that involve major rearrangements of the vascular proteome, especially of the extracellular matrix (ECM). Using single aortas from ApoE knock out mice, we quantified formation of plaques by single-run, high-resolution mass spectrometry (MS)-based proteomics. To probe localization on a proteome-wide scale we employed quantitative detergent solubility profiling. This compartment- and time-resolved resource of atherogenesis comprised 5117 proteins, 182 of which changed their expression status in response to vessel maturation and atherosclerotic plaque development. In the insoluble ECM proteome, 65 proteins significantly changed, including relevant collagens, matrix metalloproteinases and macrophage derived proteins. Among novel factors in atherosclerosis, we identified matrilin-2, the collagen IV crosslinking enzyme peroxidasin as well as the poorly characterized MAM-domain containing 2 (Mamdc2) protein as being up-regulated in the ECM during atherogenesis. Intriguingly, three subunits of the osteoclast specific V-ATPase complex were strongly increased in mature plaques with an enrichment in macrophages thus implying an active de-mineralization function. PMID:29208753
Chen, Yao-Yi; Dasari, Surendra; Ma, Ze-Qiang; Vega-Montoto, Lorenzo J.; Li, Ming
2013-01-01
Spectral counting has become a widely used approach for measuring and comparing protein abundance in label-free shotgun proteomics. However, when analyzing complex samples, the ambiguity of matching between peptides and proteins greatly affects the assessment of peptide and protein inventories, differentiation, and quantification. Meanwhile, the configuration of database searching algorithms that assign peptides to MS/MS spectra may produce different results in comparative proteomic analysis. Here, we present three strategies to improve comparative proteomics through spectral counting. We show that comparing spectral counts for peptide groups rather than for protein groups forestalls problems introduced by shared peptides. We demonstrate the advantage and flexibility of this new method in two datasets. We present four models to combine four popular search engines that lead to significant gains in spectral counting differentiation. Among these models, we demonstrate a powerful vote counting model that scales well for multiple search engines. We also show that semi-tryptic searching outperforms tryptic searching for comparative proteomics. Overall, these techniques considerably improve protein differentiation on the basis of spectral count tables. PMID:22552787
Chen, Yao-Yi; Dasari, Surendra; Ma, Ze-Qiang; Vega-Montoto, Lorenzo J; Li, Ming; Tabb, David L
2012-09-01
Spectral counting has become a widely used approach for measuring and comparing protein abundance in label-free shotgun proteomics. However, when analyzing complex samples, the ambiguity of matching between peptides and proteins greatly affects the assessment of peptide and protein inventories, differentiation, and quantification. Meanwhile, the configuration of database searching algorithms that assign peptides to MS/MS spectra may produce different results in comparative proteomic analysis. Here, we present three strategies to improve comparative proteomics through spectral counting. We show that comparing spectral counts for peptide groups rather than for protein groups forestalls problems introduced by shared peptides. We demonstrate the advantage and flexibility of this new method in two datasets. We present four models to combine four popular search engines that lead to significant gains in spectral counting differentiation. Among these models, we demonstrate a powerful vote counting model that scales well for multiple search engines. We also show that semi-tryptic searching outperforms tryptic searching for comparative proteomics. Overall, these techniques considerably improve protein differentiation on the basis of spectral count tables.
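The peptide-group idea in the abstract above can be illustrated with a minimal Python sketch. The abstract does not define "peptide group", so the sketch assumes one plausible reading: peptides that map to exactly the same set of proteins form one group, and spectral counts are summed per group. The function name, peptide strings, and protein identifiers are all hypothetical, and this is not the authors' implementation.

```python
from collections import defaultdict


def peptide_group_counts(peptide_to_proteins, spectral_counts):
    """Sum spectral counts per peptide group.

    Assumption (not from the abstract): a peptide group is the set of
    peptides that map to exactly the same set of proteins.
    """
    groups = defaultdict(int)
    for peptide, count in spectral_counts.items():
        # The set of matching proteins identifies the group.
        key = frozenset(peptide_to_proteins[peptide])
        groups[key] += count
    return dict(groups)


# Hypothetical toy data: PEPC is shared between proteins P1 and P2.
peptide_to_proteins = {
    "PEPA": ["P1"],
    "PEPB": ["P1"],
    "PEPC": ["P1", "P2"],
    "PEPD": ["P2"],
}
spectral_counts = {"PEPA": 5, "PEPB": 3, "PEPC": 4, "PEPD": 2}

counts = peptide_group_counts(peptide_to_proteins, spectral_counts)
for proteins, total in sorted(counts.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(proteins), total)
```

Under this reading, the shared peptide keeps its own group instead of being assigned to either protein, so its counts do not distort a comparison of P1 against P2.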
Barkla, Bronwyn J; Castellanos-Cervantes, Thelma; de León, José L Diaz; Matros, Andrea; Mock, Hans-Peter; Perez-Alfocea, Francisco; Salekdeh, Ghasem H; Witzel, Katja; Zörb, Christian
2013-06-01
Salinity is a major threat limiting the productivity of crop plants. A clear demand for improving the salinity tolerance of the major crop plants is imposed by the rapidly growing world population. This review summarizes the achievements of proteomic studies to elucidate the response mechanisms of selected model and crop plants to cope with salinity stress. We also aim at identifying research areas, which deserve increased attention in future proteome studies, as a prerequisite to identify novel targets for breeding strategies. Such areas include the impact of plant-microbial communities on the salinity tolerance of crops under field conditions, the importance of hormone signaling in abiotic stress tolerance, and the significance of control mechanisms underlying the observed changes in the proteome patterns. We briefly highlight the impact of novel tools for future proteome studies and argue for the use of integrated approaches. The evaluation of genetic resources by means of novel automated phenotyping facilities will have a large impact on the application of proteomics especially in combination with metabolomics or transcriptomics. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
A proteomic approach to obesity and type 2 diabetes.
López-Villar, Elena; Martos-Moreno, Gabriel Á; Chowen, Julie A; Okada, Shigeru; Kopchick, John J; Argente, Jesús
2015-07-01
The incidence of obesity and type 2 diabetes has increased dramatically, resulting in an increased interest in their biomedical relevance. However, the mechanisms that trigger the development of type 2 diabetes in obese patients remain largely unknown. Scientific, clinical and pharmaceutical communities are dedicating vast resources to unravel this issue by applying different omics tools. During the last decade, the advances in proteomic approaches and the Human Proteome Organization have opened and are opening a new door that may be helpful in the identification of patients at risk and to improve current therapies. Here, we briefly review some of the advances in our understanding of type 2 diabetes that have occurred through the application of proteomics. We also review, in detail, the current improvements in proteomic methodologies and new strategies that could be employed to further advance our understanding of this pathology. By applying these new proteomic advances, novel therapeutic and/or diagnostic protein targets will be discovered in the obesity/type 2 diabetes area. © 2015 The Authors. Journal of Cellular and Molecular Medicine published by John Wiley & Sons Ltd and Foundation for Cellular and Molecular Medicine.
An in-depth snake venom proteopeptidome characterization: Benchmarking Bothrops jararaca.
Nicolau, Carolina A; Carvalho, Paulo C; Junqueira-de-Azevedo, Inácio L M; Teixeira-Ferreira, André; Junqueira, Magno; Perales, Jonas; Neves-Ferreira, Ana Gisele C; Valente, Richard H
2017-01-16
A large-scale proteomic approach was devised to advance the understanding of venom composition. Bothrops jararaca venom was fractionated by OFFGEL followed by chromatography, generating peptidic and proteic fractions. The latter was submitted to trypsin digestion. Both fractions were separately analyzed by reversed-phase nanochromatography coupled to high resolution mass spectrometry. This strategy allowed deeper and joint characterizations of the peptidome and proteome (proteopeptidome) of this venom. Our results led to the identification of 46 protein classes (with several uniquely assigned proteins per class) comprising eight high-abundance bona fide venom components, and 38 additional classes in smaller quantities. This last category included previously described B. jararaca venom proteins, common Elapidae venom constituents (cobra venom factor and three-finger toxin), and proteins typically encountered in lysosomes, cellular membranes and blood plasma. Furthermore, this report is the most complete snake venom peptidome described so far, both in number of peptides and in variety of unique proteins that could have originated them. It is hypothesized that such diversity could enclose cryptides, whose bioactivities would contribute to envenomation in yet undetermined ways. Finally, we propose that the broad range screening of B. jararaca peptidome will facilitate the discovery of bioactive molecules, eventually leading to valuable therapeutical agents. Our proteopeptidomic strategy yielded unprecedented insights into the remarkable diversity of B. jararaca venom composition, both at the peptide and protein levels. These results bring a substantial contribution to the current pursuit of large-scale protein-level assignment in snake venomics.
The detection of typical elapidic venom components in a Viperidae venom reinforces our view that the use of this approach (hand-in-hand with transcriptomic and genomic data) for venom proteomic analysis, at the specimen level, can greatly contribute to venom toxin evolution studies. Furthermore, data were generated in support of a previous hypothesis that venom gland secretory vesicles are specialized forms of lysosomes. Two testable hypotheses also emerge from the results of this work. The first is that a nucleobindin-2-derived protein could lead to prey disorientation during envenomation, aiding in its capture by the snake. The other is that the venom's peptidome might contain a population of cryptides, whose biological activities could lead to the development of new therapeutical agents. Copyright © 2016 Elsevier B.V. All rights reserved.
Picotti, Paola; Clement-Ziza, Mathieu; Lam, Henry; Campbell, David S.; Schmidt, Alexander; Deutsch, Eric W.; Röst, Hannes; Sun, Zhi; Rinner, Oliver; Reiter, Lukas; Shen, Qin; Michaelson, Jacob J.; Frei, Andreas; Alberti, Simon; Kusebauch, Ulrike; Wollscheid, Bernd; Moritz, Robert; Beyer, Andreas; Aebersold, Ruedi
2013-01-01
Complete reference maps or datasets, like the genomic map of an organism, are highly beneficial tools for biological and biomedical research. Attempts to generate such reference datasets for a proteome have so far failed to reach complete proteome coverage, with saturation apparent at approximately two thirds of the proteomes tested, even for the most thoroughly characterized proteomes. Here, we used a strategy based on high-throughput peptide synthesis and mass spectrometry to generate a close to complete reference map (97% of the genome-predicted proteins) of the S. cerevisiae proteome. We generated two versions of this mass spectrometric map, one supporting discovery-driven (shotgun) and the other hypothesis-driven (targeted) proteomic measurements. The two versions of the map, therefore, constitute a complete set of proteomic assays to support most studies performed with contemporary proteomic technologies. The reference libraries can be browsed via a web-based repository and associated navigation tools. To demonstrate the utility of the reference libraries we applied them to a protein quantitative trait locus (pQTL) analysis, which requires measurement of the same peptides over a large number of samples with high precision. Protein measurements over a set of 78 S. cerevisiae strains revealed a complex relationship between independent genetic loci, impacting on the levels of related proteins. Our results suggest that selective pressure favors the acquisition of sets of polymorphisms that maintain the stoichiometry of protein complexes and pathways. PMID:23334424
Ramus, Claire; Hovasse, Agnès; Marcellin, Marlène; Hesse, Anne-Marie; Mouton-Barbosa, Emmanuelle; Bouyssié, David; Vaca, Sebastian; Carapito, Christine; Chaoui, Karima; Bruley, Christophe; Garin, Jérôme; Cianférani, Sarah; Ferro, Myriam; Van Dorssaeler, Alain; Burlet-Schiltz, Odile; Schaeffer, Christine; Couté, Yohann; Gonzalez de Peredo, Anne
2016-01-30
Proteomic workflows based on nanoLC-MS/MS data-dependent-acquisition analysis have progressed tremendously in recent years. High-resolution and fast sequencing instruments have enabled the use of label-free quantitative methods, based either on spectral counting or on MS signal analysis, which appear as an attractive way to analyze differential protein expression in complex biological samples. However, the computational processing of the data for label-free quantification still remains a challenge. Here, we used a proteomic standard composed of an equimolar mixture of 48 human proteins (Sigma UPS1) spiked at different concentrations into a background of yeast cell lysate to benchmark several label-free quantitative workflows, involving different software packages developed in recent years. This experimental design allowed us to finely assess their performance in terms of sensitivity and false discovery rate, by measuring the numbers of true and false positives (respectively, UPS1 or yeast background proteins found as differential). The spiked standard dataset has been deposited to the ProteomeXchange repository with the identifier PXD001819 and can be used to benchmark other label-free workflows, adjust software parameter settings, improve algorithms for extraction of the quantitative metrics from raw MS data, or evaluate downstream statistical methods. Bioinformatic pipelines for label-free quantitative analysis must be objectively evaluated in their ability to detect variant proteins with good sensitivity and low false discovery rate in large-scale proteomic studies. This can be done through the use of complex spiked samples, for which the "ground truth" of variant proteins is known, allowing a statistical evaluation of the performances of the data processing workflow.
We provide here such a controlled standard dataset and used it to evaluate the performances of several label-free bioinformatics tools (including MaxQuant, Skyline, MFPaQ, IRMa-hEIDI and Scaffold) in different workflows, for detection of variant proteins with different absolute expression levels and fold change values. The dataset presented here can be useful for tuning software tool parameters, and also testing new algorithms for label-free quantitative analysis, or for evaluation of downstream statistical methods. Copyright © 2015 Elsevier B.V. All rights reserved.
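The benchmarking idea described in this abstract (spiked UPS1 proteins as known positives, yeast background proteins as known negatives) can be illustrated with a minimal Python sketch. The function name, the protein identifiers, and the use of the observed false discovery proportion are assumptions made for illustration; this is not the scoring code used by the authors or by any of the tools they compared.

```python
def benchmark_calls(called_differential, spiked_proteins):
    """Score a list of proteins called differential against a spiked ground truth.

    called_differential: identifiers reported as differential by some workflow.
    spiked_proteins: identifiers of the spiked standard (the true variants).
    Both inputs are hypothetical; any hashable identifiers work.
    """
    called = set(called_differential)
    truth = set(spiked_proteins)

    true_pos = called & truth    # spiked proteins correctly called
    false_pos = called - truth   # background proteins wrongly called

    # Sensitivity: fraction of the spiked proteins that were recovered.
    sensitivity = len(true_pos) / len(truth) if truth else 0.0
    # Observed false discovery proportion among all calls.
    fdp = len(false_pos) / len(called) if called else 0.0

    return {
        "true_positives": len(true_pos),
        "false_positives": len(false_pos),
        "sensitivity": sensitivity,
        "false_discovery_proportion": fdp,
    }


if __name__ == "__main__":
    # Toy identifiers, not taken from the PXD001819 dataset.
    spiked = {"UPS_A", "UPS_B", "UPS_C", "UPS_D"}
    calls = ["UPS_A", "UPS_B", "UPS_C", "YEAST_1"]
    print(benchmark_calls(calls, spiked))
```

Running the same scoring over the output of each workflow would give a like-for-like comparison, provided every workflow is evaluated against the same list of spiked proteins.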
Kuuskeri, Jaana; Häkkinen, Mari; Laine, Pia; Smolander, Olli-Pekka; Tamene, Fitsum; Miettinen, Sini; Nousiainen, Paula; Kemell, Marianna; Auvinen, Petri; Lundell, Taina
2016-01-01
The white-rot Agaricomycetes species Phlebia radiata is an efficient wood-decaying fungus degrading all wood components, including cellulose, hemicellulose, and lignin. We cultivated P. radiata in solid state cultures on spruce wood, and extended the experiment to 6 weeks to gain more knowledge on the time-scale dynamics of protein expression upon growth and wood decay. Total proteome and transcriptome of P. radiata were analyzed by peptide LC-MS/MS and RNA sequencing at specific time points to study the enzymatic machinery on the fungus' natural growth substrate. According to proteomics analyses, several CAZy oxidoreductase class-II peroxidases with glyoxal and alcohol oxidases were the most abundant proteins produced on wood together with enzymes important for cellulose utilization, such as GH7 and GH6 cellobiohydrolases. Transcriptome additionally displayed expression of multiple AA9 lytic polysaccharide monooxygenases indicative of oxidative cleavage of wood carbohydrate polymers. Large differences were observed for individual protein quantities at specific time points, with a tendency of enhanced production of specific peroxidases in the first 2 weeks of growth on wood. Among the 10 class-II peroxidases, new MnP1-long, characterized MnP2-long and LiP3 were produced in high protein abundances, while LiP2 and LiP1 were upregulated at highest level as transcripts on wood together with the oxidases and one acetyl xylan esterase, implying their necessity as primary enzymes to function against coniferous wood lignin to gain carbohydrate accessibility and fungal growth. The majority of the CAZy-encoding transcripts upregulated on spruce wood represented activities against plant cell wall and were identified in the proteome, comprising main activities of white-rot decay. Our data indicate significant changes in carbohydrate-active enzyme expression during the six-week surveillance of P. radiata growing on wood.
Response to wood substrate is seen already during the first weeks. The immediate oxidative enzyme action on lignin and wood cell walls is supported by detected lignin substructure sidechain cleavages, release of phenolic units, and visual changes in xylem cell wall ultrastructure. This study contributes to increasing knowledge on fungal genetics and lignocellulose bioconversion pathways, allowing us to head for systems biology, development of biofuel production, and industrial applications on plant biomass utilizing wood-decay fungi.
Krishnan, Hari B; Natarajan, Savithiry S; Oehrle, Nathan W; Garrett, Wesley M; Darwish, Omar
2017-06-14
Pigeonpea is one of the major sources of dietary protein for more than a billion people living in South Asia. This hardy legume is often grown in low-input and risk-prone marginal environments. Considerable research effort has been devoted by a global research consortium to develop genomic resources for the improvement of this legume crop. These efforts have resulted in the elucidation of the complete genome sequence of pigeonpea. Despite these developments, little is known about the seed proteome of this important crop. Here, we report the proteome of pigeonpea seed. To enable the isolation of the maximum number of seed proteins, including those that are present in very low amounts, three different protein fractions were obtained by employing different extraction media. High-resolution two-dimensional (2-D) electrophoresis followed by MALDI-TOF-TOF-MS/MS analysis of these protein fractions resulted in the identification of 373 pigeonpea seed proteins. Consistent with the reported high degree of synteny between the pigeonpea and soybean genomes, a large number of pigeonpea seed proteins exhibited significant amino acid homology with soybean seed proteins. Our proteomic analysis identified a large number of stress-related proteins, presumably due to the crop's adaptation to drought-prone environments. The availability of a pigeonpea seed proteome reference map should shed light on the roles of these identified proteins in various biological processes and facilitate the improvement of seed composition.
A reference guide for tree analysis and visualization
2010-01-01
The quantities of data obtained by the new high-throughput technologies, such as microarrays or ChIP-Chip arrays, and the large-scale OMICS-approaches, such as genomics, proteomics and transcriptomics, are becoming vast. Sequencing technologies are becoming cheaper and easier to use and, thus, large-scale evolutionary studies towards the origins of life for all species and their evolution are becoming more and more challenging. Databases holding information about how data are related and how they are hierarchically organized expand rapidly. Clustering analysis is becoming more and more difficult to apply to very large amounts of data since the results of these algorithms cannot be efficiently visualized. Most of the available visualization tools that are able to represent such hierarchies project data in 2D and often lack the necessary user friendliness and interactivity. For example, the current phylogenetic tree visualization tools are not able to display large-scale trees with more than a few thousand nodes in an easy-to-understand way. In this study, we review tools that are currently available for the visualization of biological trees and analysis, mainly developed during the last decade. We describe the uniform and standard computer readable formats to represent tree hierarchies and we comment on the functionality and the limitations of these tools. We also discuss how these tools can be developed further and should become integrated with various data sources. Here we focus on freely available software that offers to the users various tree-representation methodologies for biological data analysis. PMID:20175922
DOE Office of Scientific and Technical Information (OSTI.GOV)
Patil, Rajreddy; Kumar, B. Mohana; Lee, Won-Jae
Dental tissues provide an alternative autologous source of mesenchymal stem cells (MSCs) for regenerative medicine. In this study, we isolated human dental MSCs of follicle, pulp and papilla tissue from a single donor tooth after impacted third molar extraction by excluding the individual differences. We then compared the morphology, proliferation rate, expression of MSC-specific and pluripotency markers, and in vitro differentiation ability into osteoblasts, adipocytes, chondrocytes and functional hepatocyte-like cells (HLCs). Finally, we analyzed the protein expression profiles of undifferentiated dental MSCs using 2DE coupled with MALDI-TOF-MS. Three types of dental MSCs largely shared similar morphology, proliferation potential, expression of surface markers and pluripotent transcription factors, and differentiation ability into osteoblasts, adipocytes, and chondrocytes. Upon hepatogenic induction, all MSCs were transdifferentiated into functional HLCs, and acquired hepatocyte functions by showing their ability for glycogen storage and urea production. Based on the proteome profiling results, we identified nineteen proteins either found commonly or differentially expressed among the three types of dental MSCs. In conclusion, three kinds of dental MSCs from a single donor tooth possessed largely similar cellular properties and multilineage potential. Further, these dental MSCs had similar proteomic profiles, suggesting their interchangeable applications for basic research and cell therapy. - Highlights: • Isolated and characterized three types of human dental MSCs from a single donor. • MSCs of dental follicle, pulp and papilla had largely similar biological properties. • All MSCs were capable of transdifferentiating into functional hepatocyte-like cells. • 2DE proteomics with MALDI-TOF/MS identified 19 proteins in three types of MSCs. • Similar proteomic profiles suggest interchangeable applications of dental MSCs.
Surinova, Silvia; Hüttenhain, Ruth; Chang, Ching-Yun; Espona, Lucia; Vitek, Olga; Aebersold, Ruedi
2013-08-01
Targeted proteomics based on selected reaction monitoring (SRM) mass spectrometry is commonly used for accurate and reproducible quantification of protein analytes in complex biological mixtures. Strictly hypothesis-driven, SRM assays quantify each targeted protein by collecting measurements on its peptide fragment ions, called transitions. To achieve sensitive and accurate quantitative results, experimental design and data analysis must consistently account for the variability of the quantified transitions. This consistency is especially important in large experiments, which increasingly require profiling up to hundreds of proteins over hundreds of samples. Here we describe a robust and automated workflow for the analysis of large quantitative SRM data sets that integrates data processing, statistical protein identification and quantification, and dissemination of the results. The integrated workflow combines three software tools: mProphet for peptide identification via probabilistic scoring; SRMstats for protein significance analysis with linear mixed-effect models; and PASSEL, a public repository for storage, retrieval and query of SRM data. The input requirements for the protocol are files with SRM traces in mzXML format, and a file with a list of transitions in a text tab-separated format. The protocol is especially suited for data with heavy isotope-labeled peptide internal standards. We demonstrate the protocol on a clinical data set in which the abundances of 35 biomarker candidates were profiled in 83 blood plasma samples of subjects with ovarian cancer or benign ovarian tumors. The time frame to realize the protocol is 1-2 weeks, depending on the number of replicates used in the experiment.
Gray, Michael W
2015-08-18
Comparative studies of the mitochondrial proteome have identified a conserved core of proteins descended from the α-proteobacterial endosymbiont that gave rise to the mitochondrion and was the source of the mitochondrial genome in contemporary eukaryotes. A surprising result of phylogenetic analyses is the relatively small proportion (10-20%) of the mitochondrial proteome displaying a clear α-proteobacterial ancestry. A large fraction of mitochondrial proteins typically has detectable homologs only in other eukaryotes and is presumed to represent proteins that emerged specifically within eukaryotes. A further significant fraction of the mitochondrial proteome consists of proteins with homologs in prokaryotes, but without a robust phylogenetic signal affiliating them with specific prokaryotic lineages. The presumptive evolutionary source of these proteins is quite different in contending models of mitochondrial origin.
Pont, Frédéric; Fournié, Jean Jacques
2010-03-01
MS, the reference technology for proteomics, routinely produces large numbers of protein lists whose fast comparison would prove very useful. Unfortunately, most software tools only allow comparisons of two to three lists at once. We introduce here nwCompare, a simple tool for n-way comparison of several protein lists without any query language, and exemplify its use with differential and shared cancer cell proteomes. As the software compares character strings, it can be applied to any type of data mining, such as genomic or metabolomic datalists.
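As a rough illustration of what an n-way comparison of string lists involves, the following Python sketch groups every identifier by the exact set of lists that contain it. It is a generic set-based approach written for this listing, not the algorithm or interface of nwCompare; the function name and the sample identifiers are hypothetical.

```python
from collections import defaultdict


def n_way_compare(named_lists):
    """Group items by the exact combination of lists in which they occur.

    named_lists: dict mapping a list name to an iterable of strings.
    Returns a dict mapping frozenset(list names) -> set of items found in
    exactly those lists and no others.
    """
    # For each item, record the names of the lists that contain it.
    membership = defaultdict(set)
    for name, items in named_lists.items():
        for item in items:
            membership[item.strip()].add(name)

    # Invert the mapping: one group per distinct combination of lists.
    groups = defaultdict(set)
    for item, names in membership.items():
        groups[frozenset(names)].add(item)
    return dict(groups)


if __name__ == "__main__":
    # Toy protein lists for three hypothetical samples.
    lists = {
        "A": ["P1", "P2", "P3"],
        "B": ["P2", "P3", "P4"],
        "C": ["P3", "P5"],
    }
    for names, items in n_way_compare(lists).items():
        print(sorted(names), sorted(items))
```

Because the comparison is done on plain strings, the same function works for gene symbols or metabolite names, which matches the abstract's remark about applicability beyond proteomics. Items shared by all lists appear under the key containing every list name, and items unique to one list appear under a single-name key.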
Tissue proteomics of the low-molecular weight proteome using an integrated cLC-ESI-QTOFMS approach.
Alvarez, MeiHwa Tanielle Bench; Shah, Dipti Jigar; Thulin, Craig D; Graves, Steven W
2013-05-01
Analysis of the protein/peptide composition of tissue has provided meaningful insights into tissue biology and even disease mechanisms. However, little has been published regarding top down methods to investigate lower molecular weight (MW) (500-5000 Da) species in tissue. Here, we evaluate a tissue proteomics approach involving tissue homogenization followed by depletion of large proteins and then cLC-MS (where c stands for capillary) analysis to interrogate the low MW/low abundance tissue proteome. In the development of this method, sheep heart, lung, liver, kidney, and spleen were surveyed to test our ability to observe tissue differences. After categorical tissue differences were demonstrated, a detailed study of this method's reproducibility was undertaken to determine whether or not it is suitable for analyzing more subtle differences in the abundance of small proteins and peptides. Our results suggest that this method should be useful in exploring the low MW proteome of tissues. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Ararso, Zewdu; Ma, Chuan; Qi, Yuping; Feng, Mao; Han, Bin; Hu, Han; Meng, Lifeng; Li, Jianke
2018-01-05
Hemolymph is vital for the immunity of honeybees and offers a way to investigate their physiological status. To gain novel insight into the functionality and molecular details of the hemolymph in driving increased Royal Jelly (RJ) production, we characterized and compared hemolymph proteomes across the larval and adult ages of Italian bees (ITbs) and Royal Jelly bees (RJbs), a stock selected from ITbs for increasing RJ output. Unprecedented in-depth proteome was attained with the identification of 3394 hemolymph proteins in both bee lines. The changes in proteome support the general function of hemolymph to drive development and immunity across different ages. However, age-specific proteome settings have adapted to prime the distinct physiology for larvae and adult bees. In larvae, the proteome is thought to drive temporal immunity, rapid organogenesis, and reorganization of larval structures. In adults, the proteome plays key roles in prompting tissue development and immune defense in newly emerged bees, in gland maturity in nurse bees, and in carbohydrate energy production in forager bees. Between larval and adult samples of the same age, RJbs and ITbs have tailored distinct hemolymph proteome programs to drive their physiology. In particular, in day 4 larvae and nurse bees, a large number of highly abundant proteins are enriched in protein synthesis and energy metabolism in RJbs. This implies that they have adapted their proteome to initiate different developmental trajectories and high RJ secretion in response to selection for enhanced RJ production. Our hitherto unexplored in-depth proteome coverage provides novel insight into molecular details that drive hemolymph function and high RJ production by RJbs.
Neural Stem Cells (NSCs) and Proteomics*
Shoemaker, Lorelei D.; Kornblum, Harley I.
2016-01-01
Neural stem cells (NSCs) can self-renew and give rise to the major cell types of the CNS. Studies of NSCs include the investigation of primary, CNS-derived cells as well as animal and human embryonic stem cell (ESC)-derived and induced pluripotent stem cell (iPSC)-derived sources. NSCs provide a means with which to study normal neural development, neurodegeneration, and neurological disease and are clinically relevant sources for cellular repair to the damaged and diseased CNS. Proteomics studies of NSCs have the potential to delineate molecules and pathways critical for NSC biology and the means by which NSCs can participate in neural repair. In this review, we provide a background to NSC biology, including the means to obtain them and the caveats to these processes. We then focus on advances in the proteomic interrogation of NSCs. This includes the analysis of posttranslational modifications (PTMs); approaches to analyzing different proteomic compartments, such as the secretome; as well as approaches to analyzing temporal differences in the proteome to elucidate mechanisms of differentiation. We also discuss some of the methods that will undoubtedly be useful in the investigation of NSCs but which have not yet been applied to the field. While many proteomics studies of NSCs have largely catalogued the proteome or posttranslational modifications of specific cellular states, without delving into specific functions, some have led to understandings of functional processes or identified markers that could not have been identified via other means. Many challenges remain in the field, including the precise identification and standardization of NSCs used for proteomic analyses, as well as how to translate fundamental proteomics studies to functional biology. The next level of investigation will require interdisciplinary approaches, combining the skills of those interested in the biochemistry of proteomics with those interested in modulating NSC function. PMID:26494823
Havugimana, Pierre C; Hu, Pingzhao; Emili, Andrew
2017-10-01
Elucidation of the networks of physical (functional) interactions present in cells and tissues is fundamental for understanding the molecular organization of biological systems, the mechanistic basis of essential and disease-related processes, and for functional annotation of previously uncharacterized proteins (via guilt-by-association or -correlation). After a decade in the field, we felt it timely to document our own experiences in the systematic analysis of protein interaction networks. Areas covered: Researchers worldwide have contributed innovative experimental and computational approaches that have driven the rapidly evolving field of 'functional proteomics'. These include mass spectrometry-based methods to characterize macromolecular complexes on a global-scale and sophisticated data analysis tools - most notably machine learning - that allow for the generation of high-quality protein association maps. Expert commentary: Here, we recount some key lessons learned, with an emphasis on successful workflows, and challenges, arising from our own and other groups' ongoing efforts to generate, interpret and report proteome-scale interaction networks in increasingly diverse biological contexts.
Keren, Leeat; Segal, Eran; Milo, Ron
2016-01-01
Most proteins show changes in level across growth conditions. Many of these changes seem to be coordinated with the specific growth rate rather than the growth environment or the protein function. Although cellular growth rates, gene expression levels and gene regulation have been at the center of biological research for decades, there are only a few models giving a baseline prediction of the dependence of the proteome fraction occupied by a gene on the specific growth rate. We present a simple model that predicts a widely coordinated increase in the fraction of many proteins out of the proteome, proportionally with the growth rate. The model reveals how passive redistribution of resources, due to active regulation of only a few proteins, can have proteome-wide effects that are quantitatively predictable. Our model provides a potential explanation for why and how such a coordinated response of a large fraction of the proteome to the specific growth rate arises under different environmental conditions. The simplicity of our model can also be useful by serving as a baseline null hypothesis in the search for active regulation. We exemplify the usage of the model by analyzing the relationship between growth rate and proteome composition for the model microorganism E. coli as reflected in recent proteomics data sets spanning various growth conditions. We find that the fraction out of the proteome of a large number of proteins, and from different cellular processes, increases proportionally with the growth rate. Notably, ribosomal proteins, which have been previously reported to increase in fraction with growth rate, are only a small part of this group of proteins. We suggest that, although the fractions of many proteins change with the growth rate, such changes may be partially driven by a global effect, not necessarily requiring specific cellular control mechanisms. PMID:27073913
Mühlhaus, Timo; Weiss, Julia; Hemme, Dorothea; Sommer, Frederik; Schroda, Michael
2011-01-01
Crop-plant-yield safety is jeopardized by temperature stress caused by the global climate change. To take countermeasures by breeding and/or transgenic approaches it is essential to understand the mechanisms underlying plant acclimation to heat stress. To this end proteomics approaches are most promising, as acclimation is largely mediated by proteins. Accordingly, several proteomics studies, mainly based on two-dimensional gel-tandem MS approaches, were conducted in the past. However, results often were inconsistent, presumably attributable to artifacts inherent to the display of complex proteomes via two-dimensional-gels. We describe here a new approach to monitor proteome dynamics in time course experiments. This approach involves full 15N metabolic labeling and mass spectrometry based quantitative shotgun proteomics using a uniform 15N standard over all time points. It comprises a software framework, IOMIQS, that features batch job mediated automated peptide identification by four parallelized search engines, peptide quantification and data assembly for the processing of large numbers of samples. We have applied this approach to monitor proteome dynamics in a heat stress time course using the unicellular green alga Chlamydomonas reinhardtii as model system. We were able to identify 3433 Chlamydomonas proteins, of which 1116 were quantified in at least three of five time points of the time course. Statistical analyses revealed that levels of 38 proteins significantly increased, whereas levels of 206 proteins significantly decreased during heat stress. The increasing proteins comprise 25 (co-)chaperones and 13 proteins involved in chromatin remodeling, signal transduction, apoptosis, photosynthetic light reactions, and yet unknown functions. 
Proteins decreasing during heat stress were significantly enriched in functional categories that mediate carbon flux from CO2 and external acetate into protein biosynthesis, which also correlated with a rapid, but fully reversible cell cycle arrest after onset of stress. Our approach opens up new perspectives for plant systems biology and provides novel insights into plant stress acclimation. PMID:21610104
Yang, Xia; Zhang, Zichang; Gu, Tao; Dong, Mingchao; Peng, Qiong; Bai, Lianyang; Li, Yongfeng
2017-01-06
Barnyardgrass (Echinochloa crus-galli) is one of the top 15 herbicide-resistant weeds around the world that interferes with rice growth, resulting in major losses of rice yield. Thus, multi-herbicide resistance in barnyardgrass presents a major threat, with the underlying mechanisms that contribute to resistance requiring elucidation. In an attempt to characterize this multi-herbicide resistance at the proteomic level, comparative analysis of resistant and susceptible barnyardgrasses was performed using iTRAQ, both with and without quinclorac, bispyribac-sodium and penoxsulam herbicidal treatment. A total of 1342 protein species were identified from 2248 unique peptides by searching the UniProt database and conducting data analysis. Approximately 904 protein species with 4774 Gene Ontology (GO) terms were grouped into the categories of biological process, cellular component and molecular function. Among these, 688 protein species were annotated into 1583 KEGG pathways, with 980 protein species relating to metabolism and 93 relating to environmental information processing. A total of 292 protein species showed more than a 1.2-fold change in abundance in the resistant biotype relative to the susceptible biotype. Furthermore, herbicide treatment resulted in 157 protein species that showed more than a 1.2-fold change in the resistant biotype. Moreover, physiological analyses demonstrated an ecological fitness cost in the resistant biotype. While some studies have shown a fitness cost to be associated with an altered ecological interaction, our understanding of the fitness costs associated with herbicide resistance is limited. Herein, physiological and proteomic analysis demonstrates herbicide resistance associated ecological fitness cost and potential mechanisms of herbicide-resistance in resistant biotypes of E. crus-galli. The results presented herein have revealed differences in ecological adaptation between resistant and susceptible biotypes in E. crus-galli and provide a fundamental basis enabling the development of new strategies for weed control. Lastly, this is the first large-scale proteomics study to examine herbicide stress responses in different barnyardgrass biotypes. Copyright © 2016 Elsevier B.V. All rights reserved.
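The 1.2-fold change criterion mentioned in this abstract can be sketched in Python as a simple ratio filter. The abstract does not state whether the cut-off was applied symmetrically or combined with a significance test, so the symmetric treatment below (ratio above 1.2 or below 1/1.2) is an assumption, as are the function name and the example ratios.

```python
def changed_proteins(ratios, threshold=1.2):
    """Return the proteins whose abundance ratio exceeds a fold-change cut-off.

    ratios: dict mapping a protein identifier to its resistant/susceptible
            abundance ratio (linear scale, must be positive to be considered).
    threshold: fold-change cut-off; assumed here to apply in both directions.
    """
    lower = 1.0 / threshold
    return {
        protein: ratio
        for protein, ratio in ratios.items()
        if ratio > 0 and (ratio > threshold or ratio < lower)
    }


if __name__ == "__main__":
    # Hypothetical ratios, not values from the study.
    example = {"prot_a": 1.5, "prot_b": 1.1, "prot_c": 0.5, "prot_d": 0.9}
    print(sorted(changed_proteins(example)))
```

In practice a fold-change filter of this kind is usually paired with a statistical test across replicates, since a small cut-off such as 1.2 is easily exceeded by measurement noise alone.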
Chowdhury, Saiful M; Zhu, Xuewei; Aloor, Jim J; Azzam, Kathleen M; Gabor, Kristin A; Ge, William; Addo, Kezia A; Tomer, Kenneth B; Parks, John S; Fessler, Michael B
2015-07-01
Lipid raft membrane microdomains organize signaling by many prototypical receptors, including the Toll-like receptors (TLRs) of the innate immune system. Raft-localization of proteins is widely thought to be regulated by raft cholesterol levels, but this is largely on the basis of studies that have manipulated cell cholesterol using crude and poorly specific chemical tools, such as β-cyclodextrins. To date, there has been no proteome-scale investigation of whether endogenous regulators of intracellular cholesterol trafficking, such as the ATP binding cassette (ABC)A1 lipid efflux transporter, regulate targeting of proteins to rafts. Abca1(-/-) macrophages have cholesterol-laden rafts that have been reported to contain increased levels of select proteins, including TLR4, the lipopolysaccharide receptor. Here, using quantitative proteomic profiling, we identified 383 proteins in raft isolates from Abca1(+/+) and Abca1(-/-) macrophages. ABCA1 deletion induced wide-ranging changes to the raft proteome. Remarkably, many of these changes were similar to those seen in Abca1(+/+) macrophages after lipopolysaccharide exposure. Stomatin-like protein (SLP)-2, a member of the stomatin-prohibitin-flotillin-HflK/C family of membrane scaffolding proteins, was robustly and specifically increased in Abca1(-/-) rafts. Pursuing SLP-2 function, we found that rafts of SLP-2-silenced macrophages had markedly abnormal composition. SLP-2 silencing did not compromise ABCA1-dependent cholesterol efflux but reduced macrophage responsiveness to multiple TLR ligands. This was associated with reduced raft levels of the TLR co-receptor, CD14, and defective lipopolysaccharide-induced recruitment of the common TLR adaptor, MyD88, to rafts. Taken together, we show that the lipid transporter ABCA1 regulates the protein repertoire of rafts and identify SLP-2 as an ABCA1-dependent regulator of raft composition and of the innate immune response. 
© 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
Chowdhury, Saiful M.; Zhu, Xuewei; Aloor, Jim J.; Azzam, Kathleen M.; Gabor, Kristin A.; Ge, William; Addo, Kezia A.; Tomer, Kenneth B.; Parks, John S.; Fessler, Michael B.
2015-01-01
Lipid raft membrane microdomains organize signaling by many prototypical receptors, including the Toll-like receptors (TLRs) of the innate immune system. Raft-localization of proteins is widely thought to be regulated by raft cholesterol levels, but this is largely on the basis of studies that have manipulated cell cholesterol using crude and poorly specific chemical tools, such as β-cyclodextrins. To date, there has been no proteome-scale investigation of whether endogenous regulators of intracellular cholesterol trafficking, such as the ATP binding cassette (ABC)A1 lipid efflux transporter, regulate targeting of proteins to rafts. Abca1−/− macrophages have cholesterol-laden rafts that have been reported to contain increased levels of select proteins, including TLR4, the lipopolysaccharide receptor. Here, using quantitative proteomic profiling, we identified 383 proteins in raft isolates from Abca1+/+ and Abca1−/− macrophages. ABCA1 deletion induced wide-ranging changes to the raft proteome. Remarkably, many of these changes were similar to those seen in Abca1+/+ macrophages after lipopolysaccharide exposure. Stomatin-like protein (SLP)-2, a member of the stomatin-prohibitin-flotillin-HflK/C family of membrane scaffolding proteins, was robustly and specifically increased in Abca1−/− rafts. Pursuing SLP-2 function, we found that rafts of SLP-2-silenced macrophages had markedly abnormal composition. SLP-2 silencing did not compromise ABCA1-dependent cholesterol efflux but reduced macrophage responsiveness to multiple TLR ligands. This was associated with reduced raft levels of the TLR co-receptor, CD14, and defective lipopolysaccharide-induced recruitment of the common TLR adaptor, MyD88, to rafts. Taken together, we show that the lipid transporter ABCA1 regulates the protein repertoire of rafts and identify SLP-2 as an ABCA1-dependent regulator of raft composition and of the innate immune response. PMID:25910759
Placental Proteomics: A Shortcut to Biological Insight
Robinson, John M.; Vandré, Dale D.; Ackerman, William E.
2012-01-01
Proteomics analysis of biological samples has the potential to identify novel protein expression patterns and/or changes in protein expression patterns in different developmental or disease states. An important component of successful proteomics research, at least in its present form, is to reduce the complexity of the sample if it is derived from cells or tissues. One method to simplify complex tissues is to focus on a specific, highly purified sub-proteome. Using this approach we have developed methods to prepare highly enriched fractions of the apical plasma membrane of the syncytiotrophoblast. Through proteomics analysis of this fraction we have identified over five hundred proteins, several of which were previously not known to reside in the syncytiotrophoblast. Herein, we focus on two of these, dysferlin and myoferlin. These proteins, largely known from studies of skeletal muscle, may not have been found in the human placenta were it not for discovery-based proteomics analysis. This new knowledge, acquired through a discovery-driven approach, can now be applied for the generation of hypothesis-based experimentation. Thus discovery-based and hypothesis-based research are complementary approaches that, when coupled together, can hasten scientific discoveries. PMID:19070895
The Pacific Northwest National Laboratory library of bacterial and archaeal proteomic biodiversity
Payne, Samuel H.; Monroe, Matthew E.; Overall, Christopher C.; ...
2015-08-18
This dataset deposition announces the submission to public repositories of the PNNL Biodiversity Library, a large collection of global proteomics data for 112 bacterial and archaeal organisms. The data comprises 35,162 tandem mass spectrometry (MS/MS) datasets from ~10 years of research. All data has been searched, annotated and organized in a consistent manner to promote reuse by the community. Protein identifications were cross-referenced with KEGG functional annotations which allows for pathway oriented investigation. We present the data as a freely available community resource. A variety of data re-use options are described for computational modeling, proteomics assay design and bioengineering. Instrument data and analysis files are available at ProteomeXchange via the MassIVE partner repository under the identifiers PXD001860 and MSV000079053.
The Pacific Northwest National Laboratory library of bacterial and archaeal proteomic biodiversity
DOE Office of Scientific and Technical Information (OSTI.GOV)
Payne, Samuel H.; Monroe, Matthew E.; Overall, Christopher C.
This dataset deposition announces the submission to public repositories of the PNNL Biodiversity Library, a large collection of global proteomics data for 112 bacterial and archaeal organisms. The data comprises 35,162 tandem mass spectrometry (MS/MS) datasets from ~10 years of research. All data has been searched, annotated and organized in a consistent manner to promote reuse by the community. Protein identifications were cross-referenced with KEGG functional annotations which allows for pathway oriented investigation. We present the data as a freely available community resource. A variety of data re-use options are described for computational modeling, proteomics assay design and bioengineering. Instrument data and analysis files are available at ProteomeXchange via the MassIVE partner repository under the identifiers PXD001860 and MSV000079053.
Platelet proteomics: from discovery to diagnosis.
Looße, Christina; Swieringa, Frauke; Heemskerk, Johan W M; Sickmann, Albert; Lorenz, Christin
2018-05-22
Platelets are the smallest cells within the circulating blood with key roles in physiological haemostasis and pathological thrombosis regulated by the onset of activating/inhibiting processes via receptor responses and signalling cascades. Areas covered: Proteomics as well as genomic approaches have been fundamental in identifying and quantifying potential targets for future diagnostic strategies in the prevention of bleeding and thrombosis, and uncovering the complexity of platelet functions in health and disease. In this article, we provide a critical overview on current functional tests used in diagnostics and the future perspectives for platelet proteomics in clinical applications. Expert commentary: Proteomics represents a valuable tool for the identification of patients with diverse platelet associated defects. In-depth validation of identified biomarkers, e.g. receptors, signalling proteins, post-translational modifications, in large cohorts is decisive for translation into routine clinical diagnostics.
Review of Software Tools for Design and Analysis of Large scale MRM Proteomic Datasets
Colangelo, Christopher M.; Chung, Lisa; Bruce, Can; Cheung, Kei-Hoi
2013-01-01
Selective or Multiple Reaction monitoring (SRM/MRM) is a liquid-chromatography (LC)/tandem-mass spectrometry (MS/MS) method that enables the quantitation of specific proteins in a sample by analyzing precursor ions and the fragment ions of their selected tryptic peptides. Instrumentation software has advanced to the point that thousands of transitions (pairs of primary and secondary m/z values) can be measured in a triple quadrupole instrument coupled to an LC, by a well-designed scheduling and selection of m/z windows. The design of a good MRM assay relies on the availability of peptide spectra from previous discovery-phase LC-MS/MS studies. The tedious aspect of manually developing and processing MRM assays involving thousands of transitions has spurred the development of software tools to automate this process. Software packages have been developed for project management, assay development, assay validation, data export, peak integration, quality assessment, and biostatistical analysis. No single tool provides a complete end-to-end solution, thus this article reviews the current state and discusses future directions of these software tools in order to enable researchers to combine these tools for a comprehensive targeted proteomics workflow. PMID:23702368
Blattmann, Peter; Heusel, Moritz; Aebersold, Ruedi
2016-01-01
SWATH-MS is an acquisition and analysis technique of targeted proteomics that enables measuring several thousand proteins with high reproducibility and accuracy across many samples. OpenSWATH is popular open-source software for peptide identification and quantification from SWATH-MS data. For downstream statistical and quantitative analysis there exist different tools such as MSstats, mapDIA and aLFQ. However, the transfer of data from OpenSWATH to the downstream statistical tools is currently technically challenging. Here we introduce the R/Bioconductor package SWATH2stats, which allows convenient processing of the data into a format directly readable by the downstream analysis tools. In addition, SWATH2stats allows annotation, analyzing the variation and the reproducibility of the measurements, FDR estimation, and advanced filtering before submitting the processed data to downstream tools. These functionalities are important to quickly analyze the quality of the SWATH-MS data. Hence, SWATH2stats is a new open-source tool that summarizes several practical functionalities for analyzing, processing, and converting SWATH-MS data and thus facilitates the efficient analysis of large-scale SWATH/DIA datasets.
Glandorf, J.; Thiele, H.; Macht, M.; Vorm, O.; Podtelejnikov, A.
2007-01-01
In the course of a full-scale proteomics experiment, the handling of the data as well as the retrieval of the relevant information from the results is a major challenge due to the massive amount of generated data (gel images, chromatograms, and spectra) as well as associated result information (sequences, literature, etc.). To obtain meaningful information from these data, one has to filter the results in an easy way. Possibilities to do so can be based on GO terms or structural features such as transmembrane domains, involvement in certain pathways, etc. In this presentation we will show how a combination of a software package with a workflow-based result organization (Bruker ProteinScape) and a protein-centered data-mining software (Proxeon ProteinCenter) can assist in the comparison of the results from large projects, such as comparison of cross-platform results from 2D PAGE/MS with shotgun LC-ESI-MS/MS. We will present differences between different technologies and show how these differences can be easily identified and how they allow us to draw conclusions on the involved technologies.
Proteomic analyses of host and pathogen responses during bovine mastitis.
Boehmer, Jamie L
2011-12-01
The pursuit of biomarkers for use as clinical screening tools, measures for early detection, disease monitoring, and as a means for assessing therapeutic responses has steadily evolved in human and veterinary medicine over the past two decades. Concurrently, advances in mass spectrometry have markedly expanded proteomic capabilities for biomarker discovery. While initial mass spectrometric biomarker discovery endeavors focused primarily on the detection of modulated proteins in human tissues and fluids, recent efforts have shifted to include proteomic analyses of biological samples from food animal species. Mastitis continues to garner attention in veterinary research due mainly to affiliated financial losses and food safety concerns over antimicrobial use, but also because there are only a limited number of efficacious mastitis treatment options. Accordingly, comparative proteomic analyses of bovine milk have emerged in recent years. Efforts to prevent agricultural-related food-borne illness have likewise fueled an interest in the proteomic evaluation of several prominent strains of bacteria, including common mastitis pathogens. The interest in establishing biomarkers of the host and pathogen responses during bovine mastitis stems largely from the need to better characterize mechanisms of the disease, to identify reliable biomarkers for use as measures of early detection and drug efficacy, and to uncover potentially novel targets for the development of alternative therapeutics. The following review focuses primarily on comparative proteomic analyses conducted on healthy versus mastitic bovine milk. However, a comparison of the host defense proteome of human and bovine milk and the proteomic analysis of common veterinary pathogens are likewise introduced.
Mildew-Omics: How Global Analyses Aid the Understanding of Life and Evolution of Powdery Mildews
Bindschedler, Laurence V.; Panstruga, Ralph; Spanu, Pietro D.
2016-01-01
The common powdery mildew plant diseases are caused by ascomycete fungi of the order Erysiphales. Their characteristic life style as obligate biotrophs renders functional analyses in these species challenging, mainly because of experimental constraints to genetic manipulation. Global large-scale (“-omics”) approaches are thus particularly valuable and insightful for the characterisation of the life and evolution of powdery mildews. Here we review the knowledge obtained so far from genomic, transcriptomic and proteomic studies in these fungi. We consider current limitations and challenges regarding these surveys and provide an outlook on desired future investigations on the basis of the various –omics technologies. PMID:26913042
Noninvasive metabolic profiling for painless diagnosis of human diseases and disorders.
Mal, Mainak
2016-06-01
Metabolic profiling provides a powerful diagnostic tool complementary to genomics and proteomics. The pain, discomfort and probable iatrogenic injury associated with invasive or minimally invasive diagnostic methods render them unsuitable in terms of patient compliance and participation. Metabolic profiling of biomatrices like urine, breath, saliva, sweat and feces, which can be collected in a painless manner, could be used for noninvasive diagnosis. This review article covers the noninvasive metabolic profiling studies that have exhibited diagnostic potential for diseases and disorders. Their potential applications are evident in different forms of cancer, metabolic disorders, infectious diseases, neurodegenerative disorders, rheumatic diseases and pulmonary diseases. Large-scale clinical validation of such diagnostic methods is necessary in the future.
Noninvasive metabolic profiling for painless diagnosis of human diseases and disorders
Mal, Mainak
2016-01-01
Metabolic profiling provides a powerful diagnostic tool complementary to genomics and proteomics. The pain, discomfort and probable iatrogenic injury associated with invasive or minimally invasive diagnostic methods render them unsuitable in terms of patient compliance and participation. Metabolic profiling of biomatrices like urine, breath, saliva, sweat and feces, which can be collected in a painless manner, could be used for noninvasive diagnosis. This review article covers the noninvasive metabolic profiling studies that have exhibited diagnostic potential for diseases and disorders. Their potential applications are evident in different forms of cancer, metabolic disorders, infectious diseases, neurodegenerative disorders, rheumatic diseases and pulmonary diseases. Large-scale clinical validation of such diagnostic methods is necessary in the future. PMID:28031956
Statistical issues in the design and planning of proteomic profiling experiments.
Cairns, David A
2015-01-01
The statistical design of a clinical proteomics experiment is a critical part of a well-undertaken investigation. Standard concepts from experimental design such as randomization, replication and blocking should be applied in all experiments, and this is possible when the experimental conditions are well understood by the investigator. The large number of proteins simultaneously considered in proteomic discovery experiments means that determining the number of required replicates to perform a powerful experiment is more complicated than in simple experiments. However, by using information about the nature of an experiment and making simple assumptions, this is achievable for a variety of experiments useful for biomarker discovery and initial validation.
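As a rough illustration of the replicate-number problem described in this abstract, the sketch below estimates replicates per group for a two-group comparison when many proteins are tested at once. It is not the method of the cited article: the normal approximation, the Bonferroni adjustment, and the function name `replicates_per_group` are all assumptions chosen for simplicity.

```python
from math import ceil
from statistics import NormalDist


def replicates_per_group(effect_size, n_proteins, alpha=0.05, power=0.8):
    """Approximate replicates per group for a two-group comparison.

    effect_size: assumed difference in means divided by the standard
        deviation (an assumption the investigator must supply).
    n_proteins: number of proteins tested; alpha is divided by this
        number (Bonferroni adjustment, chosen here for simplicity).
    """
    if effect_size <= 0 or n_proteins < 1:
        raise ValueError("effect_size must be > 0 and n_proteins >= 1")
    z = NormalDist().inv_cdf              # standard normal quantile function
    adjusted_alpha = alpha / n_proteins   # stricter threshold per protein
    z_alpha = z(1 - adjusted_alpha / 2)   # two-sided test
    z_power = z(power)
    # Normal-approximation sample size for comparing two means.
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Testing more proteins requires more replicates for the same effect size.
print(replicates_per_group(1.0, n_proteins=1))
print(replicates_per_group(1.0, n_proteins=1000))
```

The Bonferroni adjustment is conservative; a design based on false discovery rate control would generally call for fewer replicates, at the cost of a more involved calculation.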
McKew, Boyd A; Metodieva, Gergana; Raines, Christine A; Metodiev, Metodi V; Geider, Richard J
2015-10-01
Limitation of marine primary production by the availability of nitrogen or phosphorus is common. Emiliania huxleyi, a ubiquitous phytoplankter that plays key roles in primary production, calcium carbonate precipitation and production of dimethyl sulfide, often blooms in mid-latitude at the beginning of summer when inorganic nutrient concentrations are low. To understand physiological mechanisms that allow such blooms, we examined how the proteome of E. huxleyi (strain 1516) responds to N and P limitation. We observed modest changes in much of the proteome despite large physiological changes (e.g. cellular biomass, C, N and P) associated with nutrient limitation of growth rate. Acclimation to nutrient limitation did however involve significant increases in the abundance of transporters for ammonium and nitrate under N limitation and for phosphate under P limitation. More notable were large increases in proteins involved in the acquisition of organic forms of N and P, including urea and amino acid/polyamine transporters and numerous C-N hydrolases under N limitation and a large upregulation of alkaline phosphatase under P limitation. This highly targeted reorganization of the proteome towards scavenging organic forms of macronutrients gives unique insight into the molecular mechanisms that underpin how E. huxleyi has found its niche to bloom in surface waters depleted of inorganic nutrients. © 2015 The Authors. Environmental Microbiology published by Society for Applied Microbiology and John Wiley & Sons Ltd.
Heslop, James A; Rowe, Cliff; Walsh, Joanne; Sison-Young, Rowena; Jenkins, Roz; Kamalian, Laleh; Kia, Richard; Hay, David; Jones, Robert P; Malik, Hassan Z; Fenwick, Stephen; Chadwick, Amy E; Mills, John; Kitteringham, Neil R; Goldring, Chris E P; Kevin Park, B
2017-01-01
The application of primary human hepatocytes following isolation from human tissue is well accepted to be compromised by the process of dedifferentiation. This phenomenon reduces many unique hepatocyte functions, limiting their use in drug disposition and toxicity assessment. The aetiology of dedifferentiation has not been well defined, and further understanding of the process would allow the development of novel strategies for sustaining the hepatocyte phenotype in culture or for improving protocols for maturation of hepatocytes generated from stem cells. We have therefore carried out the first proteomic comparison of primary human hepatocyte differentiation. Cells were cultured for 0, 24, 72 and 168 h as a monolayer in order to permit unrestricted hepatocyte dedifferentiation, so as to reveal the causative signalling pathways and factors in this process, by pathway analysis. A total of 3430 proteins were identified with a false detection rate of <1 %, of which 1117 were quantified at every time point. Increasing numbers of significantly differentially expressed proteins compared with the freshly isolated cells were observed at 24 h (40 proteins), 72 h (118 proteins) and 168 h (272 proteins) (p < 0.05). In particular, cytochromes P450 and mitochondrial proteins underwent major changes, confirmed by functional studies and investigated by pathway analysis. We report the key factors and pathways which underlie the loss of hepatic phenotype in vitro, particularly those driving the large-scale and selective remodelling of the mitochondrial and metabolic proteomes. In summary, these findings expand the current understanding of dedifferentiation and should facilitate further development of simple and complex hepatic culture systems.
Falter, Christian; Ellinger, Dorothea; von Hülsen, Behrend; Heim, René; Voigt, Christian A.
2015-01-01
The outwardly directed cell wall and associated plasma membrane of epidermal cells represent the first layers of plant defense against intruding pathogens. Cell wall modifications and the formation of defense structures at sites of attempted pathogen penetration are decisive for plant defense. A precise isolation of these stress-induced structures would allow a specific analysis of regulatory mechanism and cell wall adaption. However, methods for large-scale epidermal tissue preparation from the model plant Arabidopsis thaliana, which would allow proteome and cell wall analysis of complete, laser-microdissected epidermal defense structures, have not been provided. We developed the adhesive tape – liquid cover glass technique (ACT) for simple leaf epidermis preparation from A. thaliana, which is also applicable on grass leaves. This method is compatible with subsequent staining techniques to visualize stress-related cell wall structures, which were precisely isolated from the epidermal tissue layer by laser microdissection (LM) coupled to laser pressure catapulting. We successfully demonstrated that these specific epidermal tissue samples could be used for quantitative downstream proteome and cell wall analysis. The development of the ACT for simple leaf epidermis preparation and the compatibility to LM and downstream quantitative analysis opens new possibilities in the precise examination of stress- and pathogen-related cell wall structures in epidermal cells. Because the developed tissue processing is also applicable on A. thaliana, well-established, model pathosystems that include the interaction with powdery mildews can be studied to determine principal regulatory mechanisms in plant–microbe interaction with their potential outreach into crop breeding. PMID:25870605
Falter, Christian; Ellinger, Dorothea; von Hülsen, Behrend; Heim, René; Voigt, Christian A
2015-01-01
The outwardly directed cell wall and associated plasma membrane of epidermal cells represent the first layers of plant defense against intruding pathogens. Cell wall modifications and the formation of defense structures at sites of attempted pathogen penetration are decisive for plant defense. A precise isolation of these stress-induced structures would allow a specific analysis of regulatory mechanism and cell wall adaption. However, methods for large-scale epidermal tissue preparation from the model plant Arabidopsis thaliana, which would allow proteome and cell wall analysis of complete, laser-microdissected epidermal defense structures, have not been provided. We developed the adhesive tape - liquid cover glass technique (ACT) for simple leaf epidermis preparation from A. thaliana, which is also applicable on grass leaves. This method is compatible with subsequent staining techniques to visualize stress-related cell wall structures, which were precisely isolated from the epidermal tissue layer by laser microdissection (LM) coupled to laser pressure catapulting. We successfully demonstrated that these specific epidermal tissue samples could be used for quantitative downstream proteome and cell wall analysis. The development of the ACT for simple leaf epidermis preparation and the compatibility to LM and downstream quantitative analysis opens new possibilities in the precise examination of stress- and pathogen-related cell wall structures in epidermal cells. Because the developed tissue processing is also applicable on A. thaliana, well-established, model pathosystems that include the interaction with powdery mildews can be studied to determine principal regulatory mechanisms in plant-microbe interaction with their potential outreach into crop breeding.
Proteomics to study DNA-bound and chromatin-associated gene regulatory complexes
Wierer, Michael; Mann, Matthias
2016-01-01
High-resolution mass spectrometry (MS)-based proteomics is a powerful method for the identification of soluble protein complexes and large-scale affinity purification screens can decode entire protein interaction networks. In contrast, protein complexes residing on chromatin have been much more challenging, because they are difficult to purify and often of very low abundance. However, this is changing due to recent methodological and technological advances in proteomics. Proteins interacting with chromatin marks can directly be identified by pulldowns with synthesized histone tails containing posttranslational modifications (PTMs). Similarly, pulldowns with DNA baits harbouring single nucleotide polymorphisms or DNA modifications reveal the impact of those DNA alterations on the recruitment of transcription factors. Accurate quantitation – either isotope-based or label free – unambiguously pinpoints proteins that are significantly enriched over control pulldowns. In addition, protocols that combine classical chromatin immunoprecipitation (ChIP) methods with mass spectrometry (ChIP-MS) target gene regulatory complexes in their in-vivo context. Similar to classical ChIP, cells are crosslinked with formaldehyde and chromatin sheared by sonication or nuclease digested. ChIP-MS baits can be proteins in tagged or endogenous form, histone PTMs, or lncRNAs. Locus-specific ChIP-MS methods would allow direct purification of a single genomic locus and the proteins associated with it. There, loci can be targeted either by artificial DNA-binding sites and corresponding binding proteins or via proteins with sequence specificity such as TAL or nuclease deficient Cas9 in combination with a specific guide RNA. We predict that advances in MS technology will soon make such approaches generally applicable tools in epigenetics. PMID:27402878
The test skeletal matrix of the black sea urchin Arbacia lixula.
Kanold, Julia M; Immel, Francoise; Broussard, Cédric; Guichard, Nathalie; Plasseraud, Laurent; Corneillat, Marion; Alcaraz, Gérard; Brümmer, Franz; Marin, Frédéric
2015-03-01
In the field of biomineralization, the past decade has been marked by the increasing use of high throughput techniques, i.e. proteomics, for identifying in one shot the protein content of complex macromolecular mixtures extracted from mineralized tissues. Although crowned with success, this approach has been restricted so far to a limited set of key-organisms, such as the purple sea urchin Strongylocentrotus purpuratus, the pearl oyster or the abalone, leaving in the shadow non-model organisms. As a consequence, it is still unknown to what extent the calcifying repertoire varies, from group to group, at high (phylum, class), median (order, family) or low (genus, species) taxonomic rank. The present paper shows the first biochemical and proteomic characterization of the test matrix of the Mediterranean black sea urchin Arbacia lixula (Arbacioida). Our work suggests that the skeletal repertoire of A. lixula exhibits some similarities but also several differences with that of the few sea urchin species (S. purpuratus, Paracentrotus lividus), for which molecular data are already available. The differences may be attributable to the taxonomic position of the species considered: A. lixula belongs to an order - Arbacioida - that diverged more than one hundred million years ago from the Camarodonta, which includes the two species S. purpuratus and P. lividus. For the echinoid class, we suggest that large-scale proteomic screening should be performed in order to understand which molecular functions related to calcification are conserved and which ones have been co-opted for biomineralization in particular lineages. Copyright © 2014 Elsevier Inc. All rights reserved.
Characterization of the human aqueous humour proteome: A comparison of the genders.
Perumal, Natarajan; Manicam, Caroline; Steinicke, Matthias; Funke, Sebastian; Pfeiffer, Norbert; Grus, Franz H
2017-01-01
Aqueous humour (AH) is an important biologic fluid that maintains normal intraocular pressure and contains proteins that regulate the homeostasis of ocular tissues. Any alterations in the protein compositions are correlated to the pathogenesis of various ocular disorders. In recent years, gender-based medicine has emerged as an important research focus considering the prevalence of certain diseases, which are higher in a particular sex. Nevertheless, the inter-gender variations in the AH proteome are unknown. Therefore, this study endeavoured to characterize the AH proteome to assess the differences between genders. Thirty AH samples of patients who underwent cataract surgery were categorized according to their gender. Label-free quantitative discovery mass spectrometry-based proteomics strategy was employed to characterize the AH proteome. A total of 147 proteins were identified with a false discovery rate of less than 1% and only the top 10 major AH proteins make up almost 90% of the total identified proteins. A large number of proteins identified were correlated to defence, immune and inflammatory mechanisms, and response to wounding. Four proteins were found to be differentially abundant between the genders, comprising SERPINF1, SERPINA3, SERPING1 and PTGDS. The findings emerging from our study provide the first insight into the gender-based proteome differences in the AH and also highlight the importance in considering potential sex-dependent changes in the proteome of ocular pathologies in future studies employing the AH.
Gómez-Molero, Emilia; de Boer, Albert D; Dekker, Henk L; Moreno-Martínez, Ana; Kraneveld, Eef A; Ichsan; Chauhan, Neeraj; Weig, Michael; de Soet, Johannes J; de Koster, Chris G; Bader, Oliver; de Groot, Piet W J
2015-12-01
Attachment to human host tissues or abiotic medical devices is a key step in the development of infections by Candida glabrata. The genome of this pathogenic yeast codes for a large number of adhesins, but proteomic work using reference strains has shown incorporation of only few adhesins in the cell wall. By making inventories of the wall proteomes of hyperadhesive clinical isolates and reference strain CBS138 using mass spectrometry, we describe the cell wall proteome of C. glabrata and tested the hypothesis that hyperadhesive isolates display differential incorporation of adhesins. Two clinical strains (PEU382 and PEU427) were selected, which both were hyperadhesive to polystyrene and showed high surface hydrophobicity. Cell wall proteome analysis under biofilm-forming conditions identified a core proteome of about 20 proteins present in all C. glabrata strains. In addition, 12 adhesin-like wall proteins were identified in the hyperadherent strains, including six novel adhesins (Awp8-13) of which only Awp12 was also present in CBS138. We conclude that the hyperadhesive capacity of these two clinical C. glabrata isolates is correlated with increased and differential incorporation of cell wall adhesins. Future studies should elucidate the role of the identified proteins in the establishment of C. glabrata infections. © FEMS 2015. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Global Proteomics Analysis of Protein Lysine Methylation.
Cao, Xing-Jun; Garcia, Benjamin A
2016-11-01
Lysine methylation is a common protein post-translational modification dynamically mediated by protein lysine methyltransferases (PKMTs) and protein lysine demethylases (PKDMs). Beyond histone proteins, lysine methylation on non-histone proteins plays a substantial role in a variety of functions in cells and is closely associated with diseases such as cancer. A large body of evidence indicates that the dysregulation of some PKMTs leads to tumorigenesis via their non-histone substrates. However, most studies on other PKMTs have made slow progress owing to the lack of approaches for extensive screening of lysine methylation sites. Recently, a series of publications has reported large-scale analyses of protein lysine methylation. In this unit, we introduce a protocol for the global analysis of protein lysine methylation in cells by means of immunoaffinity enrichment and mass spectrometry. © 2016 by John Wiley & Sons, Inc.
Functional genomic Landscape of Human Breast Cancer drivers, vulnerabilities, and resistance
Marcotte, Richard; Sayad, Azin; Brown, Kevin R.; Sanchez-Garcia, Felix; Reimand, Jüri; Haider, Maliha; Virtanen, Carl; Bradner, James E.; Bader, Gary D.; Mills, Gordon B.; Pe’er, Dana; Moffat, Jason; Neel, Benjamin G.
2016-01-01
Summary: Large-scale genomic studies have identified multiple somatic aberrations in breast cancer, including copy number alterations and point mutations. Still, identifying causal variants and emergent vulnerabilities that arise as a consequence of genetic alterations remain major challenges. We performed whole genome shRNA “dropout screens” on 77 breast cancer cell lines. Using a hierarchical linear regression algorithm to score our screen results and integrate them with accompanying detailed genetic and proteomic information, we identify vulnerabilities in breast cancer, including candidate “drivers,” and reveal general functional genomic properties of cancer cells. Comparisons of gene essentiality with drug sensitivity data suggest potential resistance mechanisms, effects of existing anti-cancer drugs, and opportunities for combination therapy. Finally, we demonstrate the utility of this large dataset by identifying BRD4 as a potential target in luminal breast cancer, and PIK3CA mutations as a resistance determinant for BET-inhibitors. PMID:26771497
Recent developments in structural proteomics for protein structure determination.
Liu, Hsuan-Liang; Hsu, Jyh-Ping
2005-05-01
The major challenges in structural proteomics include identifying all the proteins on the genome-wide scale, determining their structure-function relationships, and outlining the precise three-dimensional structures of the proteins. Protein structures are typically determined by experimental approaches such as X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy. However, the knowledge of three-dimensional space by these techniques is still limited. Thus, computational methods such as comparative and de novo approaches and molecular dynamics simulations are intensively used as alternative tools to predict the three-dimensional structures and dynamic behavior of proteins. This review summarizes recent developments in structural proteomics for protein structure determination, including instrumental methods such as X-ray crystallography and NMR spectroscopy, and computational methods such as comparative and de novo structure prediction and molecular dynamics simulations.
Large-scale serum protein biomarker discovery in Duchenne muscular dystrophy.
Hathout, Yetrib; Brody, Edward; Clemens, Paula R; Cripe, Linda; DeLisle, Robert Kirk; Furlong, Pat; Gordish-Dressman, Heather; Hache, Lauren; Henricson, Erik; Hoffman, Eric P; Kobayashi, Yvonne Monique; Lorts, Angela; Mah, Jean K; McDonald, Craig; Mehler, Bob; Nelson, Sally; Nikrad, Malti; Singer, Britta; Steele, Fintan; Sterling, David; Sweeney, H Lee; Williams, Steve; Gold, Larry
2015-06-09
Serum biomarkers in Duchenne muscular dystrophy (DMD) may provide deeper insights into disease pathogenesis, suggest new therapeutic approaches, serve as acute read-outs of drug effects, and be useful as surrogate outcome measures to predict later clinical benefit. In this study a large-scale biomarker discovery was performed on serum samples from patients with DMD and age-matched healthy volunteers using a modified aptamer-based proteomics technology. Levels of 1,125 proteins were quantified in serum samples from two independent DMD cohorts: cohort 1 (The Parent Project Muscular Dystrophy-Cincinnati Children's Hospital Medical Center), 42 patients with DMD and 28 age-matched normal volunteers; and cohort 2 (The Cooperative International Neuromuscular Research Group, Duchenne Natural History Study), 51 patients with DMD and 17 age-matched normal volunteers. Forty-four proteins showed significant differences that were consistent in both cohorts when comparing DMD patients and healthy volunteers at a 1% false-discovery rate, a large number of significant protein changes for such a small study. These biomarkers can be classified by known cellular processes and by age-dependent changes in protein concentration. Our findings demonstrate both the utility of this unbiased biomarker discovery approach and suggest potential new diagnostic and therapeutic avenues for ameliorating the burden of DMD and, we hope, other rare and devastating diseases.
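The abstract above reports proteins that were significant "at a 1% false-discovery rate" but does not say which procedure was used. As a sketch only, the widely used Benjamini-Hochberg step-up procedure is shown below; the function name and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values, fdr=0.01):
    """Return a list of booleans, True where the hypothesis is rejected.

    Implements the Benjamini-Hochberg step-up rule: find the largest
    rank k such that p_(k) <= (k / m) * fdr, then reject the k
    hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank  # keep the largest rank that passes
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for five proteins.
example = [0.0001, 0.5, 0.002, 0.04, 0.0004]
print(benjamini_hochberg(example, fdr=0.01))
```

With real data, one p-value per quantified protein would be passed in, and consistency across both cohorts would be checked separately.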
Tummala, Seshu B; Junne, Stefan G; Paredes, Carlos J; Papoutsakis, Eleftherios T
2003-12-30
Antisense RNA (asRNA) downregulation alters protein expression without changing the regulation of gene expression. Downregulation of primary metabolic enzymes possibly combined with overexpression of other metabolic enzymes may result in profound changes in product formation, and this may alter the large-scale transcriptional program of the cells. DNA-array based large-scale transcriptional analysis has the potential to elucidate factors that control cellular fluxes even in the absence of proteome data. These themes are explored in the study of large-scale transcriptional analysis programs and the in vivo primary-metabolism fluxes of several related recombinant C. acetobutylicum strains: C. acetobutylicum ATCC 824(pSOS95del) (plasmid control; produces high levels of butanol and acetone), 824(pCTFB1AS) (expresses antisense RNA against CoA transferase (ctfb1-asRNA); produces very low levels of butanol and acetone), and 824(pAADB1) (expresses ctfb1-asRNA and the alcohol-aldehyde dehydrogenase gene (aad); produces high alcohol and low acetone levels). DNA-array based transcriptional analysis revealed that the large changes in product concentrations (and notably butanol concentration) due to ctfb1-asRNA expression alone and in combination with aad overexpression resulted in dramatic changes of the cellular transcriptome. Cluster analysis and gene expression patterns of established and putative operons involved in stress response, motility, sporulation, and fatty-acid biosynthesis indicate that these simple genetic changes dramatically alter the cellular programs of C. acetobutylicum. Comparison of gene expression and flux analysis data may point to possible flux-controlling steps and suggest unknown regulatory mechanisms. Copyright 2003; Wiley Periodicals, Inc.
Lo, Yu-Chen; Senese, Silvia; Li, Chien-Ming; Hu, Qiyang; Huang, Yong; Damoiseaux, Robert; Torres, Jorge Z.
2015-01-01
Target identification is one of the most critical steps following cell-based phenotypic chemical screens aimed at identifying compounds with potential uses in cell biology and for developing novel disease therapies. Current in silico target identification methods, including chemical similarity database searches, are limited to single or sequential ligand analysis that have limited capabilities for accurate deconvolution of a large number of compounds with diverse chemical structures. Here, we present CSNAP (Chemical Similarity Network Analysis Pulldown), a new computational target identification method that utilizes chemical similarity networks for large-scale chemotype (consensus chemical pattern) recognition and drug target profiling. Our benchmark study showed that CSNAP can achieve an overall higher accuracy (>80%) of target prediction with respect to representative chemotypes in large (>200) compound sets, in comparison to the SEA approach (60–70%). Additionally, CSNAP is capable of integrating with biological knowledge-based databases (Uniprot, GO) and high-throughput biology platforms (proteomic, genetic, etc) for system-wise drug target validation. To demonstrate the utility of the CSNAP approach, we combined CSNAP's target prediction with experimental ligand evaluation to identify the major mitotic targets of hit compounds from a cell-based chemical screen and we highlight novel compounds targeting microtubules, an important cancer therapeutic target. The CSNAP method is freely available and can be accessed from the CSNAP web server (http://services.mbi.ucla.edu/CSNAP/). PMID:25826798
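The idea of grouping compounds into chemotypes through a chemical similarity network can be sketched as follows. This is not the CSNAP implementation: it assumes each compound is already encoded as a set of fingerprint bits, uses Tanimoto similarity with a hypothetical threshold, and treats connected components of the resulting network as candidate chemotype clusters.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similarity_clusters(fingerprints, threshold=0.7):
    """Group compounds into connected components of the similarity network."""
    # Link every pair of compounds whose similarity reaches the threshold.
    neighbors = {name: set() for name in fingerprints}
    for x, y in combinations(fingerprints, 2):
        if tanimoto(fingerprints[x], fingerprints[y]) >= threshold:
            neighbors[x].add(y)
            neighbors[y].add(x)
    # Collect connected components with a depth-first traversal.
    clusters, seen = [], set()
    for start in fingerprints:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters
```

In a consensus-style prediction, the annotated targets of the reference compounds in each cluster would then be tallied and transferred to the unannotated query compounds in the same cluster; that scoring step is omitted here.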
Preparation of the low molecular weight serum proteome for mass spectrometry analysis.
Waybright, Timothy J; Chan, King C; Veenstra, Timothy D; Xiao, Zhen
2013-01-01
The discovery of viable biomarkers or indicators of disease states is complicated by the inherent complexity of the chosen biological specimen. Every sample, whether it is serum, plasma, urine, tissue, cells, or a host of others, contains thousands of large and small components, each interacting in multiple ways. The need to concentrate on a group of these components to narrow the focus on a potential biomarker candidate becomes, out of necessity, a priority, especially in the search for immune-related low molecular weight serum biomarkers. One such method in the field of proteomics is to divide the sample proteome into groups based on the size of the protein, analyze each group, and mine the data for statistically significant items. This chapter details a portion of this method, concentrating on a method for fractionating and analyzing the low molecular weight proteome of human serum.
Unexpected features of the dark proteome
Perdigão, Nelson; Heinrich, Julian; Stolte, Christian; Sabir, Kenneth S.; Buckley, Michael J.; Tabor, Bruce; Signal, Beth; Gloss, Brian S.; Hammang, Christopher J.; Rost, Burkhard; Schafferhans, Andrea
2015-01-01
We surveyed the “dark” proteome–that is, regions of proteins never observed by experimental structure determination and inaccessible to homology modeling. For 546,000 Swiss-Prot proteins, we found that 44–54% of the proteome in eukaryotes and viruses was dark, compared with only ∼14% in archaea and bacteria. Surprisingly, most of the dark proteome could not be accounted for by conventional explanations, such as intrinsic disorder or transmembrane regions. Nearly half of the dark proteome comprised dark proteins, in which the entire sequence lacked similarity to any known structure. Dark proteins fulfill a wide variety of functions, but a subset showed distinct and largely unexpected features, such as association with secretion, specific tissues, the endoplasmic reticulum, disulfide bonding, and proteolytic cleavage. Dark proteins also had short sequence length, low evolutionary reuse, and few known interactions with other proteins. These results suggest new research directions in structural and computational biology. PMID:26578815
Dormeyer, Wilma; van Hoof, Dennis; Mummery, Christine L; Krijgsveld, Jeroen; Heck, Albert J R
2008-10-01
The identification of (plasma) membrane proteins in cells can provide valuable insights into the regulation of their biological processes. Pluripotent cells such as human embryonic stem cells and embryonal carcinoma cells are capable of unlimited self-renewal and share many of the biological mechanisms that regulate proliferation and differentiation. The comparison of their membrane proteomes will help unravel the biological principles of pluripotency, and the identification of biomarker proteins in their plasma membranes is considered a crucial step to fully exploit pluripotent cells for therapeutic purposes. For these tasks, membrane proteomics is the method of choice, but as indicated by the scarce identification of membrane and plasma membrane proteins in global proteomic surveys it is not an easy task. In this minireview, we first describe the general challenges of membrane proteomics. We then review current sample preparation steps and discuss protocols that we found particularly beneficial for the identification of large numbers of (plasma) membrane proteins in human tumour- and embryo-derived stem cells. Our optimized assembled protocol led to the identification of a large number of membrane proteins. However, as the composition of cells and membranes is highly variable we still recommend adapting the sample preparation protocol for each individual system.
Functional analysis of proteins and protein species using shotgun proteomics and linear mathematics.
Hoehenwarter, Wolfgang; Chen, Yanmei; Recuenco-Munoz, Luis; Wienkoop, Stefanie; Weckwerth, Wolfram
2011-07-01
Covalent post-translational modification of proteins is the primary modulator of protein function in the cell. It greatly expands the functional potential of the proteome compared to the genome. In the past few years shotgun proteomics-based research, where the proteome is digested into peptides prior to mass spectrometric analysis has been prolific in this area. It has determined the kinetics of tens of thousands of sites of covalent modification on an equally large number of proteins under various biological conditions and uncovered a transiently active regulatory network that extends into diverse branches of cellular physiology. In this review, we discuss this work in light of the concept of protein speciation, which emphasizes the entire post-translationally modified molecule and its interactions and not just the modification site as the functional entity. Sometimes, particularly when considering complex multisite modification, all of the modified molecular species involved in the investigated condition, the protein species must be completely resolved for full understanding. We present a mathematical technique that delivers a good approximation for shotgun proteomics data.
Kappler, Ulrike; Rowland, Susan L; Pedwell, Rhianna K
2017-05-01
Systems biology is frequently taught with an emphasis on mathematical modeling approaches. This focus effectively excludes most biology, biochemistry, and molecular biology students, who are not mathematics majors. The mathematical focus can also present a misleading picture of systems biology, which is a multi-disciplinary pursuit requiring collaboration between biochemists, bioinformaticians, and mathematicians. This article describes an authentic large-scale undergraduate research experience (ALURE) in systems biology that incorporates proteomics, bacterial genomics, and bioinformatics in the one exercise. This project is designed to engage students who have a basic grounding in protein chemistry and metabolism and no mathematical modeling skills. The pedagogy around the research experience is designed to help students attack complex datasets and use their emergent metabolic knowledge to make meaning from large amounts of raw data. On completing the ALURE, participants reported a significant increase in their confidence around analyzing large datasets, while the majority of the cohort reported good or great gains in a variety of skills including "analysing data for patterns" and "conducting database or internet searches." An environmental scan shows that this ALURE is the only undergraduate-level system-biology research project offered on a large-scale in Australia; this speaks to the perceived difficulty of implementing such an opportunity for students. We argue however, that based on the student feedback, allowing undergraduate students to complete a systems-biology project is both feasible and desirable, even if the students are not maths and computing majors. © 2016 by The International Union of Biochemistry and Molecular Biology, 45(3):235-248, 2017. © 2016 The International Union of Biochemistry and Molecular Biology.
Proteomic analysis of Chlorella vulgaris: Potential targets for enhanced lipid accumulation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Guarnieri, Michael T.; Nag, Ambarish; Yang, Shihui
2013-11-01
Oleaginous microalgae are capable of producing large quantities of fatty acids and triacylglycerides. As such, they are promising feedstocks for the production of biofuels and bioproducts. Genetic strain-engineering strategies offer a means to accelerate the commercialization of algal biofuels by improving the rate and total accumulation of microalgal lipids. However, the industrial potential of these organisms remains to be met, largely due to the incomplete knowledgebase surrounding the mechanisms governing the induction of algal lipid biosynthesis. Such strategies require further elucidation of genes and gene products controlling algal lipid accumulation. In this study, we have set out to examine these mechanisms and identify novel strain-engineering targets in the oleaginous microalga, Chlorella vulgaris. Comparative shotgun proteomic analyses have identified a number of novel targets, including previously unidentified transcription factors and proteins involved in cell signaling and cell cycle regulation. These results lay the foundation for strain-improvement strategies and demonstrate the power of translational proteomic analysis.
Analysis of high accuracy, quantitative proteomics data in the MaxQB database.
Schaab, Christoph; Geiger, Tamar; Stoehr, Gabriele; Cox, Juergen; Mann, Matthias
2012-03-01
MS-based proteomics generates rapidly increasing amounts of precise and quantitative information. Analysis of individual proteomic experiments has made great strides, but the crucial ability to compare and store information across different proteome measurements still presents many challenges. For example, it has been difficult to avoid contamination of databases with low quality peptide identifications, to control for the inflation in false positive identifications when combining data sets, and to integrate quantitative data. Although, for example, the contamination with low quality identifications has been addressed by joint analysis of deposited raw data in some public repositories, we reasoned that there should be a role for a database specifically designed for high resolution and quantitative data. Here we describe a novel database termed MaxQB that stores and displays collections of large proteomics projects and allows joint analysis and comparison. We demonstrate the analysis tools of MaxQB using proteome data of 11 different human cell lines and 28 mouse tissues. The database-wide false discovery rate is controlled by adjusting the project specific cutoff scores for the combined data sets. The 11 cell line proteomes together identify proteins expressed from more than half of all human genes. For each protein of interest, expression levels estimated by label-free quantification can be visualized across the cell lines. Similarly, the expression rank order and estimated amount of each protein within each proteome are plotted. We used MaxQB to calculate the signal reproducibility of the detected peptides for the same proteins across different proteomes. Spearman rank correlation between peptide intensity and detection probability of identified proteins was greater than 0.8 for 64% of the proteome, whereas a minority of proteins have negative correlation. 
This information can be used to pinpoint false protein identifications, independently of peptide database scores. The information contained in MaxQB, including high resolution fragment spectra, is accessible to the community via a user-friendly web interface at http://www.biochem.mpg.de/maxqb.
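The Spearman rank correlation used above to compare peptide intensity with detection probability can be computed as the Pearson correlation of the ranks. The sketch below is a generic, standard-library illustration with tied values given their average rank; it is not taken from MaxQB, and the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / (var_x * var_y) ** 0.5
```

Because only ranks enter the calculation, the coefficient is insensitive to the large dynamic range of peptide intensities, which is one reason a rank-based measure suits this kind of comparison.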
Neural Stem Cells (NSCs) and Proteomics.
Shoemaker, Lorelei D; Kornblum, Harley I
2016-02-01
Neural stem cells (NSCs) can self-renew and give rise to the major cell types of the CNS. Studies of NSCs include the investigation of primary, CNS-derived cells as well as animal and human embryonic stem cell (ESC)-derived and induced pluripotent stem cell (iPSC)-derived sources. NSCs provide a means with which to study normal neural development, neurodegeneration, and neurological disease and are clinically relevant sources for cellular repair to the damaged and diseased CNS. Proteomics studies of NSCs have the potential to delineate molecules and pathways critical for NSC biology and the means by which NSCs can participate in neural repair. In this review, we provide a background to NSC biology, including the means to obtain them and the caveats to these processes. We then focus on advances in the proteomic interrogation of NSCs. This includes the analysis of posttranslational modifications (PTMs); approaches to analyzing different proteomic compartments, such as the secretome; as well as approaches to analyzing temporal differences in the proteome to elucidate mechanisms of differentiation. We also discuss some of the methods that will undoubtedly be useful in the investigation of NSCs but which have not yet been applied to the field. While many proteomics studies of NSCs have largely catalogued the proteome or posttranslational modifications of specific cellular states, without delving into specific functions, some have led to understandings of functional processes or identified markers that could not have been identified via other means. Many challenges remain in the field, including the precise identification and standardization of NSCs used for proteomic analyses, as well as how to translate fundamental proteomics studies to functional biology. The next level of investigation will require interdisciplinary approaches, combining the skills of those interested in the biochemistry of proteomics with those interested in modulating NSC function.
© 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
Aasebø, Elise; Forthun, Rakel B.; Berven, Frode; Selheim, Frode; Hernandez-Valladares, Maria
2016-01-01
The identification of protein biomarkers for acute myeloid leukemia (AML) that could find applications in AML diagnosis and prognosis, treatment and the selection for bone marrow transplant requires substantial comparative analyses of the proteomes from AML patients. In the past years, several studies have suggested some biomarkers for AML diagnosis or AML classification using methods for sample preparation with low proteome coverage and low resolution mass spectrometers. However, most of the studies did not follow up, confirm or validate their candidates with more patient samples. Current proteomics methods, new high resolution and fast mass spectrometers allow the identification and quantification of several thousands of proteins obtained from few tens of μg of AML cell lysate. Enrichment methods for posttranslational modifications (PTM), such as phosphorylation, can isolate several thousands of site-specific phosphorylated peptides from AML patient samples, which subsequently can be quantified with high confidence in new mass spectrometers. While recent reports aiming to propose proteomic or phosphoproteomic biomarkers on the studied AML patient samples have taken advantage of the technological progress, the access to large cohorts of AML patients to sample from and the availability of appropriate control samples still remain challenging. PMID:26306748
Morisawa, Hiraku; Hirota, Mikako; Toda, Tosifusa
2006-01-01
Background In the post-genome era, most research scientists working in the field of proteomics are confronted with difficulties in management of large volumes of data, which they are required to keep in formats suitable for subsequent data mining. Therefore, a well-developed open source laboratory information management system (LIMS) should be available for their proteomics research studies. Results We developed an open source LIMS appropriately customized for 2-D gel electrophoresis-based proteomics workflow. The main features of its design are compactness, flexibility and connectivity to public databases. It supports the handling of data imported from mass spectrometry software and 2-D gel image analysis software. The LIMS is equipped with the same input interface for 2-D gel information as a clickable map on public 2DPAGE databases. The LIMS allows researchers to follow their own experimental procedures by reviewing the illustrations of 2-D gel maps and well layouts on the digestion plates and MS sample plates. Conclusion Our new open source LIMS is now available as a basic model for proteome informatics, and is accessible for further improvement. We hope that many research scientists working in the field of proteomics will evaluate our LIMS and suggest ways in which it can be improved. PMID:17018156
Systematic Errors in Peptide and Protein Identification and Quantification by Modified Peptides
Bogdanow, Boris; Zauber, Henrik; Selbach, Matthias
2016-01-01
The principle of shotgun proteomics is to use peptide mass spectra in order to identify corresponding sequences in a protein database. The quality of peptide and protein identification and quantification critically depends on the sensitivity and specificity of this assignment process. Many peptides in proteomic samples carry biochemical modifications, and a large fraction of unassigned spectra arise from modified peptides. Spectra derived from modified peptides can erroneously be assigned to wrong amino acid sequences. However, the impact of this problem on proteomic data has not yet been investigated systematically. Here we use combinations of different database searches to show that modified peptides can be responsible for 20–50% of false positive identifications in deep proteomic data sets. These false positive hits are particularly problematic as they have significantly higher scores and higher intensities than other false positive matches. Furthermore, these wrong peptide assignments lead to hundreds of false protein identifications and systematic biases in protein quantification. We devise a “cleaned search” strategy to address this problem and show that this considerably improves the sensitivity and specificity of proteomic data. In summary, we show that modified peptides cause systematic errors in peptide and protein identification and quantification and should therefore be considered to further improve the quality of proteomic data annotation. PMID:27215553
Clinical proteomic analysis of scrub typhus infection.
Park, Edmond Changkyun; Lee, Sang-Yeop; Yun, Sung Ho; Choi, Chi-Won; Lee, Hayoung; Song, Hyun Seok; Jun, Sangmi; Kim, Gun-Hwa; Lee, Chang-Seop; Kim, Seung Il
2018-01-01
Scrub typhus is an acute and febrile infectious disease caused by the Gram-negative α-proteobacterium Orientia tsutsugamushi from the family Rickettsiaceae that is widely distributed in Northern, Southern and Eastern Asia. In the present study, we analysed the serum proteome of scrub typhus patients to investigate specific clinical protein patterns in an attempt to explain pathophysiology and discover potential biomarkers of infection. Serum samples were collected from three patients (before and after treatment with antibiotics) and three healthy subjects. One-dimensional sodium dodecyl sulphate-polyacrylamide gel electrophoresis followed by liquid chromatography-tandem mass spectrometry was performed to identify differentially abundant proteins using quantitative proteomic approaches. Bioinformatic analysis was then performed using Ingenuity Pathway Analysis. Proteomic analysis identified 236 serum proteins, of which 32 were differentially expressed in normal subjects, naive scrub typhus patients and patients treated with antibiotics. Comparative bioinformatic analysis of the identified proteins revealed up-regulation of proteins involved in immune responses, especially complement system, following infection with O. tsutsugamushi, and normal expression was largely rescued by antibiotic treatment. This is the first proteomic study of clinical serum samples from scrub typhus patients. Proteomic analysis identified changes in protein expression upon infection with O. tsutsugamushi and following antibiotic treatment. Our results provide valuable information for further investigation of scrub typhus therapy and diagnosis.
Nuez-Ortín, Waldo G; Carter, Chris G; Nichols, Peter D; Wilson, Richard
2016-07-01
Understanding diet- and environmentally induced physiological changes in fish larvae is a major goal for the aquaculture industry. Proteomic analysis of whole fish larvae comprising multiple tissues offers considerable potential but is challenging due to the very large dynamic range of protein abundance. To extend the coverage of the larval phase of the Atlantic salmon (Salmo salar) proteome, we applied a two-step sequential extraction (SE) method, based on differential protein solubility, using a nondenaturing buffer containing 150 mM NaCl followed by a denaturing buffer containing 7 M urea and 2 M thiourea. Extracts prepared using SE and one-step direct extraction were characterized via label-free shotgun proteomics using nanoLC-MS/MS (LTQ-Orbitrap). SE partitioned the proteins into two fractions of approximately equal amounts, but with very distinct protein composition, leading to identification of ∼40% more proteins than direct extraction. This fractionation strategy enabled the most detailed characterization of the salmon larval proteome to date and provides a platform for greater understanding of physiological changes in whole fish larvae. The MS data are available via the ProteomeXchange Consortium PRIDE partner repository, dataset PXD003366. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Wang, Jian; Anania, Veronica G.; Knott, Jeff; Rush, John; Lill, Jennie R.; Bourne, Philip E.; Bandeira, Nuno
2014-01-01
The combination of chemical cross-linking and mass spectrometry has recently been shown to constitute a powerful tool for studying protein–protein interactions and elucidating the structure of large protein complexes. However, computational methods for interpreting the complex MS/MS spectra from linked peptides are still in their infancy, making the high-throughput application of this approach largely impractical. Because of the lack of large annotated datasets, most current approaches do not capture the specific fragmentation patterns of linked peptides and therefore are not optimal for the identification of cross-linked peptides. Here we propose a generic approach to address this problem and demonstrate it using disulfide-bridged peptide libraries to (i) efficiently generate large mass spectral reference data for linked peptides at a low cost and (ii) automatically train an algorithm that can efficiently and accurately identify linked peptides from MS/MS spectra. We show that using this approach we were able to identify thousands of MS/MS spectra from disulfide-bridged peptides through comparison with proteome-scale sequence databases and significantly improve the sensitivity of cross-linked peptide identification. This allowed us to identify 60% more direct pairwise interactions between the protein subunits in the 20S proteasome complex than existing tools on cross-linking studies of the proteasome complexes. The basic framework of this approach and the MS/MS reference dataset generated should be valuable resources for the future development of new tools for the identification of linked peptides. PMID:24493012
Characterization of the canine urinary proteome.
Brandt, Laura E; Ehrhart, E J; Scherman, Hataichanok; Olver, Christine S; Bohn, Andrea A; Prenni, Jessica E
2014-06-01
Urine is an attractive biofluid for biomarker discovery as it is easy and minimally invasive to obtain. While numerous studies have focused on the characterization of human urine, much less research has focused on canine urine. The objectives of this study were to characterize the universal canine urinary proteome (both soluble and exosomal), to determine the overlap between the canine proteome and a representative human urinary proteome study, to generate a resource for future canine studies, and to determine the suitability of the dog as a large animal model for human diseases. The soluble and exosomal fractions of normal canine urine were characterized using liquid chromatography tandem mass spectrometry (LC-MS/MS). Biological Networks Gene Ontology (BiNGO) software was utilized to assign the canine urinary proteome to respective Gene Ontology categories, such as Cellular Component, Molecular Function, and Biological Process. Over 500 proteins were confidently identified in normal canine urine. Gene Ontology analysis revealed that exosomal proteins were largely derived from an intracellular location, while soluble proteins included both extracellular and membrane proteins. Exosome proteins were assigned to metabolic processes and localization, while soluble proteins were primarily annotated to specific localization processes. Several proteins identified in normal canine urine have previously been identified in human urine where these proteins are related to various extrarenal and renal diseases. The results of this study illustrate the potential of the dog as an animal model for human disease states and provide the framework for future studies of canine renal diseases. © 2014 American Society for Veterinary Clinical Pathology and European Society for Veterinary Clinical Pathology.
Schanzenbächer, Christoph T
2018-01-01
In homeostatic scaling at central synapses, the depth and breadth of cellular mechanisms that detect the offset from the set-point, detect the duration of the offset and implement a cellular response are not well understood. To understand the time-dependent scaling dynamics we treated cultured rat hippocampal cells with either TTX or bicuculline for 2 hr to induce the process of up- or down-scaling, respectively. During the activity manipulation we metabolically labeled newly synthesized proteins using BONCAT. We identified 168 newly synthesized proteins that exhibited significant changes in expression. To obtain a temporal trajectory of the response, we compared the proteins synthesized within 2 hr or 24 hr of the activity manipulation. Surprisingly, there was little overlap in the significantly regulated newly synthesized proteins identified in the early- and integrated late response datasets. There was, however, overlap in the functional categories that are modulated early and late. These data indicate that within protein function groups, different proteomic choices can be made to effect early and late homeostatic responses that detect the duration and polarity of the activity manipulation. PMID:29447110
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kolker, Eugene
Our project focused primarily on analysis of different types of data produced by global high-throughput technologies, data integration of gene annotation, and gene and protein expression information, as well as on getting a better functional annotation of Shewanella genes. Specifically, four of our numerous major activities and achievements include the development of: statistical models for identification and expression proteomics, superior to currently available approaches (including our own earlier ones); approaches to improve gene annotations on the whole-organism scale; standards for annotation, transcriptomics and proteomics approaches; and generalized approaches for data integration of gene annotation, gene and protein expression information.
Proteomic Analysis of the Cell Cycle of Procyclic Form Trypanosoma brucei.
Crozier, Thomas W M; Tinti, Michele; Wheeler, Richard J; Ly, Tony; Ferguson, Michael A J; Lamond, Angus I
2018-06-01
We describe a single-step centrifugal elutriation method to produce synchronous Gap1 (G1)-phase procyclic trypanosomes at a scale amenable for proteomic analysis of the cell cycle. Using ten-plex tandem mass tag (TMT) labeling and mass spectrometry (MS)-based proteomics technology, the expression levels of 5325 proteins were quantified across the cell cycle in this parasite. Of these, 384 proteins were classified as cell-cycle regulated and subdivided into nine clusters with distinct temporal regulation. These groups included many known cell cycle regulators in trypanosomes, which validates the approach. In addition, we identify 40 novel cell cycle regulated proteins that are essential for trypanosome survival and thus represent potential future drug targets for the prevention of trypanosomiasis. Through cross-comparison to the TrypTag endogenous tagging microscopy database, we were able to validate the cell-cycle regulated patterns of expression for many of the proteins of unknown function detected in our proteomic analysis. A convenient interface to access and interrogate these data is also presented, providing a useful resource for the scientific community. Data are available via ProteomeXchange with identifier PXD008741 (https://www.ebi.ac.uk/pride/archive/). © 2018 by The American Society for Biochemistry and Molecular Biology, Inc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Webb-Robertson, Bobbie-Jo M.; Matzke, Melissa M.; Jacobs, Jon M.
2011-12-01
Quantification of LC-MS peak intensities assigned during peptide identification in a typical comparative proteomics experiment will deviate from run-to-run of the instrument due to both technical and biological variation. Thus, normalization of peak intensities across a LC-MS proteomics dataset is a fundamental step in pre-processing. However, the downstream analysis of LC-MS proteomics data can be dramatically affected by the normalization method selected. Current normalization procedures for LC-MS proteomics data are presented in the context of normalization values derived from subsets of the full collection of identified peptides. The distribution of these normalization values is unknown a priori. If they are not independent from the biological factors associated with the experiment the normalization process can introduce bias into the data, which will affect downstream statistical biomarker discovery. We present a novel approach to evaluate normalization strategies, where a normalization strategy includes the peptide selection component associated with the derivation of normalization values. Our approach evaluates the effect of normalization on the between-group variance structure in order to identify candidate normalization strategies that improve the structure of the data without introducing bias into the normalized peak intensities.
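A normalization strategy in the sense described above pairs a peptide subset with a rule for deriving one normalization value per run. The sketch below illustrates the simplest such pairing, median centering of log2 intensities, with an optional reference subset. It is a generic example and not the authors' evaluation method; the data layout and the name `median_center` are assumptions.

```python
from statistics import median


def median_center(runs, reference_peptides=None):
    """Subtract a per-run normalization value from every log2 intensity.

    runs: dict mapping run name -> dict mapping peptide -> log2 intensity.
    reference_peptides: optional set of peptides used to derive the value;
    when omitted, all peptides observed in the run are used.
    """
    normalized = {}
    for run, peptides in runs.items():
        # The normalization value is the median over the selected subset.
        subset = [
            value
            for peptide, value in peptides.items()
            if reference_peptides is None or peptide in reference_peptides
        ]
        shift = median(subset)
        normalized[run] = {p: v - shift for p, v in peptides.items()}
    return normalized
```

If the per-run shifts differ systematically between experimental groups, the chosen subset is not independent of the biology, and centering on it would remove real signal; comparing the shifts across groups is therefore a cheap first check before adopting a strategy.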
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stekhoven, Daniel J.; Omasits, Ulrich; Quebatte, Maxime
2014-03-01
Proteomics data provide unique insights into biological systems, including the predominant subcellular localization (SCL) of proteins, which can reveal important clues about their functions. Here we analyzed data of a complete prokaryotic proteome expressed under two conditions mimicking interaction of the emerging pathogen Bartonella henselae with its mammalian host. Normalized spectral count data from cytoplasmic, total membrane, inner and outer membrane fractions allowed us to identify the predominant SCL for 82% of the identified proteins. The spectral count proportion of total membrane versus cytoplasmic fractions indicated the propensity of cytoplasmic proteins to co-fractionate with the inner membrane, and enabled us to distinguish cytoplasmic, peripheral inner membrane and bona fide inner membrane proteins. Principal component analysis and k-nearest neighbor classification training on selected marker proteins or predominantly localized proteins allowed us to determine an extensive catalog of at least 74 expressed outer membrane proteins, and to extend the SCL assignment to 94% of the identified proteins, including 18% where in silico methods gave no prediction. Suitable experimental proteomics data combined with straightforward computational approaches can thus identify the predominant SCL on a proteome-wide scale. Finally, we present a conceptual approach to identify proteins potentially changing their SCL in a condition-dependent fashion.
Pasculescu, Adrian; Schoof, Erwin M; Creixell, Pau; Zheng, Yong; Olhovsky, Marina; Tian, Ruijun; So, Jonathan; Vanderlaan, Rachel D; Pawson, Tony; Linding, Rune; Colwill, Karen
2014-04-04
A major challenge in mass spectrometry and other large-scale applications is how to handle, integrate, and model the data that is produced. Given the speed at which technology advances and the need to keep pace with biological experiments, we designed a computational platform, CoreFlow, which provides programmers with a framework to manage data in real-time. It allows users to upload data into a relational database (MySQL), and to create custom scripts in high-level languages such as R, Python, or Perl for processing, correcting and modeling this data. CoreFlow organizes these scripts into project-specific pipelines, tracks interdependencies between related tasks, and enables the generation of summary reports as well as publication-quality images. As a result, the gap between experimental and computational components of a typical large-scale biology project is reduced, decreasing the time between data generation, analysis and manuscript writing. CoreFlow is being released to the scientific community as an open-sourced software package complete with proteomics-specific examples, which include corrections for incomplete isotopic labeling of peptides (SILAC) or arginine-to-proline conversion, and modeling of multiple/selected reaction monitoring (MRM/SRM) results. CoreFlow was purposely designed as an environment for programmers to rapidly perform data analysis. These analyses are assembled into project-specific workflows that are readily shared with biologists to guide the next stages of experimentation. Its simple yet powerful interface provides a structure where scripts can be written and tested virtually simultaneously to shorten the life cycle of code development for a particular task. The scripts are exposed at every step so that a user can quickly see the relationships between the data, the assumptions that have been made, and the manipulations that have been performed. 
Since the scripts use commonly available programming languages, they can easily be transferred to and from other computational environments for debugging or faster processing. This focus on 'on the fly' analysis sets CoreFlow apart from other workflow applications that require wrapping of scripts into particular formats and development of specific user interfaces. Importantly, current and future releases of data analysis scripts in CoreFlow format will be of widespread benefit to the proteomics community, not only for uptake and use in individual labs, but to enable full scrutiny of all analysis steps, thus increasing experimental reproducibility and decreasing errors. This article is part of a Special Issue entitled: Can Proteomics Fill the Gap Between Genomics and Phenotypes? Copyright © 2014 Elsevier B.V. All rights reserved.
Harden, Charlotte J; Perez-Carrion, Kristine; Babakordi, Zara; Plummer, Sue F; Hepburn, Natalie; Barker, Margo E; Wright, Phillip C; Evans, Caroline A; Corfe, Bernard M
2012-06-06
Current measurement of appetite depends upon tools that are either subjective (visual analogue scales), or invasive (blood). Saliva is increasingly recognised as a valuable resource for biomarker analysis. Proteomics workflows may provide alternative means for the assessment of appetitive response. The study aimed to assess the potential value of the salivary proteome to detect novel biomarkers of appetite using an iTRAQ-based workflow. Diurnal variation of salivary protein concentrations was assessed. A randomised, controlled, crossover study examined the effects on the salivary proteome of isocaloric doses of various long chain fatty acid (LCFA) oil emulsions compared to no treatment (NT). Fasted males provided saliva samples before and following NT or dosing with LCFA emulsions. The oil component of the DHA emulsion contained predominantly docosahexaenoic acid and the oil component of OA contained predominantly oleic acid. Several proteins were present in significantly (p<0.05) different quantities in saliva samples taken following treatments compared to fasting samples. DHA caused alterations in thioredoxin and serpin B4 relative to OA and NT. A further study evaluated energy intake (EI) in response to LCFA in conjunction with subjective appetite scoring. DHA was associated with significantly lower EI relative to NT and OA (p=0.039). The collective data suggest investigation of salivary proteome may be of value in appetitive response. This article is part of a Special Issue entitled: Proteomics: The clinical link. Copyright © 2011 Elsevier B.V. All rights reserved.
Trauma-associated Human Neutrophil Alterations Revealed by Comparative Proteomics Profiling
Zhou, Jian-Ying; Krovvidi, Ravi K.; Gao, Yuqian; Gao, Hong; Petritis, Brianne O.; De, Asit; Miller-Graziano, Carol; Bankey, Paul E.; Petyuk, Vladislav A.; Nicora, Carrie D.; Clauss, Therese R; Moore, Ronald J.; Shi, Tujin; Brown, Joseph N.; Kaushal, Amit; Xiao, Wenzhong; Davis, Ronald W.; Maier, Ronald V.; Tompkins, Ronald G.; Qian, Wei-Jun; Camp, David G.; Smith, Richard D.
2013-01-01
PURPOSE Polymorphonuclear neutrophils (PMNs) play an important role in mediating the innate immune response after severe traumatic injury; however, the cellular proteome response to traumatic condition is still largely unknown. EXPERIMENTAL DESIGN We applied 2D-LC-MS/MS based shotgun proteomics to perform comparative proteome profiling of human PMNs from severe trauma patients and healthy controls. RESULTS A total of 197 out of ~2500 proteins (being identified with at least two peptides) were observed with significant abundance changes following the injury. The proteomics data were further compared with transcriptomics data for the same genes obtained from an independent patient cohort. The comparison showed that the protein abundance changes for the majority of proteins were consistent with the mRNA abundance changes in terms of directions of changes. Moreover, increased protein secretion was suggested as one of the mechanisms contributing to the observed discrepancy between protein and mRNA abundance changes. Functional analyses of the altered proteins showed that many of these proteins were involved in immune response, protein biosynthesis, protein transport, NRF2-mediated oxidative stress response, the ubiquitin-proteasome system, and apoptosis pathways. CONCLUSIONS AND CLINICAL RELEVANCE Our data suggest increased neutrophil activation and inhibited neutrophil apoptosis in response to trauma. The study not only reveals an overall picture of functional neutrophil response to trauma at the proteome level, but also provides a rich proteomics data resource of trauma-associated changes in the neutrophil that will be valuable for further studies of the functions of individual proteins in PMNs. PMID:23589343
Computational biology in the cloud: methods and new insights from computing at scale.
Kasson, Peter M
2013-01-01
The past few years have seen both explosions in the size of biological data sets and the proliferation of new, highly flexible on-demand computing capabilities. The sheer amount of information available from genomic and metagenomic sequencing, high-throughput proteomics, experimental and simulation datasets on molecular structure and dynamics affords an opportunity for greatly expanded insight, but it creates new challenges of scale for computation, storage, and interpretation of petascale data. Cloud computing resources have the potential to help solve these problems by offering a utility model of computing and storage: near-unlimited capacity, the ability to burst usage, and cheap and flexible payment models. Effective use of cloud computing on large biological datasets requires dealing with non-trivial problems of scale and robustness, since performance-limiting factors can change substantially when a dataset grows by a factor of 10,000 or more. New computing paradigms are thus often needed. The use of cloud platforms also creates new opportunities to share data, reduce duplication, and to provide easy reproducibility by making the datasets and computational methods easily available.
Direct digestion of proteins in living cells into peptides for proteomic analysis.
Chen, Qi; Yan, Guoquan; Gao, Mingxia; Zhang, Xiangmin
2015-01-01
To analyze the proteome of an extremely low number of cells or even a single cell, we established a new method of digesting whole cells into mass-spectrometry-identifiable peptides in a single step within 2 h. Our sampling method greatly simplified the processes of cell lysis, protein extraction, protein purification, and overnight digestion, without compromising efficiency. We used our method to digest hundred-scale cells. As far as we know, there is no report of proteome analysis starting directly with as few as 100 cells. We identified an average of 109 proteins from 100 cells, and with three replicates, the number of proteins rose to 204. Good reproducibility was achieved, showing stability and reliability of the method. Gene Ontology analysis revealed that proteins in different cellular compartments were well represented.
Li, Nan; Stein, Richard S L; He, Wei; Komives, Elizabeth; Wang, Wei
2013-10-01
Methylation is one of the important post-translational modifications that play critical roles in regulating protein functions. Proteomic identification of this post-translational modification and understanding how it affects protein activity remain great challenges. We tackled this problem from the aspect of methylation mediating protein-protein interaction. Using the chromodomain of human chromobox protein homolog 6 as a model system, we developed a systematic approach that integrates structure modeling, bioinformatics analysis, and peptide microarray experiments to identify lysine residues that are methylated and recognized by the chromodomain in the human proteome. Given the important role of chromobox protein homolog 6 as a reader of histone modifications, it was interesting to find that the majority of its interacting partners identified via this approach function in chromatin remodeling and transcriptional regulation. Our study not only illustrates a novel angle for identifying methyllysines on a proteome-wide scale and elucidating their potential roles in regulating protein function, but also suggests possible strategies for engineering the chromodomain-peptide interface to enhance the recognition of and manipulate the signal transduction mediated by such interactions.
Multiple testing corrections in quantitative proteomics: A useful but blunt tool.
Pascovici, Dana; Handler, David C L; Wu, Jemma X; Haynes, Paul A
2016-09-01
Multiple testing corrections are a useful tool for restricting the FDR, but can be blunt in the context of low power, as we demonstrate by a series of simple simulations. Unfortunately, in proteomics experiments low power can be common, driven by proteomics-specific issues like small effects due to ratio compression, and few replicates due to high reagent cost, limited instrument time and other issues; in such situations, most multiple testing correction methods, if used with conventional thresholds, will fail to detect any true positives even when many exist. In this low power, medium scale situation, other methods such as effect size considerations or peptide-level calculations may be a more effective option, even if they do not offer the same theoretical guarantee of a low FDR. Thus, we aim to highlight in this article that proteomics presents some specific challenges to the standard multiple testing correction methods, which should be employed as a useful tool but not be regarded as a required rubber stamp. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
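The low-power effect described in this abstract can be reproduced with a small deterministic sketch. It is not the authors' simulation: the p-values below are idealized, hypothetical constructions (50 true changes that all share one p-value, plus 950 nulls spread evenly over the unit interval), and the Benjamini-Hochberg procedure is used as one representative correction.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected at FDR level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k (1-based) whose sorted p-value satisfies p_(k) <= k/m * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below that cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected

# Hypothetical experiment: 1000 proteins, the first 50 truly changed.
nulls = [j / 950 for j in range(1, 951)]  # idealized uniform null p-values

# Low power: every true change only reaches p = 0.01.
low_power = [0.01] * 50 + nulls
bh_low = benjamini_hochberg(low_power)
uncorrected_low = [p <= 0.05 for p in low_power]

# Higher power: every true change reaches p = 0.0001.
high_power = [0.0001] * 50 + nulls
bh_high = benjamini_hochberg(high_power)

true_hits_low = sum(bh_low[:50])            # true positives kept after correction
true_hits_uncorrected = sum(uncorrected_low[:50])
true_hits_high = sum(bh_high[:50])
```

Under these assumptions the corrected low-power analysis rejects nothing, although all 50 true changes pass the uncorrected 0.05 threshold (together with 47 nulls), while the higher-power analysis recovers all 50 true changes after correction.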
Feist, Peter; Hummon, Amanda B.
2015-01-01
Proteins regulate many cellular functions, and analyzing the presence and abundance of proteins in biological samples is a central focus of proteomics. The discovery and validation of biomarkers, pathways, and drug targets for various diseases can be accomplished using mass spectrometry-based proteomics. However, with mass-limited samples like tumor biopsies, it can be challenging to obtain sufficient amounts of proteins to generate high-quality mass spectrometric data. Techniques developed for macroscale quantities recover sufficient amounts of protein from milligram quantities of starting material, but sample losses become crippling with these techniques when only microgram amounts of material are available. To combat this challenge, proteomicists have developed micro-scale techniques that are compatible with decreased sample size (100 μg or lower) and still enable excellent proteome coverage. Extraction, contaminant removal, protein quantitation, and sample handling techniques for the microgram protein range are reviewed here, with an emphasis on liquid chromatography and bottom-up mass spectrometry-compatible techniques. Also, a range of biological specimens, including mammalian tissues and model cell culture systems, are discussed. PMID:25664860
A proteome-scale map of the human interactome network
Rolland, Thomas; Taşan, Murat; Charloteaux, Benoit; Pevzner, Samuel J.; Zhong, Quan; Sahni, Nidhi; Yi, Song; Lemmens, Irma; Fontanillo, Celia; Mosca, Roberto; Kamburov, Atanas; Ghiassian, Susan D.; Yang, Xinping; Ghamsari, Lila; Balcha, Dawit; Begg, Bridget E.; Braun, Pascal; Brehme, Marc; Broly, Martin P.; Carvunis, Anne-Ruxandra; Convery-Zupan, Dan; Corominas, Roser; Coulombe-Huntington, Jasmin; Dann, Elizabeth; Dreze, Matija; Dricot, Amélie; Fan, Changyu; Franzosa, Eric; Gebreab, Fana; Gutierrez, Bryan J.; Hardy, Madeleine F.; Jin, Mike; Kang, Shuli; Kiros, Ruth; Lin, Guan Ning; Luck, Katja; MacWilliams, Andrew; Menche, Jörg; Murray, Ryan R.; Palagi, Alexandre; Poulin, Matthew M.; Rambout, Xavier; Rasla, John; Reichert, Patrick; Romero, Viviana; Ruyssinck, Elien; Sahalie, Julie M.; Scholz, Annemarie; Shah, Akash A.; Sharma, Amitabh; Shen, Yun; Spirohn, Kerstin; Tam, Stanley; Tejeda, Alexander O.; Trigg, Shelly A.; Twizere, Jean-Claude; Vega, Kerwin; Walsh, Jennifer; Cusick, Michael E.; Xia, Yu; Barabási, Albert-László; Iakoucheva, Lilia M.; Aloy, Patrick; De Las Rivas, Javier; Tavernier, Jan; Calderwood, Michael A.; Hill, David E.; Hao, Tong; Roth, Frederick P.; Vidal, Marc
2014-01-01
SUMMARY Just as reference genome sequences revolutionized human genetics, reference maps of interactome networks will be critical to fully understand genotype-phenotype relationships. Here, we describe a systematic map of ~14,000 high-quality human binary protein-protein interactions. At equal quality, this map is ~30% larger than what is available from small-scale studies published in the literature in the last few decades. While currently available information is highly biased and only covers a relatively small portion of the proteome, our systematic map appears strikingly more homogeneous, revealing a “broader” human interactome network than currently appreciated. The map also uncovers significant inter-connectivity between known and candidate cancer gene products, providing unbiased evidence for an expanded functional cancer landscape, while demonstrating how high quality interactome models will help “connect the dots” of the genomic revolution. PMID:25416956
Large-scale identification of c-MYC-associated proteins using a combined TAP/MudPIT approach.
Koch, Heike B; Zhang, Ru; Verdoodt, Berlinda; Bailey, Aaron; Zhang, Chang-Dong; Yates, John R; Menssen, Antje; Hermeking, Heiko
2007-01-15
The c-MYC oncogene encodes a transcription factor, which is sufficient and necessary for the induction of cellular proliferation. However, the c-MYC protein is a relatively weak transactivator suggesting that it may have other functions. To identify protein interactors which may reveal new functions or represent regulators of c-MYC we systematically identified proteins associated with c-MYC in vivo using a proteomic approach. We combined tandem affinity purification (TAP) with the mass spectral multidimensional protein identification technology (MudPIT). Thereby, 221 c-MYC-associated proteins were identified. Among them were 17 previously known c-MYC-interactors. Selected new c-MYC-associated proteins (DBC-1, FBX29, KU70, MCM7, Mi2-beta/CHD4, RNA Pol II, RFC2, RFC3, SV40 Large T Antigen, TCP1alpha, U5-116kD, ZNF281) were confirmed independently. For association with MCM7, SV40 Large T Antigen and DBC-1 the functionally important MYC-box II region was required, whereas FBX29 and Mi2-beta interacted via MYC-box II and the BR-HLH-LZ motif. In addition, regulators of c-MYC activity were identified: ectopic expression of FBX29, an E3 ubiquitin ligase, decreased c-MYC protein levels and inhibited c-MYC transactivation, whereas knock-down of FBX29 elevated the concentration of c-MYC. Furthermore, sucrose gradient analysis demonstrated that c-MYC is present in numerous complexes with varying size and composition, which may accommodate the large number of new c-MYC-associated proteins identified here and mediate the diverse functions of c-MYC. Our results suggest that c-MYC, besides acting as a mitogenic transcription factor, regulates cellular proliferation by direct association with protein complexes involved in multiple synthetic processes required for cell division, as for example DNA-replication/repair and RNA-processing. 
Furthermore, this first comprehensive description of the c-MYC-associated sub-proteome will facilitate further studies aimed to elucidate the biology of c-MYC.
McGorum, Bruce C; Pirie, R Scott; Eaton, Samantha L; Keen, John A; Cumyn, Elizabeth M; Arnott, Danielle M; Chen, Wenzhang; Lamont, Douglas J; Graham, Laura C; Llavero Hurtado, Maica; Pemberton, Alan; Wishart, Thomas M
2015-11-01
Equine grass sickness (EGS) is an acute, predominantly fatal, multiple system neuropathy of grazing horses with reported incidence rates of ∼2%. An apparently identical disease occurs in multiple species, including but not limited to cats, dogs, and rabbits. Although the precise etiology remains unclear, ultrastructural findings have suggested that the primary lesion lies in the glycoprotein biosynthetic pathway of specific neuronal populations. The goal of this study was therefore to identify the molecular processes underpinning neurodegeneration in EGS. Here, we use a bottom-up approach beginning with the application of modern proteomic tools to the analysis of cranial (superior) cervical ganglion (CCG, a consistently affected tissue) from EGS-affected patients and appropriate control cases postmortem. In what appears to be the first application of modern proteomic tools to equine neuronal tissues and/or to an inherent neurodegenerative disease of large animals (not a model of human disease), we identified 2,311 proteins in CCG extracts, with 320 proteins increased and 186 decreased by greater than 20% relative to controls. Further examination of selected proteomic candidates by quantitative fluorescent Western blotting (QFWB) and subcellular expression profiling by immunohistochemistry highlighted a previously unreported dysregulation in proteins commonly associated with protein misfolding/aggregation responses seen in a myriad of human neurodegenerative conditions, including but not limited to amyloid precursor protein (APP), microtubule associated protein (Tau), and multiple components of the ubiquitin proteasome system (UPS). 
Differentially expressed proteins eligible for in silico pathway analysis clustered predominantly into the following biofunctions: (1) diseases and disorders, including neurological disease and skeletal and muscular disorders, and (2) molecular and cellular functions, including cellular assembly and organization, cell-to-cell signaling and interaction (including epinephrine, dopamine, and adrenergic signaling and receptor function), and small molecule biochemistry. Interestingly, while the biofunctions identified in this study may represent pathways underpinning EGS-induced neurodegeneration, this is also the first demonstration of potential molecular conservation (including previously unreported dysregulation of the UPS and APP) spanning the degenerative cascades from an apparently unrelated condition of large animals, to small animal models with altered neuronal vulnerability, and human neurological conditions. Importantly, this study highlights the feasibility and benefits of applying modern proteomic techniques to veterinary investigations of neurodegenerative processes in diseases of large animals. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
Pressey, Joseph G.; Pressey, Christine S.; Robinson, Gloria; Herring, Richie; Wilson, Landon; Kelly, David R.; Kim, Helen
2011-01-01
To evaluate the consequences of expression of the protein encoded by PAX3-FOXO1 (P3F) in the pediatric malignancy alveolar rhabdomyosarcoma (A-RMS), we developed and evaluated a genetically defined in vitro model of A-RMS tumorigenesis. The expression of P3F in cooperation with simian virus 40 (SV40) Large-T (LT) antigen in murine C3H10T1/2 fibroblasts led to robust malignant transformation. Using 2 dimensional difference gel electrophoresis (2D-DIGE) we compared proteomes from lysates from cells that express P3F + LT versus from cells that express LT alone. Analysis of 2D gel spot patterns by DeCyder™ image analysis software indicated 93 spots that were different in abundance. Peptide mass fingerprint analysis of the 93 spots by matrix assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) analysis identified 37 non-redundant proteins. 2D DIGE analysis of cell culture media conditioned by cells transduced by P3F + LT versus by LT alone found 29 spots in the P3F + LT cells leading to the identification of 11 non-redundant proteins. A substantial number of proteins with potential roles in tumorigenesis and myogenesis were detected, most of which have not been identified in previous wide-scale expression studies of RMS experimental models or tumors. We validated the 2D gel image analysis findings by western blot analysis and immunohistochemistry (IHC). Thus, the 2D DIGE proteomics methodology described here provided an important discovery approach to the study of RMS biology and complements the findings of previous mRNA expression studies. PMID:21110518
Silva, Ana F; Carvalho, Gilda; Soares, Renata; Coelho, Ana V; Barreto Crespo, M Teresa
2012-08-01
Extracellular polymeric substances (EPS) are key to biomass aggregation and settleability in wastewater treatment systems. In membrane bioreactors (MBR), EPS are an important factor as they are considered to be largely responsible for membrane fouling. Proteins were shown to be the major component of EPS produced by activated sludge and to be correlated with the properties of the sludge, like settling, hydrophobicity and cell aggregation. Previous EPS proteomic studies of activated sludge revealed several problems, like the interference of other EPS molecules in protein analysis. In this study, a successful strategy was outlined to identify the proteins from soluble and bound EPS extracted from activated sludge of a lab-scale MBR. EPS samples were first subjected to pre-concentration through lyophilisation, centrifugal ultrafiltration or concentration with a dialysis membrane coated by a highly absorbent powder of polyacrylate-polyalcohol, preceded or not by a dialysis step. The highest protein concentration factors were achieved with the highly absorbent powder method without a previous dialysis step. Four protein precipitation methods were then tested: acetone, trichloroacetic acid (TCA), perchloric acid and a commercial kit. Protein profiles were compared in 4-12 % sodium dodecyl sulphate polyacrylamide gel electrophoresis gels. Both acetone and TCA should be applied for the highest coverage for soluble EPS proteins, whereas TCA was the best method for bound EPS proteins. All visible bands of selected profiles were subjected to mass spectrometry analysis. A high number of proteins (25-32 for soluble EPS and 17 for bound EPS) were identified. As a conclusion of this study, a workflow is proposed for the successful proteome characterisation of soluble and bound EPS from activated sludge samples.
Sharma, Minu; Sud, Amit; Kaur, Tanzeer; Tandon, Chanderdeep; Singla, S K
2016-09-01
Diminished mitochondrial activities were deemed to play an imperative role in surged oxidative damage perceived in hyperoxaluric renal tissue. Proteomics is particularly valuable to delineate the damaging effects of oxidative stress on mitochondrial proteins. The present study was designed to apply large-scale proteomics to describe systematically how mitochondrial proteins/pathways govern the renal damage and calcium oxalate crystal adhesion in hyperoxaluria. Furthermore, the potential beneficial effects of combinatorial therapy with N-acetylcysteine (NAC) and apocynin were studied to establish its credibility in the modulation of hyperoxaluria-induced alterations in mitochondrial proteins. In an experimental setup with male Wistar rats, five groups were designed for 9 d. At the end of the experiment, 24-h urine was collected and rats were euthanized. Urinary samples were analyzed for kidney injury marker and creatinine clearance. Transmission electron microscopy revealed distorted renal mitochondria in hyperoxaluria but combinatorial therapy restored the normal mitochondrial architecture. Mitochondria were isolated from renal tissue of experimental rats, and mitochondrial membrane potential was analyzed. The two-dimensional electrophoresis (2-DE) based comparative proteomic analysis was performed on proteins isolated from renal mitochondria. The results revealed eight differentially expressed mitochondrial proteins in hyperoxaluric rats, which were identified by Matrix-assisted laser desorption/ionization time of flight/time of flight (MALDI-TOF/TOF) analysis. The identified proteins included those involved in important mitochondrial processes, e.g. antioxidant defense, energy metabolism, and electron transport chain. Therapeutic administration of NAC with apocynin significantly expunged hyperoxaluria-induced discrepancy in the renal mitochondrial proteins, bringing them closer to the controls. 
The results provide insights to further understand the underlying mechanisms in the development of hyperoxaluria-induced nephrolithiasis and the therapeutic relevance of the combinatorial therapy.
A Computational Tool to Detect and Avoid Redundancy in Selected Reaction Monitoring
Röst, Hannes; Malmström, Lars; Aebersold, Ruedi
2012-01-01
Selected reaction monitoring (SRM), also called multiple reaction monitoring, has become an invaluable tool for targeted quantitative proteomic analyses, but its application can be compromised by nonoptimal selection of transitions. In particular, complex backgrounds may cause ambiguities in SRM measurement results because peptides with interfering transitions similar to those of the target peptide may be present in the sample. Here, we developed a computer program, the SRMCollider, that calculates nonredundant theoretical SRM assays, also known as unique ion signatures (UIS), for a given proteomic background. We show theoretically that UIS of three transitions suffice to conclusively identify 90% of all yeast peptides and 85% of all human peptides. Using predicted retention times, the SRMCollider also simulates time-scheduled SRM acquisition, which reduces the number of interferences to consider and leads to fewer transitions necessary to construct an assay. By integrating experimental fragment ion intensities from large scale proteome synthesis efforts (SRMAtlas) with the information content-based UIS, we combine two orthogonal approaches to create high quality SRM assays ready to be deployed. We provide a user friendly, open source implementation of an algorithm to calculate UIS of any order that can be accessed online at http://www.srmcollider.org to find interfering transitions. Finally, our tool can also simulate the specificity of novel data-independent MS acquisition methods in Q1–Q3 space. This allows us to predict parameters for these methods that deliver a specificity comparable with that of SRM. Using SRM interference information in addition to other sources of information can increase the confidence in an SRM measurement. We expect that the consideration of information content will become a standard step in SRM assay design and analysis, facilitated by the SRMCollider. PMID:22535207
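The idea of a unique ion signature can be sketched in a few lines. This is not the SRMCollider implementation; it is a minimal illustration that assumes a UIS of order n is a set of n target transitions that no single background peptide interferes with in full, and that interference means falling within fixed Q1 and Q3 tolerance windows. The m/z values, peptide names, tolerances, and function names are all hypothetical.

```python
from itertools import combinations

def interferes(target_tr, background_tr, q1_tol=0.5, q3_tol=0.5):
    """True if a background transition falls inside both the Q1 and Q3 windows of a target transition."""
    return (abs(target_tr[0] - background_tr[0]) <= q1_tol
            and abs(target_tr[1] - background_tr[1]) <= q3_tol)

def unique_ion_signatures(target, background, order, q1_tol=0.5, q3_tol=0.5):
    """List all combinations of `order` target transitions not fully shared with any one background peptide."""
    # For each target transition, collect the background peptides that interfere with it.
    hit_sets = [
        {name for name, transitions in background.items()
         if any(interferes(t, b, q1_tol, q3_tol) for b in transitions)}
        for t in target
    ]
    signatures = []
    for combo in combinations(range(len(target)), order):
        # A combination is unique if no background peptide hits every transition in it.
        shared = set.intersection(*(hit_sets[i] for i in combo))
        if not shared:
            signatures.append(tuple(target[i] for i in combo))
    return signatures

# Hypothetical (Q1, Q3) m/z pairs for one target peptide and two background peptides.
target = [(500.3, 600.4), (500.3, 700.5), (500.3, 800.6)]
background = {
    "background_peptide_A": [(500.1, 600.2), (500.1, 700.4)],
    "background_peptide_B": [(500.6, 800.9)],
}

uis_order_1 = unique_ion_signatures(target, background, order=1)
uis_order_2 = unique_ion_signatures(target, background, order=2)
uis_order_3 = unique_ion_signatures(target, background, order=3)
```

In this toy background every single transition has an interferer, so no UIS of order 1 exists, but two of the three pairs and the full triplet are unique. This mirrors the abstract's point that combining a few transitions can be enough to identify a peptide conclusively.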
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aryal, Uma K.; Callister, Stephen J.; Mishra, Sujata
2013-02-01
Cultures of the cyanobacterial genus Cyanothece have been shown to produce high levels of biohydrogen. These strains are diazotrophic and undergo pronounced diurnal cycles when grown under N2-fixing conditions in light-dark cycles. We seek to better understand the way in which proteins respond to these diurnal changes and we performed quantitative proteome analysis of Cyanothece ATCC 51142 and PCC 7822 grown under 8 different nutritional conditions. Nitrogenase expression was limited to N2-fixing conditions, and in the absence of glycerol, nitrogenase gene expression was linked to the dark period. However, glycerol induced expression of nitrogenase during part of the light period, together with cytochrome c oxidase (Cox), glycogen phosphorylase (Glp), and glycolytic and pentose-phosphate pathway (PPP) enzymes. This indicated that nitrogenase expression in the light was facilitated via higher respiration and glycogen breakdown. Key enzymes of the Calvin cycle were inhibited in Cyanothece ATCC 51142 in the presence of glycerol under H2 producing conditions, suggesting a competition between these sources of carbon. However, in Cyanothece PCC 7822, the Calvin cycle still played a role in cofactor recycling during H2 production. Our data comprise the first comprehensive profiling of proteome changes in Cyanothece PCC 7822, and allow an in-depth comparative analysis of major physiological and biochemical processes that influence H2-production in both strains. Our results revealed many previously uncharacterized proteins that may play a role in nitrogenase activity and in other metabolic pathways and may provide suitable targets for genetic manipulation that would lead to improvement of large scale H2 production.
Pressey, Joseph G; Pressey, Christine S; Robinson, Gloria; Herring, Richie; Wilson, Landon; Kelly, David R; Kim, Helen
2011-02-04
To evaluate the consequences of expression of the protein encoded by PAX3-FOXO1 (P3F) in the pediatric malignancy alveolar rhabdomyosarcoma (A-RMS), we developed and evaluated a genetically defined in vitro model of A-RMS tumorigenesis. The expression of P3F in cooperation with simian virus 40 (SV40) Large-T (LT) antigen in murine C3H10T1/2 fibroblasts led to robust malignant transformation. Using 2-dimensional-difference gel electrophoresis (2D-DIGE), we compared proteomes from lysates from cells that express P3F + LT versus from cells that express LT alone. Analysis of 2D gel spot patterns by DeCyder image analysis software indicated 93 spots that were different in abundance. Peptide mass fingerprint analysis of the 93 spots by matrix assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) analysis identified 37 nonredundant proteins. 2D-DIGE analysis of cell culture media conditioned by cells transduced by P3F + LT versus by LT alone found 29 spots in the P3F + LT cells leading to the identification of 11 nonredundant proteins. A substantial number of proteins with potential roles in tumorigenesis and myogenesis were detected, most of which have not been identified in previous wide-scale expression studies of RMS experimental models or tumors. We validated the 2D gel image analysis findings by Western blot analysis and immunohistochemistry (IHC). Thus, the 2D-DIGE proteomics methodology described here provided an important discovery approach to the study of RMS biology and complements the findings of previous mRNA expression studies.
Comparison of the large-scale periplasmic proteomes of the Escherichia coli K-12 and B strains.
Han, Mee-Jung; Kim, Jin Young; Kim, Jung A
2014-04-01
Escherichia coli typically secretes many proteins into the periplasmic space, and the periplasmic proteins have been used for the secretory production of various proteins by the biotechnology industry. However, the identity of all of the E. coli periplasmic proteins remains unknown. Here, high-resolution periplasmic proteome reference maps of the E. coli K-12 and B strains were constructed and compared. Of the 145 proteins identified by tandem mass spectrometry, 61 proteins were conserved in the two strains, whereas 11 and 12 strain-specific proteins were identified for the E. coli K-12 and B strains, respectively. In addition, 27 proteins exhibited differences in intensities greater than 2-fold between the K-12 and B strains. The periplasmic proteins MalE and OppA were the most abundant proteins in the two E. coli strains. Distinctive differences between the two strains included several proteins that were caused by genetic variations, such as CybC, FliC, FliY, KpsD, MglB, ModA, and Ybl119, hydrolytic enzymes, particularly phosphatases, glycosylases, and proteases, and many uncharacterized proteins. Compared to previous studies, the localization of many proteins, including 30 proteins for the K-12 strain and 53 proteins for the B strain, was newly identified as periplasmic. This study identifies the largest number of proteins in the E. coli periplasm as well as the dynamics of these proteins. Additionally, these findings are summarized as reference proteome maps that will be useful for studying protein secretion and may provide new strategies for the enhanced secretory production of recombinant proteins. Copyright © 2013. Published by Elsevier B.V.
Mapping Proteome-Wide Interactions of Reactive Chemicals Using Chemoproteomic Platforms
Counihan, Jessica L.; Ford, Breanna; Nomura, Daniel K.
2015-01-01
A large number of pharmaceuticals, endogenous metabolites, and environmental chemicals act through covalent mechanisms with protein targets. Yet, their specific interactions with the proteome still remain poorly defined for most of these reactive chemicals. Deciphering direct protein targets of reactive small-molecules is critical in understanding their biological action, off-target effects, potential toxicological liabilities, and development of safer and more selective agents. Chemoproteomic technologies have arisen as a powerful strategy that enable the assessment of proteome-wide interactions of these irreversible agents directly in complex biological systems. We review here several chemoproteomic strategies that have facilitated our understanding of specific protein interactions of irreversibly-acting pharmaceuticals, endogenous metabolites, and environmental electrophiles to reveal novel pharmacological, biological, and toxicological mechanisms. PMID:26647369
Xu, Yu; Wang, Hong; Nussinov, Ruth; Ma, Buyong
2013-01-01
We constructed and simulated a ‘minimal proteome’ model using Langevin dynamics. It contains 206 essential protein types which were compiled from the literature. For comparison, we generated six proteomes with randomized concentrations. We found that the net charges and molecular weights of the proteins in the minimal genome are not random. The net charge of a protein decreases linearly with molecular weight, with small proteins being mostly positively charged and large proteins negatively charged. The protein copy numbers in the minimal genome tend to maximize the number of protein-protein interactions in the network. Negatively charged proteins, which tend to be larger, can provide a large collision cross-section that allows them to interact with other proteins; the smaller, positively charged proteins, on the other hand, could diffuse faster and are more likely to collide with other proteins. Proteomes with random charge/mass populations form less stable clusters than those with experimental protein copy numbers. Our study suggests that ‘proper’ populations of negatively and positively charged proteins are important for maintaining a protein-protein interaction network in a proteome. It is interesting to note that the minimal genome model based on the charge and mass of E. coli may have a larger protein-protein interaction network than that based on the lower organism M. pneumoniae. PMID:23420643
The Proteins API: accessing key integrated protein and genome information
Antunes, Ricardo; Alpi, Emanuele; Gonzales, Leonardo; Liu, Wudong; Luo, Jie; Qi, Guoying; Turner, Edd
2017-01-01
The Proteins API provides searching and programmatic access to protein and associated genomics data such as curated protein sequence positional annotations from UniProtKB, as well as mapped variation and proteomics data from large scale data sources (LSS). Using the coordinates service, researchers are able to retrieve the genomic sequence coordinates for proteins in UniProtKB. The LSS genomics and proteomics data for UniProt proteins are programmatically available only through this service. A Swagger UI has been implemented to provide documentation and an interface that lets users with little or no programming experience ‘talk’ to the services, quickly and easily formulate queries, and obtain dynamically generated source code for popular programming languages, such as Java, Perl, Python and Ruby. Search results are returned as standard JSON, XML or GFF data objects. The Proteins API is a scalable, reliable, fast, easy-to-use RESTful service that provides a broad protein information resource, allowing users to ask questions based upon their field of expertise and to gain an integrated overview of the protein annotations available to aid their understanding of proteins in biological processes. The Proteins API is available at (http://www.ebi.ac.uk/proteins/api/doc). PMID:28383659
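As a rough illustration of the RESTful access pattern described above, the sketch below only builds a request and does not send it. The base URL is inferred from the documentation address given in the abstract, and the `service/accession` path layout, the example accession, and the helper name `build_request` are assumptions; the Swagger documentation should be consulted for the actual routes and parameters.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumed base URL, derived from the documentation address in the abstract.
BASE_URL = "https://www.ebi.ac.uk/proteins/api"


def build_request(service, accession, fmt="application/json"):
    """Build (but do not send) a GET request for one UniProtKB accession.

    `service` is a service name such as "coordinates"; the path layout
    BASE_URL/service/accession is an assumption made for this sketch.
    `fmt` is the media type requested through the Accept header.
    """
    url = f"{BASE_URL}/{quote(service)}/{quote(accession)}"
    return Request(url, headers={"Accept": fmt})


# Hypothetical usage: genomic coordinates for an example accession.
req = build_request("coordinates", "P05067")
print(req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the call; the response format would then follow the `Accept` header, in line with the JSON, XML and GFF outputs the abstract mentions.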
Discovering naturally processed antigenic determinants that confer protective T cell immunity
Gilchuk, Pavlo; Spencer, Charles T.; Conant, Stephanie B.; Hill, Timothy; Gray, Jennifer J.; Niu, Xinnan; Zheng, Mu; Erickson, John J.; Boyd, Kelli L.; McAfee, K. Jill; Oseroff, Carla; Hadrup, Sine R.; Bennink, Jack R.; Hildebrand, William; Edwards, Kathryn M.; Crowe, James E.; Williams, John V.; Buus, Søren; Sette, Alessandro; Schumacher, Ton N.M.; Link, Andrew J.; Joyce, Sebastian
2013-01-01
CD8+ T cells (TCD8) confer protective immunity against many infectious diseases, suggesting that microbial TCD8 determinants are promising vaccine targets. Nevertheless, current T cell antigen identification approaches do not discern which epitopes drive protective immunity during active infection — information that is critical for the rational design of TCD8-targeted vaccines. We employed a proteomics-based approach for large-scale discovery of naturally processed determinants derived from a complex pathogen, vaccinia virus (VACV), that are presented by the most frequent representatives of four major HLA class I supertypes. Immunologic characterization revealed that many previously unidentified VACV determinants were recognized by smallpox-vaccinated human peripheral blood cells in a variegated manner. Many such determinants were recognized by HLA class I–transgenic mouse immune TCD8 too and elicited protective TCD8 immunity against lethal intranasal VACV infection. Notably, efficient processing and stable presentation of immune determinants as well as the availability of naive TCD8 precursors were sufficient to drive a multifunctional, protective TCD8 response. Our approach uses fundamental insights into T cell epitope processing and presentation to define targets of protective TCD8 immunity within human pathogens that have complex proteomes, suggesting that this approach has general applicability in vaccine sciences. PMID:23543059
Discovering naturally processed antigenic determinants that confer protective T cell immunity.
Gilchuk, Pavlo; Spencer, Charles T; Conant, Stephanie B; Hill, Timothy; Gray, Jennifer J; Niu, Xinnan; Zheng, Mu; Erickson, John J; Boyd, Kelli L; McAfee, K Jill; Oseroff, Carla; Hadrup, Sine R; Bennink, Jack R; Hildebrand, William; Edwards, Kathryn M; Crowe, James E; Williams, John V; Buus, Søren; Sette, Alessandro; Schumacher, Ton N M; Link, Andrew J; Joyce, Sebastian
2013-05-01
CD8+ T cells (TCD8) confer protective immunity against many infectious diseases, suggesting that microbial TCD8 determinants are promising vaccine targets. Nevertheless, current T cell antigen identification approaches do not discern which epitopes drive protective immunity during active infection - information that is critical for the rational design of TCD8-targeted vaccines. We employed a proteomics-based approach for large-scale discovery of naturally processed determinants derived from a complex pathogen, vaccinia virus (VACV), that are presented by the most frequent representatives of four major HLA class I supertypes. Immunologic characterization revealed that many previously unidentified VACV determinants were recognized by smallpox-vaccinated human peripheral blood cells in a variegated manner. Many such determinants were recognized by HLA class I-transgenic mouse immune TCD8 too and elicited protective TCD8 immunity against lethal intranasal VACV infection. Notably, efficient processing and stable presentation of immune determinants as well as the availability of naive TCD8 precursors were sufficient to drive a multifunctional, protective TCD8 response. Our approach uses fundamental insights into T cell epitope processing and presentation to define targets of protective TCD8 immunity within human pathogens that have complex proteomes, suggesting that this approach has general applicability in vaccine sciences.
Review of software tools for design and analysis of large scale MRM proteomic datasets.
Colangelo, Christopher M; Chung, Lisa; Bruce, Can; Cheung, Kei-Hoi
2013-06-15
Selective or Multiple Reaction monitoring (SRM/MRM) is a liquid-chromatography (LC)/tandem-mass spectrometry (MS/MS) method that enables the quantitation of specific proteins in a sample by analyzing precursor ions and the fragment ions of their selected tryptic peptides. Instrumentation software has advanced to the point that thousands of transitions (pairs of primary and secondary m/z values) can be measured in a triple quadrupole instrument coupled to an LC, through well-designed scheduling and selection of m/z windows. The design of a good MRM assay relies on the availability of peptide spectra from previous discovery-phase LC-MS/MS studies. The tedium of manually developing and processing MRM assays involving thousands of transitions has spurred the development of software tools to automate this process. Software packages have been developed for project management, assay development, assay validation, data export, peak integration, quality assessment, and biostatistical analysis. No single tool provides a complete end-to-end solution; this article therefore reviews the current state and discusses future directions of these software tools so that researchers can combine them into a comprehensive targeted proteomics workflow. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
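To make the notion of a transition concrete, the following sketch computes (precursor m/z, fragment m/z) pairs for an unmodified peptide from standard monoisotopic residue masses. It is a simplified illustration, not taken from any of the reviewed tools: only singly charged y ions are generated, the residue table is deliberately partial, and the function names are hypothetical.

```python
# Monoisotopic residue masses (Da) for a few amino acids; partial table.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "L": 113.08406, "K": 128.09496, "R": 156.10111,
}
WATER = 18.01056   # mass of H2O
PROTON = 1.00728   # mass of a proton


def precursor_mz(peptide, charge):
    """m/z of the intact peptide at the given positive charge state."""
    mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (mass + charge * PROTON) / charge


def y_ion_mz(peptide, n):
    """m/z of the singly charged y ion made of the last n residues."""
    return sum(RESIDUE_MASS[aa] for aa in peptide[-n:]) + WATER + PROTON


def transitions(peptide, charge=2):
    """List (Q1, Q3) pairs: one precursor m/z paired with each y ion."""
    q1 = precursor_mz(peptide, charge)
    return [(q1, y_ion_mz(peptide, n)) for n in range(1, len(peptide))]


pairs = transitions("GAK")
```

Each tuple corresponds to one "pair of primary and secondary m/z values" in the sense used above; a real assay would additionally choose charge states, ion types and collision energies.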
Glycoproteins Enrichment and LC-MS/MS Glycoproteomics in Central Nervous System Applications.
Zhu, Rui; Song, Ehwang; Hussein, Ahmed; Kobeissy, Firas H; Mechref, Yehia
2017-01-01
Proteins and glycoproteins play important biological roles in central nervous systems (CNS). Qualitative and quantitative evaluation of proteins and glycoproteins expression in CNS is critical to reveal the inherent biomolecular mechanism of CNS diseases. This chapter describes proteomic and glycoproteomic approaches based on liquid chromatography/tandem mass spectrometry (LC-MS or LC-MS/MS) for the qualitative and quantitative assessment of proteins and glycoproteins expressed in CNS. Proteins and glycoproteins, extracted by a mass spectrometry friendly surfactant from CNS samples, were subjected to enzymatic (tryptic) digestion and three down-stream analyses: (1) a nano LC system coupled with a high-resolution MS instrument to achieve qualitative proteomic profile, (2) a nano LC system combined with a triple quadrupole MS to quantify identified proteins, and (3) glycoprotein enrichment prior to LC-MS/MS analysis. Enrichment techniques can be applied to improve coverage of low abundant glycopeptides/glycoproteins. An example described in this chapter is hydrophilic interaction liquid chromatographic (HILIC) enrichment to capture glycopeptides, allowing efficient removal of peptides. The combination of three LC-MS/MS-based approaches is capable of the investigation of large-scale proteins and glycoproteins from CNS with an in-depth coverage, thus offering a full view of proteins and glycoproteins changes in CNS.
Yu, Wen; Taylor, J Alex; Davis, Michael T; Bonilla, Leo E; Lee, Kimberly A; Auger, Paul L; Farnsworth, Chris C; Welcher, Andrew A; Patterson, Scott D
2010-03-01
Despite recent advances in qualitative proteomics, the automatic identification of peptides with optimal sensitivity and accuracy remains a difficult goal. To address this deficiency, a novel algorithm, Multiple Search Engines, Normalization and Consensus, is described. The method employs six search engines and a re-scoring engine to search MS/MS spectra against protein and decoy sequences. After the peptide hits from each engine are normalized to error rates estimated from the decoy hits, peptide assignments are deduced using a minimum consensus model. These assignments are produced in a series of progressively relaxed false-discovery rates, thus enabling a comprehensive interpretation of the data set. Additionally, the estimated false-discovery rate was found to have good concordance with the observed false-positive rate calculated from known identities. Benchmarking against standard protein data sets (ISBv1, sPRG2006) and their published analyses demonstrated that the Multiple Search Engines, Normalization and Consensus algorithm consistently achieved significantly higher sensitivity in peptide identifications, which led to increased or more robust protein identifications in all data sets compared with prior methods. The sensitivity and the false-positive rate of peptide identification exhibit an inverse-proportional and linear relationship with the number of participating search engines.
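The two ingredients named above, decoy-based error estimation and a minimum consensus rule, can be sketched as follows. This is not the published algorithm: the simple decoys-over-targets estimate, the majority-vote rule, the `min_votes` parameter and all names are assumptions chosen to show the general shape of the approach.

```python
from collections import Counter


def fdr_at_threshold(hits, threshold):
    """Estimate the FDR among hits scoring at or above `threshold`.

    hits -- list of (score, is_decoy) tuples from one search engine.
    Uses the simple decoys/targets estimate (an assumption of this sketch).
    """
    targets = sum(1 for score, decoy in hits if score >= threshold and not decoy)
    decoys = sum(1 for score, decoy in hits if score >= threshold and decoy)
    return decoys / targets if targets else 0.0


def consensus(engine_results, min_votes=2):
    """Keep a spectrum's top-voted peptide if at least `min_votes` engines agree.

    engine_results -- one dict per engine mapping spectrum id -> peptide.
    """
    votes = {}
    for result in engine_results:
        for spectrum, peptide in result.items():
            votes.setdefault(spectrum, Counter())[peptide] += 1
    accepted = {}
    for spectrum, counter in votes.items():
        peptide, n = counter.most_common(1)[0]
        if n >= min_votes:
            accepted[spectrum] = peptide
    return accepted


# Toy inputs (invented for illustration only).
hits = [(9.0, False), (8.0, False), (7.5, True), (7.0, False), (3.0, True)]
engines = [
    {"s1": "PEPTIDEK", "s2": "AAAK"},
    {"s1": "PEPTIDEK", "s2": "CCCK"},
    {"s1": "PEPTIDEK"},
]
agreed = consensus(engines, min_votes=2)
```

Evaluating `fdr_at_threshold` over a descending series of thresholds gives the kind of progressively relaxed false-discovery rates the abstract refers to.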
Sun, Yi; Yang, Yixuan; Zeng, Sicong; Tan, Yueqiu; Lu, Guangxiu; Lin, Ge
2014-01-01
Previous reports have demonstrated that human embryonic stem cells (hESCs) tend to develop genomic alterations and progress to a malignant state during long-term in vitro culture. This raises concerns about the clinical safety of using cultured hESCs. However, transformed hESCs might serve as an excellent model to determine the process of embryonic stem cell transition. In this study, iTRAQ-based tandem mass spectrometry was used to quantify proteins in hESCs with normal and aberrant karyotypes, ranging from simple to more complex karyotypic abnormalities. We identified and quantified 2583 proteins, and found that the expression levels of 316 proteins that represented at least 23 functional molecular groups were significantly different in both normal and abnormal hESCs. Dysregulated protein expression in epigenetic regulation was further verified in six pairs of hESC lines in early and late passage. In summary, this study is the first large-scale quantitative proteomic analysis of the malignant transformation of aberrant karyotypic hESCs. The data generated should serve as a useful reference for stem cell-derived tumor progression. Increased expression of both HDAC2 and CTNNB1 is detected as early as the pre-neoplastic stage, and might serve as a prognostic marker in the malignant transformation of hESCs. PMID:24465727
Janssen, K A; Sidoli, S; Garcia, B A
2017-01-01
Functional epigenetic regulation occurs by dynamic modification of chromatin, including genetic material (i.e., DNA methylation), histone proteins, and other nuclear proteins. Due to the highly complex nature of the histone code, mass spectrometry (MS) has become the leading technique in identification of single and combinatorial histone modifications. MS has now overcome antibody-based strategies due to its automation, high resolution, and accurate quantitation. Moreover, multiple approaches to analysis have been developed for global quantitation of posttranslational modifications (PTMs), including large-scale characterization of modification coexistence (middle-down and top-down proteomics), which is not currently possible with any other biochemical strategy. Recently, our group and others have simplified and increased the effectiveness of analyzing histone PTMs by improving multiple MS methods and data analysis tools. This review provides an overview of the major achievements in the analysis of histone PTMs using MS with a focus on the most recent improvements. We speculate that the workflow for histone analysis at its state of the art is highly reliable in terms of identification and quantitation accuracy, and it has the potential to become a routine method for systems biology thanks to the possibility of integrating histone MS results with genomics and proteomics datasets. © 2017 Elsevier Inc. All rights reserved.
Identification of urine protein biomarkers with the potential for early detection of lung cancer.
Zhang, Hongjuan; Cao, Jing; Li, Lin; Liu, Yanbin; Zhao, Hong; Li, Nan; Li, Bo; Zhang, Aiqun; Huang, Huanwei; Chen, She; Dong, Mengqiu; Yu, Lei; Zhang, Jian; Chen, Liang
2015-07-02
Lung cancer is the leading cause of cancer-related deaths and has an overall 5-year survival rate lower than 15%. Large-scale clinical trials have demonstrated a significant relative reduction in mortality in high-risk individuals with low-dose computed tomography screening. However, biomarkers capable of identifying the most at-risk population and detecting lung cancer before it becomes clinically apparent are urgently needed in the clinic. Here, we report the identification of urine biomarkers capable of detecting lung cancer. Using the well-characterized inducible Kras (G12D) mouse model of lung cancer, we identified alterations in the urine proteome in tumor-bearing mice compared with sibling controls. Marked differences at the proteomic level were also detected between the urine of patients and that of healthy population controls. Importantly, we identified 7 proteins commonly found to be significantly up-regulated in both tumor-bearing mice and patients. In an independent cohort, we showed that 2 of the 7 proteins were up-regulated in urine samples from lung cancer patients but not in those from controls. The kinetics of these proteins correlated with the disease state in the mouse model. These tumor biomarkers could potentially aid in the early detection of lung cancer.
Lysine acetylome profiling uncovers novel histone deacetylase substrate proteins in Arabidopsis.
Hartl, Markus; Füßl, Magdalena; Boersema, Paul J; Jost, Jan-Oliver; Kramer, Katharina; Bakirbas, Ahmet; Sindlinger, Julia; Plöchinger, Magdalena; Leister, Dario; Uhrig, Glen; Moorhead, Greg Bg; Cox, Jürgen; Salvucci, Michael E; Schwarzer, Dirk; Mann, Matthias; Finkemeier, Iris
2017-10-23
Histone deacetylases have central functions in regulating stress defenses and development in plants. However, the knowledge about the deacetylase functions is largely limited to histones, although these enzymes were found in diverse subcellular compartments. In this study, we determined the proteome-wide signatures of the RPD3/HDA1 class of histone deacetylases in Arabidopsis. Relative quantification of the changes in the lysine acetylation levels was determined on a proteome-wide scale after treatment of Arabidopsis leaves with deacetylase inhibitors apicidin and trichostatin A. We identified 91 new acetylated candidate proteins other than histones, which are potential substrates of the RPD3/HDA1-like histone deacetylases in Arabidopsis, of which at least 30 of these proteins function in nucleic acid binding. Furthermore, our analysis revealed that histone deacetylase 14 (HDA14) is the first organellar-localized RPD3/HDA1 class protein found to reside in the chloroplasts and that the majority of its protein targets have functions in photosynthesis. Finally, the analysis of HDA14 loss-of-function mutants revealed that the activation state of RuBisCO is controlled by lysine acetylation of RuBisCO activase under low-light conditions. © 2017 The Authors. Published under the terms of the CC BY 4.0 license.
Säll, Anna; Walle, Maria; Wingren, Christer; Müller, Susanne; Nyman, Tomas; Vala, Andrea; Ohlin, Mats; Borrebaeck, Carl A K; Persson, Helena
2016-10-01
Antibody-based proteomics offers distinct advantages in the analysis of complex samples for discovery and validation of biomarkers associated with disease. However, its large-scale implementation requires tools and technologies that allow development of suitable antibody or antibody fragments in a high-throughput manner. To address this we designed and constructed two human synthetic antibody fragment (scFv) libraries denoted HelL-11 and HelL-13. By the use of phage display technology, in total 466 unique scFv antibodies specific for 114 different antigens were generated. The specificities of these antibodies were analyzed in a variety of immunochemical assays and a subset was further evaluated for functionality in protein microarray applications. This high-throughput approach demonstrates the ability to rapidly generate a wealth of reagents not only for proteome research, but potentially also for diagnostics and therapeutics. In addition, this work provides a great example on how a synthetic approach can be used to optimize library designs. By having precise control of the diversity introduced into the antigen-binding sites, synthetic libraries offer increased understanding of how different diversity contributes to antibody binding reactivity and stability, thereby providing the key to future library optimization. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Waters, Katrina M.; Liu, Tao; Quesenberry, Ryan D.; Willse, Alan R.; Bandyopadhyay, Somnath; Kathmann, Loel E.; Weber, Thomas J.; Smith, Richard D.; Wiley, H. Steven; Thrall, Brian D.
2012-01-01
To understand how integration of multiple data types can help decipher cellular responses at the systems level, we analyzed the mitogenic response of human mammary epithelial cells to epidermal growth factor (EGF) using whole genome microarrays, mass spectrometry-based proteomics and large-scale western blots with over 1000 antibodies. A time course analysis revealed significant differences in the expression of 3172 genes and 596 proteins, including protein phosphorylation changes measured by western blot. Integration of these disparate data types showed that each contributed qualitatively different components to the observed cell response to EGF and that varying degrees of concordance in gene expression and protein abundance measurements could be linked to specific biological processes. Networks inferred from individual data types were relatively limited, whereas networks derived from the integrated data recapitulated the known major cellular responses to EGF and exhibited more highly connected signaling nodes than networks derived from any individual dataset. While cell cycle regulatory pathways were altered as anticipated, we found the most robust response to mitogenic concentrations of EGF was induction of matrix metalloprotease cascades, highlighting the importance of the EGFR system as a regulator of the extracellular environment. These results demonstrate the value of integrating multiple levels of biological information to more accurately reconstruct networks of cellular response. PMID:22479638
Liu, Cheng; Lin, Jen-Jie; Yang, Zih-Yan; Tsai, Chi-Chu; Hsu, Jue-Liang; Wu, Yu-Jen
2014-12-03
Gallic acid (GA) has long been associated with a wide range of biological activities. In this study, its antitumor effect against B16F10 melanoma cells was demonstrated by MTT assay, cell migration assay, wound-healing assay, and flow cytometric analysis. GA with a concentration >200 μM shows apoptotic activity toward B16F10 cells. According to Western blotting data, overexpressions of cleaved forms of caspase-9, caspase-3, and PARP-1 and pro-apoptotic Bax and Bad, accompanied by underexpressed anti-apoptotic Bcl-2 and Bcl-xL indicate that GA induces B16F10 cell apoptosis via mitochondrial pathway. The 2-DE based comparative proteomics was further employed in B16F10 cells with and without GA treatment for a large-scale protein expression profiling. A total of 41 differential protein spots were quantified, and their identities were characterized using LC-MS/MS analysis and database matching. In addition to some regulated proteins that were associated with apoptosis, interestingly, some identified proteins involved in glycolysis such as glucokinase, α-enolase, aldolase, pyruvate kinase, and GAPDH were simultaneously up-regulated, which reveals that the GA-induced cellular apoptosis in B16 melanoma cells is associated with metabolic glycolysis.
Two flagellar BAR domain proteins in Trypanosoma brucei with stage-specific regulation
Cicova, Zdenka; Dejung, Mario; Skalicky, Tomas; Eisenhuth, Nicole; Hanselmann, Steffen; Morriswood, Brooke; Figueiredo, Luisa M.; Butter, Falk; Janzen, Christian J.
2016-01-01
Trypanosomes are masters of adaptation to different host environments during their complex life cycle. Large-scale proteomic approaches provide information on changes at the cellular level, and in a systematic way. However, detailed work on single components is necessary to understand the adaptation mechanisms on a molecular level. Here, we have performed a detailed characterization of a bloodstream form (BSF) stage-specific putative flagellar host adaptation factor Tb927.11.2400, identified previously in a SILAC-based comparative proteome study. Tb927.11.2400 shares 38% amino acid identity with TbFlabarin (Tb927.11.2410), a procyclic form (PCF) stage-specific flagellar BAR domain protein. We named Tb927.11.2400 TbFlabarin-like (TbFlabarinL), and demonstrate that it originates from a gene duplication event, which occurred in the African trypanosomes. TbFlabarinL is not essential for the growth of the parasites under cell culture conditions and it is dispensable for developmental differentiation from BSF to the PCF in vitro. We generated TbFlabarinL-specific antibodies, and showed that it localizes in the flagellum. Co-immunoprecipitation experiments together with a biochemical cell fractionation suggest a dual association of TbFlabarinL with the flagellar membrane and the components of the paraflagellar rod. PMID:27779220
The Proteins API: accessing key integrated protein and genome information.
Nightingale, Andrew; Antunes, Ricardo; Alpi, Emanuele; Bursteinas, Borisas; Gonzales, Leonardo; Liu, Wudong; Luo, Jie; Qi, Guoying; Turner, Edd; Martin, Maria
2017-07-03
The Proteins API provides searching and programmatic access to protein and associated genomics data such as curated protein sequence positional annotations from UniProtKB, as well as mapped variation and proteomics data from large scale data sources (LSS). Using the coordinates service, researchers are able to retrieve the genomic sequence coordinates for proteins in UniProtKB. The LSS genomics and proteomics data for UniProt proteins are programmatically available only through this service. A Swagger UI has been implemented to provide documentation and an interface that lets users with little or no programming experience 'talk' to the services, quickly and easily formulate queries, and obtain dynamically generated source code for popular programming languages, such as Java, Perl, Python and Ruby. Search results are returned as standard JSON, XML or GFF data objects. The Proteins API is a scalable, reliable, fast, easy-to-use RESTful service that provides a broad protein information resource, allowing users to ask questions based upon their field of expertise and to gain an integrated overview of the protein annotations available to aid their understanding of proteins in biological processes. The Proteins API is available at (http://www.ebi.ac.uk/proteins/api/doc). © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Hart, Thomas; Dider, Shihab; Han, Weiwei; Xu, Hua; Zhao, Zhongming; Xie, Lei
2016-01-01
Metformin, a drug prescribed to treat type-2 diabetes, exhibits anti-cancer effects in a portion of patients, but the direct molecular and genetic interactions leading to this pleiotropic effect have not yet been fully explored. To repurpose metformin as a precision anti-cancer therapy, we have developed a novel structural systems pharmacology approach to elucidate metformin’s molecular basis and genetic biomarkers of action. We integrated structural proteome-scale drug target identification with network biology analysis by combining structural genomic, functional genomic, and interactomic data. Through searching the human structural proteome, we identified twenty putative metformin binding targets and their interaction models. We experimentally verified the interactions between metformin and our top-ranked kinase targets. Notably, kinases, particularly SGK1 and EGFR were identified as key molecular targets of metformin. Subsequently, we linked these putative binding targets to genes that do not directly bind to metformin but whose expressions are altered by metformin through protein-protein interactions, and identified network biomarkers of phenotypic response of metformin. The molecular targets and the key nodes in genetic networks are largely consistent with the existing experimental evidence. Their interactions can be affected by the observed cancer mutations. This study will shed new light into repurposing metformin for safe, effective, personalized therapies. PMID:26841718
Functional proteomics within the genus Lactobacillus.
De Angelis, Maria; Calasso, Maria; Cavallo, Noemi; Di Cagno, Raffaella; Gobbetti, Marco
2016-03-01
Lactobacillus are mainly used for the manufacture of fermented dairy, sourdough, meat, and vegetable foods or used as probiotics. Under optimal processing conditions, Lactobacillus strains contribute to food functionality through their enzyme portfolio and the release of metabolites. An extensive genomic diversity analysis was conducted to elucidate the core features of the genus Lactobacillus, and to provide a better comprehension of niche adaptation of the strains. However, proteomics is an indispensable "omics" science to elucidate the proteome diversity, and the mechanisms of regulation and adaptation of Lactobacillus strains. This review focuses on the novel and comprehensive knowledge of functional proteomics and metaproteomics of Lactobacillus species. A large list of proteomic case studies of different Lactobacillus species is provided to illustrate the adaptability of the main metabolic pathways (e.g., carbohydrate transport and metabolism, pyruvate metabolism, proteolytic system, amino acid metabolism, and protein synthesis) to various life conditions. These investigations have highlighted that lactobacilli modulate the level of a complex panel of proteins to grow/survive in different ecological niches. In addition to the general regulation and stress response, specific metabolic pathways can be switched on and off, modifying the behavior of the strains. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Proteomics technique opens new frontiers in mobilome research.
Davidson, Andrew D; Matthews, David A; Maringer, Kevin
2017-01-01
A large proportion of the genome of most eukaryotic organisms consists of highly repetitive mobile genetic elements. The sum of these elements is called the "mobilome," which in eukaryotes is made up mostly of transposons. Transposable elements contribute to disease, evolution, and normal physiology by mediating genetic rearrangement, and through the "domestication" of transposon proteins for cellular functions. Although 'omics studies of mobilome genomes and transcriptomes are common, technical challenges have hampered high-throughput global proteomics analyses of transposons. In a recent paper, we overcame these technical hurdles using a technique called "proteomics informed by transcriptomics" (PIT), and thus published the first unbiased global mobilome-derived proteome for any organism (using cell lines derived from the mosquito Aedes aegypti). In this commentary, we describe our methods in more detail, and summarise our major findings. We also use new genome sequencing data to show that, in many cases, the specific genomic element expressing a given protein can be identified using PIT. This proteomic technique therefore represents an important technological advance that will open new avenues of research into the role that proteins derived from transposons and other repetitive and sequence diverse genetic elements, such as endogenous retroviruses, play in health and disease.
The pepATTRACT web server for blind, large-scale peptide-protein docking.
de Vries, Sjoerd J; Rey, Julien; Schindler, Christina E M; Zacharias, Martin; Tuffery, Pierre
2017-07-03
Peptide-protein interactions are ubiquitous in the cell and form an important part of the interactome. Computational docking methods can complement experimental characterization of these complexes, but current protocols are not applicable on the proteome scale. pepATTRACT is a novel docking protocol that is fully blind, i.e. it does not require any information about the binding site. In various stages of its development, pepATTRACT has participated in CAPRI, making successful predictions for five out of seven protein-peptide targets. Its performance is similar to or better than that of state-of-the-art local docking protocols that do require binding site information. Here we present a novel web server that carries out the rigid-body stage of pepATTRACT. On the peptiDB benchmark, the web server generates a correct model in the top 50 in 34% of the cases. Compared to the full pepATTRACT protocol, this leads to some loss of performance, but the computation time is reduced from ∼18 h to ∼10 min. Combined with the fact that it is fully blind, this makes the web server well-suited for large-scale in silico protein-peptide docking experiments. The rigid-body pepATTRACT server is freely available at http://bioserv.rpbs.univ-paris-diderot.fr/services/pepATTRACT. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
The Impact of the Glomerular Filtration Rate on the Human Plasma Proteome.
Christensson, Anders; Ash, Jessica A; DeLisle, Robert K; Gaspar, Fraser W; Ostroff, Rachel; Grubb, Anders; Lindström, Veronica; Bruun, Laila; Williams, Steve A
2018-05-01
The application of proteomics in chronic kidney disease (CKD) can potentially uncover biomarkers and pathways that are predictive of disease. Within this context, this study examines the relationship between the human plasma proteome and glomerular filtration rate (GFR) as measured by iohexol clearance in a cohort from Sweden (n = 389; GFR range: 8-100 mL min⁻¹/1.73 m²). A total of 2893 proteins are quantified using a modified aptamer assay. A large proportion of the proteome is associated with GFR, reinforcing the concept that CKD affects multiple physiological systems (individual protein-GFR correlations listed here). Of these, cystatin C shows the most significant correlation with GFR (rho = -0.85, p = 1.2 × 10⁻⁹⁷), establishing strong validation for the use of this biomarker in CKD diagnostics. Among the other highly significant protein markers are insulin-like growth factor-binding protein 6, neuroblastoma suppressor of tumorigenicity 1, follistatin-related protein 3, trefoil factor 3, and beta-2 microglobulin. These proteins may indicate an imbalance in homeostasis across a variety of cellular processes, which may be underlying renal dysfunction. Overall, this study represents the most extensive characterization of the plasma proteome and its relation to GFR to date, and suggests the diagnostic and prognostic value of proteomics for CKD across all stages. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Kitata, Reta Birhanu; Dimayacyac-Esleta, Baby Rorielyn T.; Choong, Wai-Kok; Tsai, Chia-Feng; Lin, Tai-Du; Tsou, Chih-Chiang; Weng, Shao-Hsing; Chen, Yi-Ju; Yang, Pan-Chyr; Arco, Susan D.; Nesvizhskii, Alexey I.; Sung, Ting-Yi; Chen, Yu-Ju
2016-01-01
Despite significant efforts in the past decade towards complete mapping of the human proteome, 3564 proteins (neXtProt, 09-2014) are still “missing proteins”. Over one-third of these missing proteins are annotated as membrane proteins, owing to their relatively challenging accessibility with standard shotgun proteomics. Using non-small cell lung cancer (NSCLC) as a model study, we aim to mine missing proteins from disease-associated membrane proteome, which may be still largely under-represented. To increase identification coverage, we employed Hp-RP StageTip pre-fractionation of membrane-enriched samples from 11 NSCLC cell lines. Analysis of membrane samples from 20 pairs of tumor and adjacent normal lung tissue was incorporated to include physiologically expressed membrane proteins. Using multiple search engines (X!Tandem, Comet and Mascot) and stringent evaluation of FDR (MAYU and PeptideShaker), we identified 7702 proteins (66% membrane proteins) and 178 missing proteins (74 membrane proteins) with PSM-, peptide-, and protein-level FDR of 1%. Through multiple reaction monitoring (MRM) using synthetic peptides, we provided additional evidence for 8 missing proteins including 7 with transmembrane helix domains (TMH). This study demonstrates that mining missing proteins focused on cancer membrane sub-proteome can greatly contribute to mapping the whole human proteome. All data were deposited into ProteomeXchange with the identifier PXD002224. PMID:26202522
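The 1% FDR threshold mentioned above can be pictured with a minimal target-decoy sketch. This is the simple decoys-over-targets estimator applied to a ranked list of peptide-spectrum matches; it is an assumption chosen for illustration and is not the procedure implemented by MAYU or PeptideShaker. The function name and the toy scores are hypothetical.

```python
def score_cutoff_at_fdr(psms, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR stays within max_fdr.

    psms: iterable of (score, is_decoy) pairs, higher score = better match.
    FDR is estimated as (#decoys) / (#targets) among matches at or above
    the cutoff. Tied scores are not treated specially in this sketch.
    Returns None if no cutoff satisfies the limit.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = score  # the list down to this score is still acceptable
    return cutoff


# Toy data: 200 target matches scoring 101..300 and three decoy matches.
toy_psms = [(s, False) for s in range(101, 301)]
toy_psms += [(150.5, True), (120.5, True), (110.5, True)]
cutoff = score_cutoff_at_fdr(toy_psms)
print(cutoff)
```

In this toy list the second decoy pushes the estimate above 1% and it never recovers, so the accepted list ends just before that decoy.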
Alvarez, Sophie; Roy Choudhury, Swarup; Pandey, Sona
2014-03-07
Wheat is one of the most highly cultivated cereals in the world. Like other cultivated crops, wheat production is significantly affected by abiotic stresses such as drought. Multiple wheat varieties suitable for different geographical regions of the world have been developed that are adapted to different environmental conditions; however, the molecular basis of such adaptations remains unknown in most cases. We have compared the quantitative proteomics profile of the roots of two different wheat varieties, Nesser (drought-tolerant) and Opata (drought-sensitive), in the absence and presence of abscisic acid (ABA, as a proxy for drought). A labeling LC-based quantitative proteomics approach using iTRAQ was applied to elucidate the changes in protein abundance levels. Quantitative differences in protein levels were analyzed for the evaluation of inherent differences between the two varieties as well as the overall and variety-specific effect of ABA on the root proteome. This study reveals the most elaborate ABA-responsive root proteome identified to date in wheat. A large number of proteins exhibited inherently different expression levels between Nesser and Opata. Additionally, significantly higher numbers of proteins were ABA-responsive in Nesser roots compared with Opata roots. Furthermore, several proteins showed variety-specific regulation by ABA, suggesting their role in drought adaptation.
Plasticity in the proteome of Emiliania huxleyi CCMP 1516 to extremes of light is highly targeted.
McKew, Boyd A; Lefebvre, Stephane C; Achterberg, Eric P; Metodieva, Gergana; Raines, Christine A; Metodiev, Metodi V; Geider, Richard J
2013-10-01
Optimality principles are often applied in theoretical studies of microalgal ecophysiology to predict changes in allocation of resources to different metabolic pathways, and optimal acclimation is likely to involve changes in the proteome, which typically accounts for > 50% of cellular nitrogen (N). We tested the hypothesis that acclimation of the microalga Emiliania huxleyi CCMP 1516 to suboptimal vs supraoptimal light involves large changes in the proteome as cells rebalance the capacities to absorb light, fix CO2, perform biosynthesis and resist photooxidative stress. Emiliania huxleyi was grown in nutrient-replete continuous culture at 30 (LL) and 1000 μmol photons m⁻² s⁻¹ (HL), and changes in the proteome were assessed by LC-MS/MS shotgun proteomics. Changes were most evident in proteins involved in the light reactions of photosynthesis; the relative abundance of photosystem I (PSI) and PSII proteins was 70% greater in LL, light-harvesting fucoxanthin-chlorophyll proteins (Lhcfs) were up to 500% greater in LL and photoprotective LI818 proteins were 300% greater in HL. The marked changes in the abundances of Lhcfs and LI818s, together with the limited plasticity in the bulk of the E. huxleyi proteome, probably reflect evolutionary pressures to provide energy to maintain metabolic capabilities in stochastic light environments encountered by this species in nature. © 2013 The Authors. New Phytologist © 2013 New Phytologist Trust.
Global Proteomics Analysis of the Response to Starvation in C. elegans
Larance, Mark; Pourkarimi, Ehsan; Wang, Bin; Brenes Murillo, Alejandro; Kent, Robert; Lamond, Angus I.; Gartner, Anton
2015-01-01
Periodic starvation of animals induces large shifts in metabolism but may also influence many other cellular systems and can lead to adaptation to prolonged starvation conditions. To date, there is limited understanding of how starvation affects gene expression, particularly at the protein level. Here, we have used mass-spectrometry-based quantitative proteomics to identify global changes in the Caenorhabditis elegans proteome due to acute starvation of young adult animals. Measuring changes in the abundance of over 5,000 proteins, we show that acute starvation rapidly alters the levels of hundreds of proteins, many involved in central metabolic pathways, highlighting key regulatory responses. Surprisingly, we also detect changes in the abundance of chromatin-associated proteins, including specific linker histones, histone variants, and histone posttranslational modifications associated with the epigenetic control of gene expression. To maximize community access to these data, they are presented in an online searchable database, the Encyclopedia of Proteome Dynamics (http://www.peptracker.com/epd/). PMID:25963834
Weston, Andrea D; Hood, Leroy
2004-01-01
The emergence of systems biology is bringing forth a new set of challenges for advancing science and technology. Defining ways of studying biological systems on a global level, integrating large and disparate data types, and dealing with the infrastructural changes necessary to carry out systems biology, are just a few of the extraordinary tasks of this growing discipline. Despite these challenges, the impact of systems biology will be far-reaching, and significant progress has already been made. Moving forward, the issue of how to use systems biology to improve the health of individuals must be a priority. It is becoming increasingly apparent that the field of systems biology and one of its important disciplines, proteomics, will have a major role in creating a predictive, preventative, and personalized approach to medicine. In this review, we define systems biology, discuss the current capabilities of proteomics and highlight some of the necessary milestones for moving systems biology and proteomics into mainstream health care.
A Systematic Analysis of a Deep Mouse Epididymal Sperm Proteome
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chauvin, Theodore; Xie, Fang; Liu, Tao
Spermatozoa are highly specialized cells that, when mature, are capable of navigating the female reproductive tract and fertilizing an oocyte. The sperm cell is thought to be largely quiescent in terms of transcriptional and translational activity. As a result, once it has left the male reproductive tract, the sperm cell is essentially operating with a static population of proteins. It is therefore theoretically possible to understand the protein networks contained in a sperm cell and to deduce its cellular function capabilities. To this end we have performed a proteomic analysis of mouse sperm isolated from the cauda epididymis and have confidently identified 2,850 proteins, which is the most comprehensive sperm proteome for any species reported to date. These proteins comprise many complete cellular pathways, including those for energy production via glycolysis, β-oxidation and oxidative phosphorylation, protein folding and transport, and cell signaling systems. This proteome should prove a useful tool for assembly and testing of protein networks important for sperm function.
Progress and challenges for abiotic stress proteomics of crop plants.
Barkla, Bronwyn J; Vera-Estrella, Rosario; Pantoja, Omar
2013-06-01
Plants are continually challenged to recognize and respond to adverse changes in their environment to avoid detrimental effects on growth and development. Understanding the mechanisms that crop plants employ to resist and tolerate abiotic stress is of considerable interest for designing agriculture breeding strategies to ensure sustainable productivity. The application of proteomics technologies to advance our knowledge in crop plant abiotic stress tolerance has increased dramatically in the past few years as evidenced by the large amount of publications in this area. This is attributed to advances in various technology platforms associated with MS-based techniques as well as the accessibility of proteomics units to a wider plant research community. This review summarizes the work which has been reported for major crop plants and evaluates the findings in context of the approaches that are widely employed with the aim to encourage broadening the strategies used to increase coverage of the proteome. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Interaction Analysis through Proteomic Phage Display
2014-01-01
Phage display is a powerful technique for profiling specificities of peptide binding domains. The method is suited for the identification of high-affinity ligands with inhibitor potential when using highly diverse combinatorial peptide phage libraries. Such experiments further provide consensus motifs for genome-wide scanning of ligands of potential biological relevance. A complementary but considerably less explored approach is to display expression products of genomic DNA, cDNA, open reading frames (ORFs), or oligonucleotide libraries designed to encode defined regions of a target proteome on phage particles. One of the main applications of such proteomic libraries has been the elucidation of antibody epitopes. This review is focused on the use of proteomic phage display to uncover protein-protein interactions of potential relevance for cellular function. The method is particularly suited for the discovery of interactions between peptide binding domains and their targets. We discuss the largely unexplored potential of this method in the discovery of domain-motif interactions of potential biological relevance. PMID:25295249
Monitoring Peptidase Activities in Complex Proteomes by MALDI-TOF Mass Spectrometry
Villanueva, Josep; Nazarian, Arpi; Lawlor, Kevin; Tempst, Paul
2009-01-01
Measuring enzymatic activities in biological fluids is a form of activity-based proteomics and may be utilized as a means of developing disease biomarkers. Activity-based assays allow amplification of output signals, thus potentially visualizing low-abundant enzymes on a virtually transparent whole-proteome background. The protocol presented here describes a semi-quantitative in vitro assay of proteolytic activities in complex proteomes by monitoring breakdown of designer peptide-substrates using robotic extraction and a MALDI-TOF mass spectrometric read-out. Relative quantitation of the peptide metabolites is done by comparison with spiked internal standards, followed by statistical analysis of the resulting mini-peptidome. Partial automation provides reproducibility and throughput essential for comparing large sample sets. The approach may be employed for diagnostic or predictive purposes and enables profiling of 96 samples in 30 hours. It could be tailored to many diagnostic and pharmaco-dynamic purposes, as a read-out of catalytic and metabolic activities in body fluids or tissues. PMID:19617888
MannDB: A microbial annotation database for protein characterization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhou, C; Lam, M; Smith, J
2006-05-19
MannDB was created to meet a need for rapid, comprehensive automated protein sequence analyses to support selection of proteins suitable as targets for driving the development of reagents for pathogen or protein toxin detection. Because a large number of open-source tools were needed, it was necessary to produce a software system to scale the computations for whole-proteome analysis. Thus, we built a fully automated system for executing software tools and for storage, integration, and display of automated protein sequence analysis and annotation data. MannDB is a relational database that organizes data resulting from fully automated, high-throughput protein-sequence analyses using open-source tools. Types of analyses provided include predictions of cleavage, chemical properties, classification, features, functional assignment, post-translational modifications, motifs, antigenicity, and secondary structure. Proteomes (lists of hypothetical and known proteins) are downloaded and parsed from Genbank and then inserted into MannDB, and annotations from SwissProt are downloaded when identifiers are found in the Genbank entry or when identical sequences are identified. Currently 36 open-source tools are run against MannDB protein sequences either on local systems or by means of batch submission to external servers. In addition, BLAST against protein entries in MvirDB, our database of microbial virulence factors, is performed. A web client browser enables viewing of computational results and downloaded annotations, and a query tool enables structured and free-text search capabilities. When available, links to external databases, including MvirDB, are provided. MannDB contains whole-proteome analyses for at least one representative organism from each category of biological threat organism listed by APHIS, CDC, HHS, NIAID, USDA, USFDA, and WHO.
MannDB comprises a large number of genomes and comprehensive protein sequence analyses representing organisms listed as high-priority agents on the websites of several governmental organizations concerned with bio-terrorism. MannDB provides the user with a BLAST interface for comparison of native and non-native sequences and a query tool for conveniently selecting proteins of interest. In addition, the user has access to a web-based browser that compiles comprehensive and extensive reports.
Genomics pipelines and data integration: challenges and opportunities in the research setting
Davis-Turak, Jeremy; Courtney, Sean M.; Hazard, E. Starr; Glen, W. Bailey; da Silveira, Willian; Wesselman, Timothy; Harbin, Larry P.; Wolf, Bethany J.; Chung, Dongjun; Hardiman, Gary
2017-01-01
Introduction: The emergence and mass utilization of high-throughput (HT) technologies, including sequencing technologies (genomics) and mass spectrometry (proteomics, metabolomics, lipids), has allowed geneticists, biologists, and biostatisticians to bridge the gap between genotype and phenotype on a massive scale. These new technologies have brought rapid advances in our understanding of cell biology, evolutionary history, microbial environments, and are increasingly providing new insights and applications towards clinical care and personalized medicine. Areas covered: The very success of this industry also translates into daunting big data challenges for researchers and institutions that extend beyond the traditional academic focus of algorithms and tools. The main obstacles revolve around analysis provenance, data management of massive datasets, ease of use of software, interpretability and reproducibility of results. Expert commentary: The authors review the challenges associated with implementing bioinformatics best practices in a large-scale setting, and highlight the opportunity for establishing bioinformatics pipelines that incorporate data tracking and auditing, enabling greater consistency and reproducibility for basic research, translational or clinical settings. PMID:28092471
Genomics pipelines and data integration: challenges and opportunities in the research setting.
Davis-Turak, Jeremy; Courtney, Sean M; Hazard, E Starr; Glen, W Bailey; da Silveira, Willian A; Wesselman, Timothy; Harbin, Larry P; Wolf, Bethany J; Chung, Dongjun; Hardiman, Gary
2017-03-01
The emergence and mass utilization of high-throughput (HT) technologies, including sequencing technologies (genomics) and mass spectrometry (proteomics, metabolomics, lipids), has allowed geneticists, biologists, and biostatisticians to bridge the gap between genotype and phenotype on a massive scale. These new technologies have brought rapid advances in our understanding of cell biology, evolutionary history, microbial environments, and are increasingly providing new insights and applications towards clinical care and personalized medicine. Areas covered: The very success of this industry also translates into daunting big data challenges for researchers and institutions that extend beyond the traditional academic focus of algorithms and tools. The main obstacles revolve around analysis provenance, data management of massive datasets, ease of use of software, interpretability and reproducibility of results. Expert commentary: The authors review the challenges associated with implementing bioinformatics best practices in a large-scale setting, and highlight the opportunity for establishing bioinformatics pipelines that incorporate data tracking and auditing, enabling greater consistency and reproducibility for basic research, translational or clinical settings.
The Human Skeletal Muscle Proteome Project: a reappraisal of the current literature
Gonzalez‐Freire, Marta; Semba, Richard D.; Ubaida‐Mohien, Ceereena; Fabbri, Elisa; Scalzo, Paul; Højlund, Kurt; Dufresne, Craig; Lyashkov, Alexey
2016-01-01
Skeletal muscle is a large organ that accounts for up to half the total mass of the human body. A progressive decline in muscle mass and strength occurs with ageing and in some individuals configures the syndrome of ‘sarcopenia’, a condition that impairs mobility, challenges autonomy, and is a risk factor for mortality. The mechanisms leading to sarcopenia as well as myopathies are still little understood. The Human Skeletal Muscle Proteome Project was initiated with the aim of characterizing muscle proteins and how they change with ageing and disease. We conducted an extensive review of the literature and analysed publicly available protein databases. A systematic search of peer‐reviewed studies was performed using PubMed. Search terms included ‘human’, ‘skeletal muscle’, ‘proteome’, ‘proteomic(s)’, and ‘mass spectrometry’, ‘liquid chromatography‐mass spectrometry (LC‐MS/MS)’. A catalogue of 5431 non‐redundant muscle proteins identified by mass spectrometry‐based proteomics from 38 peer‐reviewed scientific publications from 2002 to November 2015 was created. We also developed a nosology system for the classification of muscle proteins based on localization and function. Such an inventory of proteins should serve as a useful background reference for future research on changes in the muscle proteome, assessed by quantitative mass spectrometry‐based proteomic approaches, that occur with ageing and diseases. This classification and compilation of the human skeletal muscle proteome can be used for the identification and quantification of proteins in skeletal muscle to discover new mechanisms for sarcopenia and specific muscle diseases that can be targeted for prevention and treatment. PMID:27897395
Andromeda: a peptide search engine integrated into the MaxQuant environment.
Cox, Jürgen; Neuhauser, Nadin; Michalski, Annette; Scheltema, Richard A; Olsen, Jesper V; Mann, Matthias
2011-04-01
A key step in mass spectrometry (MS)-based proteomics is the identification of peptides in sequence databases by their fragmentation spectra. Here we describe Andromeda, a novel peptide search engine using a probabilistic scoring model. On proteome data, Andromeda performs as well as Mascot, a widely used commercial search engine, as judged by sensitivity and specificity analysis based on target decoy searches. Furthermore, it can handle data with arbitrarily high fragment mass accuracy, is able to assign and score complex patterns of post-translational modifications, such as highly phosphorylated peptides, and accommodates extremely large databases. The algorithms of Andromeda are provided. Andromeda can function independently or as an integrated search engine of the widely used MaxQuant computational proteomics platform and both are freely available at www.maxquant.org. The combination enables analysis of large data sets in a simple analysis workflow on a desktop computer. For searching individual spectra Andromeda is also accessible via a web server. We demonstrate the flexibility of the system by implementing the capability to identify cofragmented peptides, significantly improving the total number of identified peptides.
Zhang, Ying; Wang, Xi; Cui, Dan; Zhu, Jun
2016-12-01
Human whole saliva is a vital body fluid for studying the physiology and pathology of the oral cavity. As a powerful technique for biomarker discovery, MS-based proteomic strategies have been introduced for saliva analysis and identified hundreds of proteins and N-glycosylation sites. However, there is still a lack of quantitative analysis, which is necessary for biomarker screening and biological research. In this study, we establish an integrated workflow by the combination of stable isotope dimethyl labeling, HILIC enrichment, and high resolution MS for both quantification of the global proteome and N-glycoproteome of human saliva from oral ulcer patients. With the help of advanced bioinformatics, we comprehensively studied oral ulcers at both protein and glycoprotein scales. Bioinformatics analyses revealed that starch digestion and protein degradation activities are inhibited while the immune response is promoted in oral ulcer saliva. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ansong, Charles; Wu, Si; Meng, Da
Characterization of the mature protein complement in cells is crucial for a better understanding of cellular processes on a systems-wide scale. Bottom-up proteomic approaches often lead to loss of critical information about an endogenous protein’s actual state due to post translational modifications (PTMs) and other processes. Top-down approaches that involve analysis of the intact protein can address this concern but present significant analytical challenges related to the separation quality needed, measurement sensitivity, and speed that result in low throughput and limited coverage. Here we used single-dimension ultra high pressure liquid chromatography mass spectrometry to investigate the comprehensive ‘intact’ proteome of the Gram negative bacterial pathogen Salmonella Typhimurium. Top-down proteomics analysis revealed 563 unique proteins including 1665 proteoforms generated by PTMs, representing the largest microbial top-down dataset reported to date. Our analysis not only confirmed several previously recognized aspects of Salmonella biology and bacterial PTMs in general, but also revealed several novel biological insights. Of particular interest was differential utilization of the protein S-thiolation forms S-glutathionylation and S-cysteinylation in response to infection-like conditions versus basal conditions, which was corroborated by changes in corresponding biosynthetic pathways. This differential utilization highlights underlying metabolic mechanisms that modulate changes in cellular signaling, and represents to our knowledge the first report of S-cysteinylation in Gram negative bacteria. The demonstrated utility of our simple proteome-wide intact protein level measurement strategy for gaining biological insight should promote broader adoption and applications of top-down proteomics approaches.
Proteomics profiling of interactome dynamics by colocalisation analysis (COLA).
Mardakheh, Faraz K; Sailem, Heba Z; Kümper, Sandra; Tape, Christopher J; McCully, Ryan R; Paul, Angela; Anjomani-Virmouni, Sara; Jørgensen, Claus; Poulogiannis, George; Marshall, Christopher J; Bakal, Chris
2016-12-20
Localisation and protein function are intimately linked in eukaryotes, as proteins are localised to specific compartments where they come into proximity of other functionally relevant proteins. Significant co-localisation of two proteins can therefore be indicative of their functional association. We here present COLA, a proteomics based strategy coupled with a bioinformatics framework to detect protein-protein co-localisations on a global scale. COLA reveals functional interactions by matching proteins with significant similarity in their subcellular localisation signatures. The rapid nature of COLA allows mapping of interactome dynamics across different conditions or treatments with high precision.
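The idea of matching proteins by the similarity of their subcellular localisation signatures can be sketched as follows. The abstract does not say which similarity measure or significance test COLA uses, so Pearson correlation with a fixed threshold is an assumption made purely for illustration, and the protein names and fraction profiles are invented toy values.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-constant inputs assumed)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def colocalised_pairs(profiles, min_r=0.9):
    """List protein pairs whose localisation profiles correlate at or above min_r."""
    return [
        (p, q)
        for p, q in combinations(sorted(profiles), 2)
        if pearson(profiles[p], profiles[q]) >= min_r
    ]


# Toy signatures: relative abundance of each protein across four fractions.
profiles = {
    "protA": [0.9, 0.1, 0.0, 0.0],
    "protB": [0.8, 0.2, 0.0, 0.0],
    "protC": [0.0, 0.1, 0.2, 0.7],
}
pairs = colocalised_pairs(profiles)
print(pairs)
```

Here protA and protB peak in the same fraction and are paired, while protC, which peaks elsewhere, is not.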
Large-Scale SRM Screen of Urothelial Bladder Cancer Candidate Biomarkers in Urine.
Duriez, Elodie; Masselon, Christophe D; Mesmin, Cédric; Court, Magali; Demeure, Kevin; Allory, Yves; Malats, Núria; Matondo, Mariette; Radvanyi, François; Garin, Jérôme; Domon, Bruno
2017-04-07
Urothelial bladder cancer is a condition associated with high recurrence and substantial morbidity and mortality. Noninvasive urinary tests that would detect bladder cancer and tumor recurrence are required to significantly improve patient care. Over the past decade, numerous bladder cancer candidate biomarkers have been identified in the context of extensive proteomics or transcriptomics studies. To translate these findings in clinically useful biomarkers, the systematic evaluation of these candidates remains the bottleneck. Such evaluation involves large-scale quantitative LC-SRM (liquid chromatography-selected reaction monitoring) measurements, targeting hundreds of signature peptides by monitoring thousands of transitions in a single analysis. The design of highly multiplexed SRM analyses is driven by several factors: throughput, robustness, selectivity and sensitivity. Because of the complexity of the samples to be analyzed, some measurements (transitions) can be interfered by coeluting isobaric species resulting in biased or inconsistent estimated peptide/protein levels. Thus the assessment of the quality of SRM data is critical to allow flagging these inconsistent data. We describe an efficient and robust method to process large SRM data sets, including the processing of the raw data, the detection of low-quality measurements, the normalization of the signals for each protein, and the estimation of protein levels. Using this methodology, a variety of proteins previously associated with bladder cancer have been assessed through the analysis of urine samples from a large cohort of cancer patients and corresponding controls in an effort to establish a priority list of most promising candidates to guide subsequent clinical validation studies.
Shen, Xiaomeng; Hu, Qiang; Li, Jun; Wang, Jianmin; Qu, Jun
2015-10-02
Comprehensive and accurate evaluation of data quality and false-positive biomarker discovery is critical to direct the method development/optimization for quantitative proteomics, which nonetheless remains challenging largely due to the high complexity and unique features of proteomic data. Here we describe an experimental null (EN) method to address this need. Because the method experimentally measures the null distribution (either technical or biological replicates) using the same proteomic samples, the same procedures and the same batch as the case-vs-control experiment, it correctly reflects the collective effects of technical variability (e.g., variation/bias in sample preparation, LC-MS analysis, and data processing) and project-specific features (e.g., characteristics of the proteome and biological variation) on the performances of quantitative analysis. To show a proof of concept, we employed the EN method to assess the quantitative accuracy and precision and the ability to quantify subtle ratio changes between groups using different experimental and data-processing approaches and in various cellular and tissue proteomes. It was found that choices of quantitative features, sample size, experimental design, data-processing strategies, and quality of chromatographic separation can profoundly affect quantitative precision and accuracy of label-free quantification. The EN method was also demonstrated as a practical tool to determine the optimal experimental parameters and rational ratio cutoff for reliable protein quantification in specific proteomic experiments, for example, to identify the necessary number of technical/biological replicates per group that affords sufficient power for discovery.
Furthermore, we assessed the ability of EN method to estimate levels of false-positives in the discovery of altered proteins, using two concocted sample sets mimicking proteomic profiling using technical and biological replicates, respectively, where the true-positives/negatives are known and span a wide concentration range. It was observed that the EN method correctly reflects the null distribution in a proteomic system and accurately measures false altered proteins discovery rate (FADR). In summary, the EN method provides a straightforward, practical, and accurate alternative to statistics-based approaches for the development and evaluation of proteomic experiments and can be universally adapted to various types of quantitative techniques.
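As a rough illustration of the experimental-null idea summarized above, the sketch below estimates a false altered-protein discovery rate at a chosen ratio cutoff: the share of proteins passing the cutoff in a null (replicate-vs-replicate) comparison is scaled to the number of proteins tested and divided by the number passing in the case-vs-control comparison. The function names, the symmetric fold-change cutoff, and the toy ratios are assumptions made for illustration only; they are not the authors' implementation.

```python
import math


def count_altered(ratios, cutoff):
    """Count proteins whose ratio meets a symmetric fold-change cutoff."""
    log_cut = math.log2(cutoff)
    return sum(1 for r in ratios if abs(math.log2(r)) >= log_cut)


def estimate_fadr(null_ratios, case_ratios, cutoff):
    """Hypothetical FADR estimate (an assumption, not the published formula).

    Expected false discoveries = null pass rate * number of proteins tested;
    FADR = expected false discoveries / observed discoveries, capped at 1.
    """
    discoveries = count_altered(case_ratios, cutoff)
    if discoveries == 0:
        return 0.0
    null_rate = count_altered(null_ratios, cutoff) / len(null_ratios)
    expected_false = null_rate * len(case_ratios)
    return min(1.0, expected_false / discoveries)


# Toy protein ratios (invented values, not data from the study).
null_ratios = [1.0, 1.1, 0.9, 1.6, 1.05, 0.95, 1.2, 0.85, 1.0, 0.6]
case_ratios = [2.0, 0.4, 1.1, 1.7, 0.9, 3.0, 1.0, 0.5, 1.6, 1.05]

fadr = estimate_fadr(null_ratios, case_ratios, cutoff=1.5)
print(f"Estimated FADR at 1.5-fold cutoff: {fadr:.2f}")
```

Sweeping the cutoff over a range of values and picking the smallest one whose estimated rate falls below a target would be one way to arrive at the kind of "rational ratio cutoff" the abstract mentions.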
Less is More: Membrane Protein Digestion Beyond Urea–Trypsin Solution for Next-level Proteomics*
Zhang, Xi
2015-01-01
The goal of next-level bottom-up membrane proteomics is protein function investigation, via high-coverage high-throughput peptide-centric quantitation of expression, modifications and dynamic structures at systems scale. Yet efficient digestion of mammalian membrane proteins presents a daunting barrier, and prevalent day-long urea–trypsin in-solution digestion proved insufficient to reach this goal. Many efforts contributed incremental advances over past years, but involved protein denaturation that disconnected measurement from functional states. Beyond denaturation, the recent discovery of structure/proteomics omni-compatible detergent n-dodecyl-β-d-maltopyranoside, combined with pepsin and PNGase F columns, enabled breakthroughs in membrane protein digestion: a 2010 DDM-low-TCEP (DLT) method for H/D-exchange (HDX) using human G protein-coupled receptor, and a 2015 flow/detergent-facilitated protease and de-PTM digestions (FDD) for integrative deep sequencing and quantitation using full-length human ion channel complex. Distinguishing protein solubilization from denaturation, protease digestion reliability from theoretical specificity, and reduction from alkylation, these methods shifted day(s)-long paradigms into minutes, and afforded fully automatable (HDX)-protein-peptide-(tandem mass tag)-HPLC pipelines to instantly measure functional proteins at deep coverage, high peptide reproducibility, low artifacts and minimal leakage. Promoting—not destroying—structures and activities harnessed membrane proteins for the next-level streamlined functional proteomics. This review analyzes recent advances in membrane protein digestion methods and highlights critical discoveries for future proteomics. PMID:26081834
Lee, Jinoo; Valkova, Nelly; White, Mark P; Kültz, Dietmar
2006-09-01
We used dogfish shark (Squalus acanthias) as a model for proteome analysis of six different tissues to evaluate tissue-specific protein expression on a global scale and to deduce specific functions and the relatedness of multiple tissues from their proteomes. Proteomes of heart, brain, kidney, intestine, gill, and rectal gland were separated by two-dimensional gel electrophoresis (2DGE), gel images were matched using Delta 2D software and then evaluated for tissue-specific proteins. Sixty-one proteins (4%) were found to be in only a single type of tissue and 535 proteins (36%) were equally abundant in all six tissues. Relatedness between tissues was assessed based on tissue-specific expression patterns of all 1465 consistently resolved protein spots. This analysis revealed that tissues with osmoregulatory function (kidney, intestine, gill, rectal gland) were more similar in their overall proteomes than non-osmoregulatory tissues (heart, brain). Sixty-one proteins were identified by MALDI-TOF/TOF mass spectrometry and biological functions characteristic of osmoregulatory tissues were derived from gene ontology and molecular pathway analysis. Our data demonstrate that the molecular machinery for energy and urea metabolism and the Rho-GTPase/cytoskeleton pathway are enriched in osmoregulatory tissues of sharks. Our work provides a strong rationale for further study of the contribution of these mechanisms to the osmoregulation of marine sharks.
Kume, Hideaki; Muraoka, Satoshi; Kuga, Takahisa; Adachi, Jun; Narumi, Ryohei; Watanabe, Shio; Kuwano, Masayoshi; Kodera, Yoshio; Matsushita, Kazuyuki; Fukuoka, Junya; Masuda, Takeshi; Ishihama, Yasushi; Matsubara, Hisahiro; Nomura, Fumio; Tomonaga, Takeshi
2014-01-01
Recent advances in quantitative proteomic technology have enabled the large-scale validation of biomarkers. We here performed a quantitative proteomic analysis of membrane fractions from colorectal cancer tissue to discover biomarker candidates, and then extensively validated the candidate proteins identified. A total of 5566 proteins were identified in six tissue samples, each of which was obtained from polyps and cancer with and without metastasis. GO cellular component analysis predicted that 3087 of these proteins were membrane proteins, whereas TMHMM algorithm predicted that 1567 proteins had a transmembrane domain. Differences were observed in the expression of 159 membrane proteins and 55 extracellular proteins between polyps and cancer without metastasis, while the expression of 32 membrane proteins and 17 extracellular proteins differed between cancer with and without metastasis. A total of 105 of these biomarker candidates were quantitated using selected (or multiple) reaction monitoring (SRM/MRM) with stable synthetic isotope-labeled peptides as an internal control. The results obtained revealed differences in the expression of 69 of these proteins, and this was subsequently verified in an independent set of patient samples (polyps (n = 10), cancer without metastasis (n = 10), cancer with metastasis (n = 10)). Significant differences were observed in the expression of 44 of these proteins, including ITGA5, GPRC5A, PDGFRB, and TFRC, which have already been shown to be overexpressed in colorectal cancer, as well as proteins with unknown function, such as C8orf55. The expression of C8orf55 was also shown to be high not only in colorectal cancer, but also in several cancer tissues using a multicancer tissue microarray, which included 1150 cores from 14 cancer tissues. 
This is the largest verification study of biomarker candidate membrane proteins to date; our methods for biomarker discovery and subsequent validation using SRM/MRM will contribute to the identification of useful biomarker candidates for various cancers. Data are available via ProteomeXchange with identifier PXD000851. PMID:24687888
Xiong, Weili; Brown, Christopher T.; Morowitz, Michael J.; ...
2017-07-10
Establishment of the human gut microbiota begins at birth. This early-life microbiota development can impact host physiology during infancy and even across an entire life span. But, the functional stability and population structure of the gut microbiota during initial colonization remain poorly understood. Metaproteomics is an emerging technology for the large-scale characterization of metabolic functions in complex microbial communities (gut microbiota). We applied a metagenome-informed metaproteomic approach to study the temporal and inter-individual differences of metabolic functions during microbial colonization of preterm human infants’ gut. By analyzing 30 individual fecal samples, we identified up to 12,568 protein groups for each of four infants, including both human and microbial proteins. With genome-resolved matched metagenomics, proteins were confidently identified at the species/strain level. The maximum percentage of the proteome detected for the abundant organisms was ~45%. A time-dependent increase in the relative abundance of microbial versus human proteins suggested increasing microbial colonization during the first few weeks of early life. We observed remarkable variations and temporal shifts in the relative protein abundances of each organism in these preterm gut communities. Given the dissimilarity of the communities, only 81 microbial EggNOG orthologous groups and 57 human proteins were observed across all samples. These conserved microbial proteins were involved in carbohydrate, energy, amino acid and nucleotide metabolism while conserved human proteins were related to immune response and mucosal maturation. We also identified seven proteome clusters for the communities and showed infant gut proteome profiles were unstable across time and not individual-specific.
By applying a gut-specific metabolic module (GMM) analysis, we found that gut communities varied primarily in the contribution of nutrient (carbohydrates, lipids, and amino acids) utilization and short-chain fatty acid production. Overall, this study reports species-specific proteome profiles and metabolic functions of human gut microbiota during early colonization. In particular, our work contributes to reveal microbiota-associated shifts and variations in the metabolism of three major nutrient sources and short-chain fatty acid during colonization of preterm infant gut.
Kume, Hideaki; Muraoka, Satoshi; Kuga, Takahisa; Adachi, Jun; Narumi, Ryohei; Watanabe, Shio; Kuwano, Masayoshi; Kodera, Yoshio; Matsushita, Kazuyuki; Fukuoka, Junya; Masuda, Takeshi; Ishihama, Yasushi; Matsubara, Hisahiro; Nomura, Fumio; Tomonaga, Takeshi
2014-06-01
Recent advances in quantitative proteomic technology have enabled the large-scale validation of biomarkers. We here performed a quantitative proteomic analysis of membrane fractions from colorectal cancer tissue to discover biomarker candidates, and then extensively validated the candidate proteins identified. A total of 5566 proteins were identified in six tissue samples, each of which was obtained from polyps and cancer with and without metastasis. GO cellular component analysis predicted that 3087 of these proteins were membrane proteins, whereas TMHMM algorithm predicted that 1567 proteins had a transmembrane domain. Differences were observed in the expression of 159 membrane proteins and 55 extracellular proteins between polyps and cancer without metastasis, while the expression of 32 membrane proteins and 17 extracellular proteins differed between cancer with and without metastasis. A total of 105 of these biomarker candidates were quantitated using selected (or multiple) reaction monitoring (SRM/MRM) with stable synthetic isotope-labeled peptides as an internal control. The results obtained revealed differences in the expression of 69 of these proteins, and this was subsequently verified in an independent set of patient samples (polyps (n = 10), cancer without metastasis (n = 10), cancer with metastasis (n = 10)). Significant differences were observed in the expression of 44 of these proteins, including ITGA5, GPRC5A, PDGFRB, and TFRC, which have already been shown to be overexpressed in colorectal cancer, as well as proteins with unknown function, such as C8orf55. The expression of C8orf55 was also shown to be high not only in colorectal cancer, but also in several cancer tissues using a multicancer tissue microarray, which included 1150 cores from 14 cancer tissues. 
This is the largest verification study of biomarker candidate membrane proteins to date; our methods for biomarker discovery and subsequent validation using SRM/MRM will contribute to the identification of useful biomarker candidates for various cancers. Data are available via ProteomeXchange with identifier PXD000851. © 2014 by The American Society for Biochemistry and Molecular Biology, Inc.
Yoneyama, Toshihiro; Ohtsuki, Sumio; Honda, Kazufumi; Kobayashi, Makoto; Iwasaki, Motoki; Uchida, Yasuo; Okusaka, Takuji; Nakamori, Shoji; Shimahara, Masashi; Ueno, Takaaki; Tsuchida, Akihiko; Sata, Naohiro; Ioka, Tatsuya; Yasunami, Yohichi; Kosuge, Tomoo; Kaneda, Takashi; Kato, Takao; Yagihara, Kazuhiro; Fujita, Shigeyuki; Huang, Wilber; Yamada, Tesshi; Tachikawa, Masanori; Terasaki, Tetsuya
2016-01-01
Pancreatic cancer is one of the most lethal tumors, and reliable detection of early-stage pancreatic cancer and risk diseases for pancreatic cancer is essential to improve the prognosis. As 260 genes were previously reported to be upregulated in invasive ductal adenocarcinoma of pancreas (IDACP) cells, quantification of the corresponding proteins in plasma might be useful for IDACP diagnosis. Therefore, the purpose of the present study was to identify plasma biomarkers for early detection of IDACP by using two proteomics strategies: antibody-based proteomics and liquid chromatography-tandem mass spectrometry (LC-MS/MS)-based proteomics. Among the 260 genes, we focused on 130 encoded proteins with known function for which antibodies were available. Twenty-three proteins showed values of the area under the curve (AUC) of more than 0.8 in receiver operating characteristic (ROC) analysis of reverse-phase protein array (RPPA) data of IDACP patients compared with healthy controls, and these proteins were selected as biomarker candidates. We then used our high-throughput selected reaction monitoring or multiple reaction monitoring (SRM/MRM) methodology, together with an automated sample preparation system, micro LC and auto analysis system, to quantify these candidate proteins in plasma from healthy controls and IDACP patients on a large scale. The results revealed that insulin-like growth factor-binding protein (IGFBP)2 and IGFBP3 have the ability to discriminate IDACP patients at an early stage from healthy controls, and IGFBP2 appeared to be increased in risk diseases of pancreatic malignancy, such as intraductal papillary mucinous neoplasms (IPMNs). Furthermore, diagnosis of IDACP using the combination of carbohydrate antigen 19–9 (CA19-9), IGFBP2 and IGFBP3 is significantly more effective than CA19-9 alone. This suggests that IGFBP2 and IGFBP3 may serve as compensatory biomarkers for CA19-9. 
Early diagnosis with this marker combination may improve the prognosis of IDACP patients. PMID:27579675
Xiong, Weili; Brown, Christopher T; Morowitz, Michael J; Banfield, Jillian F; Hettich, Robert L
2017-07-10
Establishment of the human gut microbiota begins at birth. This early-life microbiota development can impact host physiology during infancy and even across an entire life span. However, the functional stability and population structure of the gut microbiota during initial colonization remain poorly understood. Metaproteomics is an emerging technology for the large-scale characterization of metabolic functions in complex microbial communities (gut microbiota). We applied a metagenome-informed metaproteomic approach to study the temporal and inter-individual differences of metabolic functions during microbial colonization of preterm human infants' gut. By analyzing 30 individual fecal samples, we identified up to 12,568 protein groups for each of four infants, including both human and microbial proteins. With genome-resolved matched metagenomics, proteins were confidently identified at the species/strain level. The maximum percentage of the proteome detected for the abundant organisms was ~45%. A time-dependent increase in the relative abundance of microbial versus human proteins suggested increasing microbial colonization during the first few weeks of early life. We observed remarkable variations and temporal shifts in the relative protein abundances of each organism in these preterm gut communities. Given the dissimilarity of the communities, only 81 microbial EggNOG orthologous groups and 57 human proteins were observed across all samples. These conserved microbial proteins were involved in carbohydrate, energy, amino acid and nucleotide metabolism while conserved human proteins were related to immune response and mucosal maturation. We identified seven proteome clusters for the communities and showed infant gut proteome profiles were unstable across time and not individual-specific. 
Applying a gut-specific metabolic module (GMM) analysis, we found that gut communities varied primarily in the contribution of nutrient (carbohydrates, lipids, and amino acids) utilization and short-chain fatty acid production. Overall, this study reports species-specific proteome profiles and metabolic functions of human gut microbiota during early colonization. In particular, our work contributes to reveal microbiota-associated shifts and variations in the metabolism of three major nutrient sources and short-chain fatty acid during colonization of preterm infant gut.
2014-01-01
Background KIAA1199 is a recently identified novel gene that is up-regulated in human cancer with poor survival. Our proteomic study on signaling polarity in chemotactic cells revealed KIAA1199 as a novel protein target that may be involved in cellular chemotaxis and motility. In the present study, we examined the functional significance of KIAA1199 expression in breast cancer growth, motility and invasiveness. Methods We validated the previous microarray observation by tissue microarray immunohistochemistry using a TMA slide containing 12 breast tumor tissue cores and 12 corresponding normal tissues. We performed the shRNA-mediated knockdown of KIAA1199 in MDA-MB-231 and HS578T cells to study the role of this protein in cell proliferation, migration and apoptosis in vitro. We studied the effects of KIAA1199 knockdown in vivo in two groups of mice (n = 5). We carried out the SILAC LC-MS/MS based proteomic studies on the involvement of KIAA1199 in breast cancer. Results KIAA1199 mRNA and protein were significantly overexpressed in breast tumor specimens and cell lines as compared with non-neoplastic breast tissues from large-scale microarray and studies of breast cancer cell lines and tumors. To gain deeper insights into the novel role of KIAA1199 in breast cancer, we modulated KIAA1199 expression using shRNA-mediated knockdown in two breast cancer cell lines (MDA-MB-231 and HS578T), expressing higher levels of KIAA1199. The KIAA1199 knockdown cells showed reduced motility and cell proliferation in vitro. Moreover, when the knockdown cells were injected into the mammary fat pads of female athymic nude mice, there was a significant decrease in tumor incidence and growth. In addition, quantitative proteomic analysis revealed that knockdown of KIAA1199 in breast cancer (MDA-MB-231) cells affected a broad range of cellular functions including apoptosis, metabolism and cell motility.
Conclusions Our findings indicate that KIAA1199 may play an important role in breast tumor growth and invasiveness, and that it may represent a novel target for biomarker development and a novel therapeutic target for breast cancer. PMID:24628760
Serum Proteomic Profiles In Subjects with Heavy Alcohol Abuse
Liangpunsakul, Suthat; Lai, Xianyin; Ringham, Heather N.; Crabb, David W.; Witzmann, Frank A.
2009-01-01
Objectives The abuse of alcohol is a major public health problem, and the diagnosis and care of patients with alcohol abuse and dependence is hindered by the lack of tests that can detect dangerous levels of drinking or relapse during therapy. Gastroenterologists and other healthcare providers find it very challenging to obtain an accurate alcohol drinking history. We hypothesized that the effects of ethanol on numerous systems may well be reflected in changes in quantity or qualities of constituent or novel plasma proteins or protein fragments. Organ/tissue-specific proteins may be released into the blood stream when cells are injured by alcohol, or when systemic changes are induced by alcohol, and such proteins would be detected using a proteomic approach. The objective of this pilot study was to determine if there are plasma proteome profiles that correlate with heavy alcohol use. Methods Paired serum samples, before and after intensive alcohol treatment, were obtained from subjects who attended an outpatient alcohol treatment program. Serum proteomic profiles using MALDI-OTOF Mass Spectrometry were compared between pre- and post-treatment samples. Results Of 16 subjects who enrolled in the study, 8 were females. The mean age of the study subjects was 49 yrs. The baseline laboratory data showed elevated AST (54 ± 37 IU/L), ALT (37 ± 19 IU/L), and MCV (99 ± 5 fl). Self-reported pre-treatment drinking levels for these subjects averaged 17 ± 7 drinks/day and 103 ± 37 drinks/week. Mass spectrometry analyses showed a novel 5.9 kDa protein, a fragment of alpha fibrinogen, isoform 1, that might be a new marker for abusive alcohol drinking. Conclusions We have shown in this pilot study that several potential protein markers have appeared in mass spectral profiles and that they may be useful clinically to determine the status of alcohol drinking by MALDI-OTOF mass spectrometry, especially a fragment of alpha fibrinogen, isoform 1.
However, a large-scale study is needed to confirm and validate our current results. PMID:19672327
Schilling, Birgit; Gibson, Bradford W.; Hunter, Christie L.
2017-01-01
Data-independent acquisition is a powerful mass spectrometry technique that enables comprehensive MS and MS/MS analysis of all detectable species, providing an information rich data file that can be mined deeply. Here, we describe how to acquire high-quality SWATH® Acquisition data to be used for large quantitative proteomic studies. We specifically focus on using variable sized Q1 windows for acquisition of MS/MS data for generating higher specificity quantitative data. PMID:28188533
Steiner, Carine; Ducret, Axel; Tille, Jean-Christophe; Thomas, Marlene; McKee, Thomas A; Rubbia-Brandt, Laura A; Scherl, Alexander; Lescuyer, Pierre; Cutler, Paul
2014-01-01
Proteomic analysis of tissues has advanced in recent years as instruments and methodologies have evolved. The ability to retrieve peptides from formalin-fixed paraffin-embedded tissues followed by shotgun or targeted proteomic analysis is offering new opportunities in biomedical research. In particular, access to large collections of clinically annotated samples should enable the detailed analysis of pathologically relevant tissues in a manner previously considered unfeasible. In this paper, we review the current status of proteomic analysis of formalin-fixed paraffin-embedded tissues with a particular focus on targeted approaches and the potential for this technique to be used in clinical research and clinical diagnosis. We also discuss the limitations and perspectives of the technique, particularly with regard to application in clinical diagnosis and drug discovery. PMID:24339433
Basic and clinical proteomics from the EU Health Research perspective.
Dyląg, Tomasz; Jehenson, Philippe; van de Loo, Jan-Willem; Sanne, Jean-Luc
2010-12-01
The European Union (EU) is one of the main public funders of research in Europe and its major instrument for funding is the Seventh Framework Programme for research and technological development (FP7). The bulk of funding in FP7 goes to collaborative research, with the objective of establishing excellent research projects and networks. Understanding the functions of proteins is essential for the rational development of disease prevention, diagnosis and treatment, therefore the EU has largely invested in proteomics, in particular for technology development, data standardisation and sharing efforts, and the application of proteomics in the clinic. The scientific community, including both academia and industry, is encouraged to apply for FP7 funding so that the EU can even more efficiently support innovative health research and ultimately, bring better healthcare to patients.
Li, Yuanyuan; Lian, Hengning; Jia, Qingzhu; Wan, Ying
2015-02-06
Non-small cell lung cancer (NSCLC) is a common malignant disease, and in ~10-20% of patients, pleural effusion is the first symptom. The pleural effusion proteome contains information on pulmonary disease that directly or indirectly reflects pathophysiological status. However, the proteome of pleural effusion in NSCLC patients is not well understood, nor is the variability in protein composition between malignant and benign pleural effusions. Here, we investigated the different proteins in pleural effusions from NSCLC and tuberculosis (TB) patients by using nano-scale liquid chromatography-tandem mass spectrometry (nLC-MS/MS) analysis. In total, 363 proteins were identified in the NSCLC pleural effusion proteome with a low false discovery rate (<1%), and 199 proteins were unique to NSCLC. The proteins in the NSCLC patients' pleural effusion were involved in cell adhesion, proteolysis, and cell migration. Furthermore, interleukin 1 alpha (IL1A), a protein that regulates tumor growth, angiogenesis, and metastasis, was significantly more abundant in the NSCLC group compared to the TB group, a finding that was validated with an ELISA assay. Copyright © 2014 Elsevier Inc. All rights reserved.
Cerebrospinal Fluid Biomarkers for Huntington's Disease.
Byrne, Lauren M; Wild, Edward J
2016-01-01
Cerebrospinal fluid (CSF) is enriched in brain-derived components and represents an accessible and appealing means of interrogating the CNS milieu to study neurodegenerative diseases and identify biomarkers to facilitate the development of novel therapeutics. Many such CSF biomarkers have been proposed for Huntington's disease (HD) but none has been validated for clinical trial use. Across many studies proposing dozens of biomarker candidates, there is a notable lack of statistical power, consistency, rigor and validation. Here we review proposed CSF biomarkers including neurotransmitters, transglutaminase activity, kynurenine pathway metabolites, oxidative stress markers, inflammatory markers, neuroendocrine markers, protein markers of neuronal death, proteomic approaches and mutant huntingtin protein itself. We reflect on the need for large-scale, standardized CSF collections with detailed phenotypic data to validate and qualify much-needed CSF biomarkers for clinical trial use in HD.
Computation as the mechanistic bridge between precision medicine and systems therapeutics.
Hansen, J; Iyengar, R
2013-01-01
Over the past 50 years, like molecular cell biology, medicine and pharmacology have been driven by a reductionist approach. The focus on individual genes and cellular components as disease loci and drug targets has been a necessary step in understanding the basic mechanisms underlying tissue/organ physiology and drug action. Recent progress in genomics and proteomics, as well as advances in other technologies that enable large-scale data gathering and computational approaches, is providing new knowledge of both normal and disease states. Systems-biology approaches enable integration of knowledge from different types of data for precision medicine and systems therapeutics. In this review, we describe recent studies that contribute to these emerging fields and discuss how together these fields can lead to a mechanism-based therapy for individual patients.
Computation as the Mechanistic Bridge Between Precision Medicine and Systems Therapeutics
Hansen, J; Iyengar, R
2014-01-01
Over the past 50 years, like molecular cell biology, medicine and pharmacology have been driven by a reductionist approach. The focus on individual genes and cellular components as disease loci and drug targets has been a necessary step in understanding the basic mechanisms underlying tissue/organ physiology and drug action. Recent progress in genomics and proteomics, as well as advances in other technologies that enable large-scale data gathering and computational approaches, is providing new knowledge of both normal and disease states. Systems-biology approaches enable integration of knowledge from different types of data for precision medicine and systems therapeutics. In this review, we describe recent studies that contribute to these emerging fields and discuss how together these fields can lead to a mechanism-based therapy for individual patients. PMID:23212109
Comparative secretome analyses of two Trichoderma reesei RUT-C30 and CL847 hypersecretory strains
Herpoël-Gimbert, Isabelle; Margeot, Antoine; Dolla, Alain; Jan, Gwénaël; Mollé, Daniel; Lignon, Sabrina; Mathis, Hughes; Sigoillot, Jean-Claude; Monot, Frédéric; Asther, Marcel
2008-01-01
Background: Due to its capacity to produce large amounts of cellulases, Trichoderma reesei is increasingly being researched in various fields of white biotechnology, especially in biofuel production from lignocellulosic biomass. The commercial enzyme mixtures produced at industrial scales are not well characterized, and their proteinaceous components are poorly identified and quantified. The development of proteomic methods has made it possible to comprehensively overview the enzymes involved in lignocellulosic biomass degradation which are secreted under various environmental conditions. Results: The protein composition of the secretome produced by industrial T. reesei (strain CL847) grown on a medium promoting the production of both cellulases and hemicellulases was explored using two-dimensional electrophoresis and MALDI-TOF or LC-MS/MS protein identification. A total of 22 protein species were identified. As expected, most of them are potentially involved in biomass degradation. The 2D map obtained was then used to compare the secretomes produced by CL847 and another efficient cellulolytic T. reesei strain, Rut-C30, the reference cellulase-overproducing strain using lactose as carbon source and inducer of cellulases. Conclusion: This study provides the most complete mapping of the proteins secreted by T. reesei to date. We report on the first use of proteomics to compare secretome composition between two cellulase-overproducing strains Rut-C30 and CL847 grown under similar conditions. Comparison of protein patterns in both strains highlighted many unexpected differences between cellulase cocktails. The results demonstrate that 2D electrophoresis is a promising tool for studying cellulase production profiles, whether for industrial characterization of an entire secretome or for a more fundamental study on cellulase expression at genome-wide scale. PMID:19105830
Functional Genomic Landscape of Human Breast Cancer Drivers, Vulnerabilities, and Resistance.
Marcotte, Richard; Sayad, Azin; Brown, Kevin R; Sanchez-Garcia, Felix; Reimand, Jüri; Haider, Maliha; Virtanen, Carl; Bradner, James E; Bader, Gary D; Mills, Gordon B; Pe'er, Dana; Moffat, Jason; Neel, Benjamin G
2016-01-14
Large-scale genomic studies have identified multiple somatic aberrations in breast cancer, including copy number alterations and point mutations. Still, identifying causal variants and emergent vulnerabilities that arise as a consequence of genetic alterations remain major challenges. We performed whole-genome small hairpin RNA (shRNA) "dropout screens" on 77 breast cancer cell lines. Using a hierarchical linear regression algorithm to score our screen results and integrate them with accompanying detailed genetic and proteomic information, we identify vulnerabilities in breast cancer, including candidate "drivers," and reveal general functional genomic properties of cancer cells. Comparisons of gene essentiality with drug sensitivity data suggest potential resistance mechanisms, effects of existing anti-cancer drugs, and opportunities for combination therapy. Finally, we demonstrate the utility of this large dataset by identifying BRD4 as a potential target in luminal breast cancer and PIK3CA mutations as a resistance determinant for BET-inhibitors. Copyright © 2016 Elsevier Inc. All rights reserved.
Identification of new intrinsic proteins in Arabidopsis plasma membrane proteome.
Marmagne, Anne; Rouet, Marie-Aude; Ferro, Myriam; Rolland, Norbert; Alcon, Carine; Joyard, Jacques; Garin, Jérome; Barbier-Brygoo, Hélène; Ephritikhine, Geneviève
2004-07-01
Identification and characterization of anion channel genes in plants represent a goal for a better understanding of their central role in cell signaling, osmoregulation, nutrition, and metabolism. Though channel activities have been well characterized in plasma membrane by electrophysiology, the corresponding molecular entities are little documented. Indeed, the hydrophobic protein equipment of plant plasma membrane still remains largely unknown, though several proteomic approaches have been reported. To identify new putative transport systems, we developed a new proteomic strategy based on mass spectrometry analyses of a plasma membrane fraction enriched in hydrophobic proteins. We produced from Arabidopsis cell suspensions a highly purified plasma membrane fraction and characterized it in detail by immunological and enzymatic tests. Using complementary methods for the extraction of hydrophobic proteins and mass spectrometry analyses on mono-dimensional gels, about 100 proteins have been identified, 95% of which had never been found in previous proteomic studies. The inventory of the plasma membrane proteome generated by this approach contains numerous plasma membrane integral proteins, one-third displaying at least four transmembrane segments. The plasma membrane localization was confirmed for several proteins, therefore validating such proteomic strategy. An in silico analysis shows a correlation between the putative functions of the identified proteins and the expected roles for plasma membrane in transport, signaling, cellular traffic, and metabolism. This analysis also reveals 10 proteins that display structural properties compatible with transport functions and will constitute interesting targets for further functional studies.
Park, Gun Wook; Hwang, Heeyoun; Kim, Kwang Hoe; Lee, Ju Yeon; Lee, Hyun Kyoung; Park, Ji Yeong; Ji, Eun Sun; Park, Sung-Kyu Robin; Yates, John R; Kwon, Kyung-Hoon; Park, Young Mok; Lee, Hyoung-Joo; Paik, Young-Ki; Kim, Jin Young; Yoo, Jong Shin
2016-11-04
In the Chromosome-Centric Human Proteome Project (C-HPP), false-positive identification by peptide spectrum matches (PSMs) after database searches is a major issue for proteogenomic studies using liquid-chromatography and mass-spectrometry-based large proteomic profiling. Here we developed a simple strategy for protein identification, with a controlled false discovery rate (FDR) at the protein level, using an integrated proteomic pipeline (IPP) that consists of four engrailed steps as follows. First, using three different search engines, SEQUEST, MASCOT, and MS-GF+, individual proteomic searches were performed against the neXtProt database. Second, the search results from the PSMs were combined using statistical evaluation tools including DTASelect and Percolator. Third, the peptide search scores were converted into E-scores normalized using an in-house program. Last, ProteinInferencer was used to filter the proteins containing two or more peptides with a controlled FDR of 1.0% at the protein level. Finally, we compared the performance of the IPP to a conventional proteomic pipeline (CPP) for protein identification using a controlled FDR of <1% at the protein level. Using the IPP, a total of 5756 proteins (vs 4453 using the CPP) including 477 alternative splicing variants (vs 182 using the CPP) were identified from human hippocampal tissue. In addition, a total of 10 missing proteins (vs 7 using the CPP) were identified with two or more unique peptides, and their tryptic peptides were validated using MS/MS spectral pattern from a repository database or their corresponding synthetic peptides. This study shows that the IPP effectively improved the identification of proteins, including alternative splicing variants and missing proteins, in human hippocampal tissues for the C-HPP. All RAW files used in this study were deposited in ProteomeXchange (PXD000395).
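The final filtering step described above (proteins with two or more peptides, FDR controlled at 1% at the protein level) can be illustrated with a minimal sketch. It assumes a target-decoy estimate of the FDR, which the abstract does not specify, and it does not reproduce ProteinInferencer or any other tool named in the abstract. The `ProteinHit` record, its fields, and `filter_proteins` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProteinHit:
    """Hypothetical protein-level record; not the format of any real tool."""
    accession: str
    score: float      # combined score, higher is better (assumption)
    n_peptides: int   # number of peptides supporting the protein
    is_decoy: bool    # True if the entry comes from a decoy database


def filter_proteins(hits, max_fdr=0.01, min_peptides=2):
    """Return target proteins accepted at the given protein-level FDR.

    The FDR at each rank is estimated as decoys / targets among all
    entries ranked at or above that position (target-decoy assumption).
    """
    # Keep only proteins supported by enough peptides.
    candidates = [h for h in hits if h.n_peptides >= min_peptides]
    # Rank from best to worst score.
    ranked = sorted(candidates, key=lambda h: h.score, reverse=True)

    targets = 0
    decoys = 0
    cutoff = 0  # number of ranked entries to accept
    for position, hit in enumerate(ranked, start=1):
        if hit.is_decoy:
            decoys += 1
        else:
            targets += 1
        # Extend the accepted list to the deepest rank that still meets the FDR.
        if targets and decoys / targets <= max_fdr:
            cutoff = position
    return [h for h in ranked[:cutoff] if not h.is_decoy]


example_hits = [
    ProteinHit("A", 10.0, 3, False),
    ProteinHit("B", 9.0, 2, False),
    ProteinHit("DECOY_1", 8.0, 2, True),
    ProteinHit("C", 7.0, 2, False),
    ProteinHit("E", 6.0, 1, False),  # dropped: only one peptide
]
accepted = [h.accession for h in filter_proteins(example_hits)]
```

With these toy values, only the entries ranked above the first decoy pass the 1% threshold; loosening `max_fdr` admits lower-ranked targets at the cost of a higher estimated error rate.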
Venkataramanan, Keerthi P; Min, Lie; Hou, Shuyu; Jones, Shawn W; Ralston, Matthew T; Lee, Kelvin H; Papoutsakis, E Terry
2015-01-01
Clostridium acetobutylicum is a model organism for both clostridial biology and solvent production. The organism is exposed to its own toxic metabolites butyrate and butanol, which trigger an adaptive stress response. Integrative analysis of proteomic and RNAseq data may provide novel insights into post-transcriptional regulation. The identified iTRAQ-based quantitative stress proteome is made up of 616 proteins with a 15% genome coverage. The differentially expressed proteome correlated poorly with the corresponding differential RNAseq transcriptome. Up to 31% of the differentially expressed proteins under stress displayed patterns opposite to those of the transcriptome, thus suggesting significant post-transcriptional regulation. The differential proteome of the translation machinery suggests that cells employ a different subset of ribosomal proteins under stress. Several highly upregulated proteins but with low mRNA levels possessed mRNAs with long 5'UTRs and strong RBS scores, thus supporting the argument that regulatory elements on the long 5'UTRs control their translation. For example, the oxidative stress response rubrerythrin was upregulated only at the protein level up to 40-fold without significant mRNA changes. We also identified many leaderless transcripts, several displaying different transcriptional start sites, thus suggesting mRNA-trimming mechanisms under stress. Downregulation of Rho and partner proteins pointed to changes in transcriptional elongation and termination under stress. The integrative proteomic-transcriptomic analysis demonstrated complex expression patterns of a large fraction of the proteome. Such patterns could not have been detected with one or the other omic analyses. Our analysis proposes the involvement of specific molecular mechanisms of post-transcriptional regulation to explain the observed complex stress response.
Rokyta, Darin R; Ward, Micaiah J
2017-03-15
The order Scorpiones is one of the most ancient and diverse lineages of venomous animals, having originated approximately 430 million years ago and diversified into 14 extant families. Although partial venom characterizations have been described for numerous scorpion species, we provided the first quantitative transcriptome/proteome comparison for a scorpion species using single-animal approaches. We sequenced the venom-gland transcriptomes of a male and female black-back scorpion (Hadrurus spadix) from the family Caraboctonidae using the Illumina sequencing platform and conducted independent quantitative mass-spectrometry analyses of their venoms. We identified 79 proteomically confirmed venom proteins, an additional 69 transcripts with homology to toxins from other species, and 596 nontoxin proteins expressed at high levels in the venom glands. The venom of H. spadix was rich in antimicrobial peptides, K+-channel toxins, and several classes of peptidases. However, the most diverse and one of the most abundant classes of putative toxins could not be assigned even a tentative functional role on the basis of homology, indicating that this venom contained a wealth of previously unexplored animal toxin diversity. We found good agreement between both transcriptomic and proteomic abundances across individuals, but transcriptomic and proteomic abundances differed substantially within each individual. Small peptide toxins such as K+-channel toxins and antimicrobial peptides proved challenging to detect proteomically, at least in part due to the significant proteolytic processing involved in their maturation. In addition, we found a significant tendency for our proteomic approach to overestimate the abundances of large putative toxins and underestimate the abundances of smaller toxins. Copyright © 2017 Elsevier Ltd. All rights reserved.
Jeromson, Stewart; Mackenzie, Ivor; Doherty, Mary K; Whitfield, Phillip D; Bell, Gordon; Dick, James; Shaw, Andy; Rao, Francesco V; Ashcroft, Stephen P; Philp, Andrew; Galloway, Stuart D R; Gallagher, Iain; Hamilton, D Lee
2018-06-01
In striated muscle, eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) have differential effects on the metabolism of glucose and of protein. We have shown that, despite similar incorporation, treatment of C2C12 myotubes (CM) with EPA but not DHA improves glucose uptake and protein accretion. We hypothesized that these differential effects of EPA and DHA may be due to divergent shifts in lipidomic profiles leading to altered proteomic profiles. We therefore carried out an assessment of the impact of treating CM with EPA and DHA on lipidomic and proteomic profiles. Fatty acid methyl esters (FAME) analysis revealed that both EPA and DHA led to similar but substantial changes in fatty acid profiles with the exception of arachidonic acid, which was decreased only by DHA, and docosapentaenoic acid (DPA), which was increased only by EPA treatment. Global lipidomic analysis showed that EPA and DHA induced large alterations in the cellular lipid profiles and in particular, the phospholipid classes. Subsequent targeted analysis confirmed that the most differentially regulated species were phosphatidylcholines and phosphatidylethanolamines containing long-chain fatty acids with five (EPA treatment) or six (DHA treatment) double bonds. As these are typically membrane-associated lipid species we hypothesized that these treatments differentially altered the membrane-associated proteome. Stable isotope labeling by amino acids in cell culture (SILAC)-based proteomics of the membrane fraction revealed significant divergence in the effects of EPA and DHA on the membrane-associated proteome. We conclude that the EPA-specific increase in polyunsaturated long-chain fatty acids in the phospholipid fraction is associated with an altered membrane-associated proteome and these may be critical events in the metabolic remodeling induced by EPA treatment.
Kitata, Reta Birhanu; Dimayacyac-Esleta, Baby Rorielyn T; Choong, Wai-Kok; Tsai, Chia-Feng; Lin, Tai-Du; Tsou, Chih-Chiang; Weng, Shao-Hsing; Chen, Yi-Ju; Yang, Pan-Chyr; Arco, Susan D; Nesvizhskii, Alexey I; Sung, Ting-Yi; Chen, Yu-Ju
2015-09-04
Despite significant efforts in the past decade toward complete mapping of the human proteome, 3564 proteins (neXtProt, 09-2014) are still "missing proteins". Over one-third of these missing proteins are annotated as membrane proteins, owing to their relatively challenging accessibility with standard shotgun proteomics. Using non-small cell lung cancer (NSCLC) as a model study, we aim to mine missing proteins from the disease-associated membrane proteome, which may still be largely under-represented. To increase identification coverage, we employed Hp-RP StageTip prefractionation of membrane-enriched samples from 11 NSCLC cell lines. Analysis of membrane samples from 20 pairs of tumor and adjacent normal lung tissue was incorporated to include physiologically expressed membrane proteins. Using multiple search engines (X!Tandem, Comet, and Mascot) and stringent evaluation of FDR (MAYU and PeptideShaker), we identified 7702 proteins (66% membrane proteins) and 178 missing proteins (74 membrane proteins) with PSM-, peptide-, and protein-level FDR of 1%. Through multiple reaction monitoring using synthetic peptides, we provided additional evidence of eight missing proteins including seven with transmembrane helix domains. This study demonstrates that mining missing proteins focused on the cancer membrane subproteome can greatly contribute to mapping the whole human proteome. All data were deposited into ProteomeXchange with the identifier PXD002224.
The “Dark Side” of Food Stuff Proteomics: The CPLL-Marshals Investigate
Righetti, Pier Giorgio; Fasoli, Elisa; D’Amato, Alfonsina; Boschetti, Egisto
2014-01-01
The present review deals with analysis of the proteome of animal and plant-derived food stuff, as well as of non-alcoholic and alcoholic beverages. The survey is limited to those systems investigated with the help of combinatorial peptide ligand libraries, a most powerful technique allowing access to low- to very-low-abundance proteins, i.e., to those proteins that might characterize univocally a given biological system and, in the case of commercial food preparations, attest their genuineness or adulteration. Among animal foods the analysis of cow’s and donkey’s milk is reported, together with the proteomic composition of egg white and yolk, as well as of honey, considered as a hybrid between floral and animal origin. In terms of plant and fruits, a survey is offered of spinach, artichoke, banana, avocado, mango and lemon proteomics, considered as recalcitrant tissues in that small amounts of proteins are dispersed into a large body of plant polymers and metabolites. As examples of non-alcoholic beverages, ginger ale, coconut milk, a cola drink, almond milk and orgeat syrup are analyzed. Finally, the trace proteome of white and red wines, beer and aperitifs is reported, with the aim of tracing the industrial manipulations and herbal usage prior to their commercialization. PMID:28234315
Sharma, Mukut; Halligan, Brian D; Wakim, Bassam T; Savin, Virginia J; Cohen, Eric P; Moulder, John E
2008-06-18
Terrorist attacks or nuclear accidents could expose large numbers of people to ionizing radiation, and early biomarkers of radiation injury would be critical for triage, treatment and follow-up of such individuals. However, no such biomarkers have yet been proven to exist. We tested the potential of high throughput proteomics to identify protein biomarkers of radiation injury after total body X-ray irradiation (TBI) in a rat model. Subtle functional changes in the kidney are suggested by an increased glomerular permeability for macromolecules measured within 24 hours after TBI. Ultrastructural changes in glomerular podocytes include partial loss of the interdigitating organization of foot processes. Analysis of urine by LC-MS/MS and 2D-GE showed significant changes in the urine proteome within 24 hours after TBI. Tissue kallikrein 1-related peptidase, cysteine proteinase inhibitor cystatin C and oxidized histidine were found to be increased while a number of proteinase inhibitors including kallikrein-binding protein and albumin were found to be decreased post-irradiation. Thus, TBI causes immediately detectable changes in renal structure and function and in the urinary protein profile. This suggests that both systemic and renal changes are induced by radiation and it may be possible to identify a set of biomarkers unique to radiation injury.
NASA Astrophysics Data System (ADS)
Raphael, Itay; Mahesula, Swetha; Purkar, Anjali; Black, David; Catala, Alexis; Gelfond, Jonathon A. L.; Forsthuber, Thomas G.; Haskins, William E.
2014-09-01
Central nervous system-specific proteins (CSPs), transported across the damaged blood-brain-barrier (BBB) to cerebrospinal fluid (CSF) and blood (serum), might be promising diagnostic, prognostic and predictive protein biomarkers of disease in individual multiple sclerosis (MS) patients because they are not expected to be present at appreciable levels in the circulation of healthy subjects. We hypothesized that microwave & magnetic (M2) proteomics of CSPs in brain tissue might be an effective means to prioritize putative CSP biomarkers for future immunoassays in serum. To test this hypothesis, we used M2 proteomics to longitudinally assess CSP expression in brain tissue from mice during experimental autoimmune encephalomyelitis (EAE), a mouse model of MS. Confirmation of central nervous system (CNS)-infiltrating inflammatory cell response and CSP expression in serum was achieved with cytokine ELISPOT and ELISA immunoassays, respectively, for selected CSPs. M2 proteomics (and ELISA) revealed characteristic CSP expression waves, including synapsin-1 and α-II-spectrin, which peaked at day 7 in brain tissue (and serum) and preceded clinical EAE symptoms that began at day 10 and peaked at day 20. Moreover, M2 proteomics supports the concept that relatively few CNS-infiltrating inflammatory cells can have a disproportionally large impact on CSP expression prior to clinical manifestation of EAE.
Fang, Yu; Feng, Mao; Han, Bin; Qi, Yuping; Hu, Han; Fan, Pei; Huo, Xinmei; Meng, Lifeng; Li, Jianke
2015-09-04
Worker and drone bees have diploid and haploid genetic makeups, respectively. Mechanisms regulating the embryogenesis of the drone and its mechanistic differences from the worker are still poorly understood. The proteomes of the two embryos at three time-points throughout development were analyzed by applying mass spectrometry-based proteomics. We identified 2788 and 2840 proteins in the worker and drone embryos, respectively. The age-dependent proteome driving the drone embryogenesis generally follows the worker's. The two embryos, however, evolve distinct proteome settings to prime their respective embryogenesis. Proteins and pathways related to transcriptional-translational machinery and morphogenesis were strongly expressed in the 24 h drone embryo relative to the worker, illustrating the earlier occurrence of morphogenesis in the drone than in the worker. These morphogenesis differences remain through to the middle-late stage in the two embryos. The two embryos employ distinct antioxidant mechanisms coinciding with the temporal difference in organogenesis. The drone embryo's strongly expressed cytoskeletal proteins signify key roles to match its large body size. The RNAi-induced knockdown of the ribosomal protein offers evidence for the functional investigation of genes regulating honeybee embryogenesis. The data significantly expand the known regulatory mechanisms governing embryogenesis, which is potentially important for honeybee and other insects.
Zeng, Yunliu; Pan, Zhiyong; Ding, Yuduan; Zhu, Andan; Cao, Hongbo; Xu, Qiang; Deng, Xiuxin
2011-11-01
Here, a comprehensive proteomic analysis of the chromoplasts purified from sweet orange using Nycodenz density gradient centrifugation is reported. A GeLC-MS/MS shotgun approach was used to identify the proteins of pooled chromoplast samples. A total of 493 proteins were identified from purified chromoplasts, of which 418 are putative plastid proteins based on in silico sequence homology and functional analyses. Based on the predicted functions of these identified plastid proteins, a large proportion (∼60%) of the chromoplast proteome of sweet orange is constituted by proteins involved in carbohydrate metabolism, amino acid/protein synthesis, and secondary metabolism. Of note, HDS (hydroxymethylbutenyl 4-diphosphate synthase), PAP (plastid-lipid-associated protein), and psHSPs (plastid small heat shock proteins) involved in the synthesis or storage of carotenoid and stress response are among the most abundant proteins identified. A comparison of chromoplast proteomes between sweet orange and tomato suggested a high level of conservation in a broad range of metabolic pathways. However, the citrus chromoplast was characterized by more extensive carotenoid synthesis, extensive amino acid synthesis without nitrogen assimilation, and evidence for lipid metabolism concerning jasmonic acid synthesis. In conclusion, this study provides an insight into the major metabolic pathways as well as some unique characteristics of the sweet orange chromoplasts at the whole proteome level.
Proteomics technique opens new frontiers in mobilome research
Davidson, Andrew D.; Matthews, David A.
2017-01-01
A large proportion of the genome of most eukaryotic organisms consists of highly repetitive mobile genetic elements. The sum of these elements is called the “mobilome,” which in eukaryotes is made up mostly of transposons. Transposable elements contribute to disease, evolution, and normal physiology by mediating genetic rearrangement, and through the “domestication” of transposon proteins for cellular functions. Although ‘omics studies of mobilome genomes and transcriptomes are common, technical challenges have hampered high-throughput global proteomics analyses of transposons. In a recent paper, we overcame these technical hurdles using a technique called “proteomics informed by transcriptomics” (PIT), and thus published the first unbiased global mobilome-derived proteome for any organism (using cell lines derived from the mosquito Aedes aegypti). In this commentary, we describe our methods in more detail, and summarise our major findings. We also use new genome sequencing data to show that, in many cases, the specific genomic element expressing a given protein can be identified using PIT. This proteomic technique therefore represents an important technological advance that will open new avenues of research into the role that proteins derived from transposons and other repetitive and sequence diverse genetic elements, such as endogenous retroviruses, play in health and disease. PMID:28932623
Bostanci, Nagihan; Selevsek, Nathalie; Wolski, Witold; Grossmann, Jonas; Bao, Kai; Wahlander, Asa; Trachsel, Christian; Schlapbach, Ralph; Özturk, Veli Özgen; Afacan, Beral; Emingil, Gulnur; Belibasakis, Georgios N
2018-04-02
Periodontal diseases are among the most prevalent worldwide, but largely silent, chronic diseases. They affect the tooth-supporting tissues with multiple ramifications on life quality. Their early diagnosis is still challenging, due to lack of appropriate molecular diagnostic methods. Saliva offers a non-invasively collectable reservoir of clinically relevant biomarkers, which, if utilized efficiently, could facilitate early diagnosis and monitoring of ongoing disease. Despite several novel protein markers being recently enlisted by discovery proteomics, their routine diagnostic application is hampered by the lack of validation platforms that allow for rapid, accurate and simultaneous quantification of multiple proteins in large cohorts. We carried out a pipeline of two proteomic platforms; firstly, we applied open-ended label-free quantitative (LFQ) proteomics for discovery in saliva (n=67, health, gingivitis, and periodontitis), followed by selected-reaction monitoring (SRM)-targeted proteomics for validation in an independent cohort (n=82). The LFQ platform led to the discovery of 119 proteins with at least two-fold significant difference between health and disease. The 65 proteins chosen for the subsequent SRM platform included 50 related proteins derived from the significantly enriched processes of the LFQ data, 11 from literature-mining, and four house-keeping ones. Among those, 60 were reproducibly quantifiable proteins (92% success rate), represented by a total of 143 peptides. Machine-learning modeling led to a narrowed-down panel of five proteins of high predictive value for periodontal diseases (higher in disease: Matrix metalloproteinase-9, Ras-related protein-1, Actin-related protein 2/3 complex subunit 5; lower in disease: Clusterin, Deleted in Malignant Brain Tumors 1), with maximum area under the receiver operating curve >0.97. This panel enriches the pool of credible clinical biomarker candidates for diagnostic assay development. Yet, the quantum leap brought in periodontal diagnostics by this study lies in the introduction of the well-established discovery-through-verification pipeline for periodontal biomarker discovery and validation in further periodontal patient cohorts. Published under license by The American Society for Biochemistry and Molecular Biology, Inc.
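The "area under the receiver operating curve" reported above summarizes how well a score separates diseased from healthy samples. As a minimal sketch of how such a value can be computed, the function below uses the pairwise (Mann-Whitney) formulation: the fraction of disease/health sample pairs in which the diseased sample receives the higher score. The function name `roc_auc` and the toy scores are assumptions made for illustration; they are not the study's data or its modeling method.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted scores, higher meaning "more likely diseased".
    labels: 1 for diseased, 0 for healthy.
    Tied pairs count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example (hypothetical scores): perfect separation gives an AUC of 1.0.
perfect = roc_auc([0.9, 0.8, 0.4, 0.3], [1, 1, 0, 0])
# Half of the positive/negative pairs are ordered correctly here.
mixed = roc_auc([0.9, 0.3, 0.8, 0.4], [1, 1, 0, 0])
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why a value above 0.97 indicates a highly discriminative panel.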