large-scale proteome evolution: Topics by Science.gov

Sample records for large-scale proteome evolution

ProteomeVis: a web app for exploration of protein properties from structure to sequence evolution across organisms' proteomes.

PubMed

Razban, Rostam M; Gilson, Amy I; Durfee, Niamh; Strobelt, Hendrik; Dinkla, Kasper; Choi, Jeong-Mo; Pfister, Hanspeter; Shakhnovich, Eugene I

2018-05-08

Protein evolution spans time scales and its effects span the length of an organism. A web app named ProteomeVis is developed to provide a comprehensive view of protein evolution in the S. cerevisiae and E. coli proteomes. ProteomeVis interactively creates protein chain graphs, where edges between nodes represent structure and sequence similarities within user-defined ranges, to study the long time scale effects of protein structure evolution. The short time scale effects of protein sequence evolution are studied by sequence evolutionary rate (ER) correlation analyses with protein properties that span from the molecular to the organismal level. We demonstrate the utility and versatility of ProteomeVis by investigating the distribution of edges per node in organismal protein chain universe graphs (oPCUGs) and putative ER determinants. S. cerevisiae and E. coli oPCUGs are scale-free with scaling constants of 1.79 and 1.56, respectively. Both scaling constants can be explained by a previously reported theoretical model describing protein structure evolution (Dokholyan et al., 2002). Protein abundance most strongly correlates with ER among properties in ProteomeVis, with Spearman correlations of -0.49 (p-value < 10^-10) and -0.46 (p-value < 10^-10) for S. cerevisiae and E. coli, respectively. This result is consistent with previous reports that found protein expression to be the most important ER determinant (Zhang and Yang, 2015). ProteomeVis is freely accessible at http://proteomevis.chem.harvard.edu. Supplementary data are available at Bioinformatics. Contact: shakhnovich@chemistry.harvard.edu.
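The ER correlation analysis described in this record can be illustrated with a minimal sketch of a Spearman rank correlation in plain Python. The function names and the toy abundance and ER values are hypothetical and are not taken from ProteomeVis; they only show the kind of negative monotonic relationship the abstract reports.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical toy data: more abundant proteins tend to evolve more slowly.
abundance = [10, 50, 200, 1000, 5000, 20000]
evolutionary_rate = [0.9, 0.7, 0.8, 0.4, 0.3, 0.1]
rho = spearman(abundance, evolutionary_rate)
print(f"Spearman rho = {rho:.3f}")
```

A rank-based statistic is a natural choice here because protein abundance spans several orders of magnitude, so only the ordering of the values, not their scale, affects the result.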
Evolution of complete proteomes: guanine-cytosine pressure, phylogeny and environmental influences blend the proteomic architecture

PubMed Central

2013-01-01

Background: Guanine-cytosine (GC) composition is an important feature of genomes. Likewise, amino acid composition is a distinct, but less valued, feature of proteomes. A major concern is that it is not clear what valuable information can be acquired from amino acid composition data. To address this concern, in-depth analyses of the amino acid composition of the complete proteomes from 63 archaea, 270 bacteria, and 128 eukaryotes were performed. Results: Principal component analysis of the amino acid matrices showed that the main contributors to proteomic architecture were genomic GC variation, phylogeny, and environmental influences. GC pressure drove positive selection on Ala, Arg, Gly, Pro, Trp, and Val, and adverse selection on Asn, Lys, Ile, Phe, and Tyr. The physico-chemical framework of the complete proteomes withstood GC pressure by frequency complementation of GC-dependent amino acid pairs with similar physico-chemical properties. Gln, His, Ser, and Val were responsible for phylogeny and their constituted components could differentiate archaea, bacteria, and eukaryotes. Environmental niche was also a significant factor in determining proteomic architecture, especially for archaea for which the main amino acids were Cys, Leu, and Thr. In archaea, hyperthermophiles, acidophiles, mesophiles, psychrophiles, and halophiles gathered successively along the environment-based principal component. Concordance between proteomic architecture and the genetic code was also related closely to genomic GC content, phylogeny, and lifestyles. Conclusions: Large-scale analyses of the complete proteomes of a wide range of organisms suggested that amino acid composition retained the trace of GC variation, phylogeny, and environmental influences during evolution. The findings from this study will help in the development of a global understanding of proteome evolution, and even biological evolution. PMID:24088322
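As a rough illustration of the input to such an analysis, the sketch below computes the amino acid composition vector of a proteome, which would form one row of the matrix passed to principal component analysis. It is a minimal sketch, not the authors' pipeline: the function names and the two toy sequences are hypothetical, and the two residue groups simply encode the amino acids the abstract lists as favored or disfavored under GC pressure.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# One-letter codes for the residues named in the abstract.
GC_FAVORED = set("ARGPWV")      # Ala, Arg, Gly, Pro, Trp, Val
GC_DISFAVORED = set("NKIFY")    # Asn, Lys, Ile, Phe, Tyr


def composition(sequences):
    """Return the relative frequency of each standard amino acid."""
    counts = Counter()
    for seq in sequences:
        # Ignore ambiguity codes and gaps; count only the 20 standard residues.
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard residues found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def group_fraction(freqs, group):
    """Sum the frequencies of the residues in one group."""
    return sum(freqs[aa] for aa in group)


# Hypothetical two-protein "proteome" used only to exercise the functions.
toy_proteome = ["MAGRPVW", "MKNIFYK"]
freqs = composition(toy_proteome)
gc_up = group_fraction(freqs, GC_FAVORED)
gc_down = group_fraction(freqs, GC_DISFAVORED)
print(f"GC-favored: {gc_up:.3f}, GC-disfavored: {gc_down:.3f}")
```

Stacking one such 20-element vector per organism gives an organisms-by-amino-acids matrix of the kind the abstract describes.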
Age distribution of human gene families shows significant roles of both large- and small-scale duplications in vertebrate evolution.

PubMed

Gu, Xun; Wang, Yufeng; Gu, Jianying

2002-06-01

The classical (two-round) hypothesis of vertebrate genome duplication proposes two successive whole-genome duplication(s) (polyploidizations) predating the origin of fishes, a view now being seriously challenged. As the debate largely concerns the relative merits of the 'big-bang mode' theory (large-scale duplication) and the 'continuous mode' theory (constant creation by small-scale duplications), we tested whether a significant proportion of paralogous genes in the contemporary human genome was indeed generated in the early stage of vertebrate evolution. After an extensive search of major databases, we dated 1,739 gene duplication events from the phylogenetic analysis of 749 vertebrate gene families. We found a pattern characterized by two waves (I, II) and an ancient component. Wave I represents a recent gene family expansion by tandem or segmental duplications, whereas wave II, a rapid paralogous gene increase in the early stage of vertebrate evolution, supports the idea of genome duplication(s) (the big-bang mode). Further analysis indicated that large- and small-scale gene duplications both make a significant contribution during the early stage of vertebrate evolution to build the current hierarchy of the human proteome.
Statistical Model to Analyze Quantitative Proteomics Data Obtained by 18O/16O Labeling and Linear Ion Trap Mass Spectrometry

PubMed Central

Jorge, Inmaculada; Navarro, Pedro; Martínez-Acedo, Pablo; Núñez, Estefanía; Serrano, Horacio; Alfranca, Arántzazu; Redondo, Juan Miguel; Vázquez, Jesús

2009-01-01

Statistical models for the analysis of protein expression changes by stable isotope labeling are still poorly developed, particularly for data obtained by 16O/18O labeling. Besides, large-scale test experiments to validate the null hypothesis are lacking. Although the study of mechanisms underlying biological actions promoted by vascular endothelial growth factor (VEGF) on endothelial cells is of considerable interest, quantitative proteomics studies on this subject are scarce and have been performed after exposing cells to the factor for long periods of time. In this work we present the largest quantitative proteomics study to date on the short term effects of VEGF on human umbilical vein endothelial cells by 18O/16O labeling. Current statistical models based on normality and variance homogeneity were found unsuitable to describe the null hypothesis in a large scale test experiment performed on these cells, producing false expression changes. A random effects model was developed including four different sources of variance at the spectrum-fitting, scan, peptide, and protein levels. With the new model the number of outliers at scan and peptide levels was negligible in three large scale experiments, and only one false protein expression change was observed in the test experiment among more than 1000 proteins. The new model allowed the detection of significant protein expression changes upon VEGF stimulation for 4 and 8 h. The consistency of the changes observed at 4 h was confirmed by a replica at a smaller scale and further validated by Western blot analysis of some proteins. Most of the observed changes have not been described previously and are consistent with a pattern of protein expression that dynamically changes over time following the evolution of the angiogenic response. With this statistical model the 18O labeling approach emerges as a very promising and robust alternative to perform quantitative proteomics studies at a depth of several thousand proteins. PMID:19181660
Mildew-Omics: How Global Analyses Aid the Understanding of Life and Evolution of Powdery Mildews.

PubMed

Bindschedler, Laurence V; Panstruga, Ralph; Spanu, Pietro D

2016-01-01

The common powdery mildew plant diseases are caused by ascomycete fungi of the order Erysiphales. Their characteristic life style as obligate biotrophs renders functional analyses in these species challenging, mainly because of experimental constraints to genetic manipulation. Global large-scale ("-omics") approaches are thus particularly valuable and insightful for the characterisation of the life and evolution of powdery mildews. Here we review the knowledge obtained so far from genomic, transcriptomic and proteomic studies in these fungi. We consider current limitations and challenges regarding these surveys and provide an outlook on desired future investigations on the basis of the various -omics technologies.
Phylogenetic Tracings of Proteome Size Support the Gradual Accretion of Protein Structural Domains and the Early Origin of Viruses from Primordial Cells

PubMed Central

Nasir, Arshan; Kim, Kyung Mo; Caetano-Anollés, Gustavo

2017-01-01

Untangling the origin and evolution of viruses remains a challenging proposition. We recently studied the global distribution of protein domain structures in thousands of completely sequenced viral and cellular proteomes with comparative genomics, phylogenomics, and multidimensional scaling methods. A tree of life describing the evolution of proteomes revealed viruses emerging from the base of the tree as a fourth supergroup of life. A tree of domains indicated an early origin of modern viral lineages from ancient cells that co-existed with the cellular ancestors. However, it was recently argued that the rooting of our trees and the basal placement of viruses was artifactually induced by small genome (proteome) size. Here we show that these claims arise from misunderstanding and misinterpretations of cladistic methodology. Trees are reconstructed unrooted, and thus, their topologies cannot be distorted a posteriori by the rooting methodology. Tracing proteome size in trees and multidimensional views of evolutionary relationships as well as tests of leaf stability and exclusion/inclusion of taxa demonstrated that the smallest proteomes were neither attracted toward the root nor caused any topological distortions of the trees. Simulations confirmed that taxa clustering patterns were independent of proteome size and were determined by the presence of known evolutionary relatives in data matrices, highlighting the need for broader taxon sampling in phylogeny reconstruction. Instead, phylogenetic tracings of proteome size revealed a slowdown in innovation of the structural domain vocabulary and four regimes of allometric scaling that reflected a Heaps' law. These regimes explained increasing economies of scale in the evolutionary growth and accretion of kernel proteome repertoires of viruses and cellular organisms that resemble growth of human languages with limited vocabulary sizes. Results reconcile dynamic and static views of domain frequency distributions that are consistent with the axiom of spatiotemporal continuity that is a tenet of evolutionary thinking. PMID:28690608
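Heaps' law is usually written as V(n) = K * n^beta with 0 < beta < 1, where V is the vocabulary size (here, the number of distinct domains) and n is the corpus size (here, proteome size). The sketch below fits K and beta by ordinary least squares on log-log axes. The function name and the data points are hypothetical, generated from V = 3 * n^0.5 purely for illustration; the record does not say which fitting procedure the authors used.

```python
from math import exp, log


def fit_power_law(xs, ys):
    """Least-squares fit of y = K * x**beta on log-log axes; returns (K, beta)."""
    lx = [log(x) for x in xs]
    ly = [log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    beta = sxy / sxx               # slope of the log-log line
    k = exp(my - beta * mx)        # intercept, mapped back from log space
    return k, beta


# Hypothetical data following V = 3 * n**0.5 exactly.
proteome_sizes = [100, 400, 1600, 6400]
domain_vocab = [30, 60, 120, 240]
k, beta = fit_power_law(proteome_sizes, domain_vocab)
print(f"K = {k:.3f}, beta = {beta:.3f}")
```

An exponent below 1 means that each additional protein contributes fewer new domains than the last, which is the "slowdown in innovation" the abstract describes.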
The Skeleton Forming Proteome of an Early Branching Metazoan: A Molecular Survey of the Biomineralization Components Employed by the Coralline Sponge Vaceletia Sp.

PubMed Central

Wörheide, Gert; Jackson, Daniel John

2015-01-01

The ability to construct a mineralized skeleton was a major innovation for the Metazoa during their evolution in the late Precambrian/early Cambrian. Porifera (sponges) hold an informative position for efforts aimed at unraveling the origins of this ability because they are widely regarded to be the earliest branching metazoans, and are among the first multi-cellular animals to display the ability to biomineralize in the fossil record. Very few biomineralization associated proteins have been identified in sponges so far, with no transcriptome or proteome scale surveys yet available. In order to understand what genetic repertoire may have been present in the last common ancestor of the Metazoa (LCAM), and that may have contributed to the evolution of the ability to biocalcify, we have studied the skeletal proteome of the coralline demosponge Vaceletia sp. and compare this to other metazoan biomineralizing proteomes. We bring some spatial resolution to this analysis by dividing Vaceletia’s aragonitic calcium carbonate skeleton into “head” and “stalk” regions. With our approach we were able to identify 40 proteins from both the head and stalk regions, with many of these sharing some similarity to previously identified gene products from other organisms. Among these proteins are known biomineralization compounds, such as carbonic anhydrase, spherulin, extracellular matrix proteins and very acidic proteins. This report provides the first proteome scale analysis of a calcified poriferan skeletal proteome, and its composition clearly demonstrates that the LCAM contributed several key enzymes and matrix proteins to its descendants that supported the metazoan ability to biocalcify. However, lineage specific evolution is also likely to have contributed significantly to the ability of disparate metazoan lineages to biocalcify. PMID:26536128
The role of internal duplication in the evolution of multi-domain proteins.

PubMed

Nacher, J C; Hayashida, M; Akutsu, T

2010-08-01

Many proteins consist of several structural domains. These multi-domain proteins have likely been generated by selective genome growth dynamics during evolution to perform new functions as well as to create structures that fold on a biologically feasible time scale. Domain units frequently evolved through a variety of genetic shuffling mechanisms. Here we examine the protein domain statistics of more than 1000 organisms including eukaryotic, archaeal and bacterial species. The analysis extends earlier findings on asymmetric statistical laws for proteome to a wider variety of species. While proteins are composed of a wide range of domains, displaying a power-law decay, the computation of domain families for each protein reveals an exponential distribution, characterizing a protein universe composed of a thin number of unique families. Structural studies in proteomics have shown that domain repeats, or internal duplicated domains, represent a small but significant fraction of genome. In spite of its importance, this observation has been largely overlooked until recently. We model the evolutionary dynamics of proteome and demonstrate that these distinct distributions are in fact rooted in an internal duplication mechanism. This process generates the contemporary protein structural domain universe, determines its reduced thickness, and tames its growth. These findings have important implications, ranging from protein interaction network modeling to evolutionary studies based on fundamental mechanisms governing genome expansion.
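The contrast between a power-law decay and an exponential distribution can be checked crudely by asking which transformation makes the data linear: a power law is a straight line on log-log axes, an exponential on semi-log axes. The sketch below is a heuristic under that assumption, with hypothetical function names and synthetic counts; it is not a rigorous statistical test and is not the authors' method.

```python
from math import exp, log


def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. R^2 of a simple linear fit."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


def classify_decay(ks, counts):
    """Return 'power-law' or 'exponential', whichever linearises better."""
    log_counts = [log(c) for c in counts]
    r2_power = r_squared([log(k) for k in ks], log_counts)  # log-log axes
    r2_exp = r_squared(ks, log_counts)                       # semi-log axes
    return "power-law" if r2_power > r2_exp else "exponential"


# Synthetic, noise-free distributions used only to exercise the classifier.
ks = list(range(1, 11))
power_counts = [1000 * k ** -2.0 for k in ks]
exp_counts = [1000 * exp(-0.5 * k) for k in ks]
print(classify_decay(ks, power_counts))
print(classify_decay(ks, exp_counts))
```

With real, noisy counts, a maximum-likelihood fit with a goodness-of-fit test would be a more defensible way to choose between the two forms.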
Mildew-Omics: How Global Analyses Aid the Understanding of Life and Evolution of Powdery Mildews

PubMed Central

Bindschedler, Laurence V.; Panstruga, Ralph; Spanu, Pietro D.

2016-01-01

The common powdery mildew plant diseases are caused by ascomycete fungi of the order Erysiphales. Their characteristic life style as obligate biotrophs renders functional analyses in these species challenging, mainly because of experimental constraints to genetic manipulation. Global large-scale (“-omics”) approaches are thus particularly valuable and insightful for the characterisation of the life and evolution of powdery mildews. Here we review the knowledge obtained so far from genomic, transcriptomic and proteomic studies in these fungi. We consider current limitations and challenges regarding these surveys and provide an outlook on desired future investigations on the basis of the various –omics technologies. PMID:26913042
iTRAQ-based quantitative proteomic analysis reveals proteomic changes in three fenoxaprop-P-ethyl-resistant Beckmannia syzigachne biotypes with differing ACCase mutations.

PubMed

Pan, Lang; Zhang, Jian; Wang, Junzhi; Yu, Qin; Bai, Lianyang; Dong, Liyao

2017-05-08

American sloughgrass (Beckmannia syzigachne Steud.) is a weed widely distributed in wheat fields of China. In recent years, the evolution of herbicide (fenoxaprop-P-ethyl)-resistant populations has decreased the susceptibility of B. syzigachne. This study compared 4 B. syzigachne populations (3 resistant and 1 susceptible) using iTRAQ to characterize fenoxaprop-P-ethyl resistance in B. syzigachne at the proteomic level. Through searching the UniProt database, 3104 protein species were identified from 13,335 unique peptides. Approximately 2834 protein species were assigned to 23 functional classifications provided by the COG database. Among these, 2299 protein species were assigned to 125 predicted pathways. The resistant biotype contained 8 protein species that changed in abundance relative to the susceptible biotype; they were involved in photosynthesis, oxidative phosphorylation, and fatty acid biosynthesis pathways. In contrast to previous studies comparing only 1 resistant and 1 susceptible population, our use of 3 fenoxaprop-resistant B. syzigachne populations with different genetic backgrounds minimized irrelevant differential expression and eliminated false positives. Therefore, we could more confidently link the differentially expressed proteins to herbicide resistance. Proteomic analysis demonstrated that fenoxaprop-P-ethyl resistance is associated with photosynthetic capacity, a connection that might be related to the target-site mutations in resistant B. syzigachne. This is the first large-scale proteomics study examining herbicide stress responses in different B. syzigachne biotypes. This study has biological relevance because it is the first to employ proteomic analysis for understanding the mechanisms underlying Beckmannia syzigachne herbicide resistance. The plant is a major weed in China and negatively affects crop yield, but has developed considerable resistance to the most common herbicide, fenoxaprop-P-ethyl. Through comparisons of resistant and sensitive biotypes, our study identified multiple proteins (involved in photosynthesis, oxidative phosphorylation, and fatty acid biosynthesis) that are putatively linked to B. syzigachne herbicide response. This large-scale proteomics study, sorely lacking in weed science, contributes valuable data that can be applied to more fine-tuned analyses on the functions of specific proteins in herbicide resistance. Copyright © 2017 Elsevier B.V. All rights reserved.
Large-Scale and Deep Quantitative Proteome Profiling Using Isobaric Labeling Coupled with Two-Dimensional LC-MS/MS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gritsenko, Marina A.; Xu, Zhe; Liu, Tao

Comprehensive, quantitative information on abundances of proteins and their post-translational modifications (PTMs) can potentially provide novel biological insights into disease pathogenesis and therapeutic intervention. Herein, we introduce a quantitative strategy utilizing isobaric stable isotope-labelling techniques combined with two-dimensional liquid chromatography-tandem mass spectrometry (2D-LC-MS/MS) for large-scale, deep quantitative proteome profiling of biological samples or clinical specimens such as tumor tissues. The workflow includes isobaric labeling of tryptic peptides for multiplexed and accurate quantitative analysis, basic reversed-phase LC fractionation and concatenation for reduced sample complexity, and nano-LC coupled to high resolution and high mass accuracy MS analysis for high confidence identification and quantification of proteins. This proteomic analysis strategy has been successfully applied for in-depth quantitative proteomic analysis of tumor samples, and can also be used for integrated proteome and PTM characterization, as well as comprehensive quantitative proteomic analysis across samples from large clinical cohorts.
Large-Scale and Deep Quantitative Proteome Profiling Using Isobaric Labeling Coupled with Two-Dimensional LC-MS/MS.

PubMed

Gritsenko, Marina A; Xu, Zhe; Liu, Tao; Smith, Richard D

2016-01-01

Comprehensive, quantitative information on abundances of proteins and their posttranslational modifications (PTMs) can potentially provide novel biological insights into disease pathogenesis and therapeutic intervention. Herein, we introduce a quantitative strategy utilizing isobaric stable isotope-labeling techniques combined with two-dimensional liquid chromatography-tandem mass spectrometry (2D-LC-MS/MS) for large-scale, deep quantitative proteome profiling of biological samples or clinical specimens such as tumor tissues. The workflow includes isobaric labeling of tryptic peptides for multiplexed and accurate quantitative analysis, basic reversed-phase LC fractionation and concatenation for reduced sample complexity, and nano-LC coupled to high resolution and high mass accuracy MS analysis for high confidence identification and quantification of proteins. This proteomic analysis strategy has been successfully applied for in-depth quantitative proteomic analysis of tumor samples and can also be used for integrated proteome and PTM characterization, as well as comprehensive quantitative proteomic analysis across samples from large clinical cohorts.
Processing Shotgun Proteomics Data on the Amazon Cloud with the Trans-Proteomic Pipeline

PubMed Central

Slagel, Joseph; Mendoza, Luis; Shteynberg, David; Deutsch, Eric W.; Moritz, Robert L.

2015-01-01

Cloud computing, where scalable, on-demand compute cycles and storage are available as a service, has the potential to accelerate mass spectrometry-based proteomics research by providing simple, expandable, and affordable large-scale computing to all laboratories regardless of location or information technology expertise. We present new cloud computing functionality for the Trans-Proteomic Pipeline, a free and open-source suite of tools for the processing and analysis of tandem mass spectrometry datasets. Enabled with Amazon Web Services cloud computing, the Trans-Proteomic Pipeline now accesses large scale computing resources, limited only by the available Amazon Web Services infrastructure, for all users. The Trans-Proteomic Pipeline runs in an environment fully hosted on Amazon Web Services, where all software and data reside on cloud resources to tackle large search studies. In addition, it can also be run on a local computer with computationally intensive tasks launched onto the Amazon Elastic Compute Cloud service to greatly decrease analysis times. We describe the new Trans-Proteomic Pipeline cloud service components, compare the relative performance and costs of various Elastic Compute Cloud service instance types, and present on-line tutorials that enable users to learn how to deploy cloud computing technology rapidly with the Trans-Proteomic Pipeline. We provide tools for estimating the necessary computing resources and costs given the scale of a job and demonstrate the use of cloud enabled Trans-Proteomic Pipeline by performing over 1100 tandem mass spectrometry files through four proteomic search engines in 9 h and at a very low cost. PMID:25418363
Processing shotgun proteomics data on the Amazon cloud with the trans-proteomic pipeline.

PubMed

Slagel, Joseph; Mendoza, Luis; Shteynberg, David; Deutsch, Eric W; Moritz, Robert L

2015-02-01

Cloud computing, where scalable, on-demand compute cycles and storage are available as a service, has the potential to accelerate mass spectrometry-based proteomics research by providing simple, expandable, and affordable large-scale computing to all laboratories regardless of location or information technology expertise. We present new cloud computing functionality for the Trans-Proteomic Pipeline, a free and open-source suite of tools for the processing and analysis of tandem mass spectrometry datasets. Enabled with Amazon Web Services cloud computing, the Trans-Proteomic Pipeline now accesses large scale computing resources, limited only by the available Amazon Web Services infrastructure, for all users. The Trans-Proteomic Pipeline runs in an environment fully hosted on Amazon Web Services, where all software and data reside on cloud resources to tackle large search studies. In addition, it can also be run on a local computer with computationally intensive tasks launched onto the Amazon Elastic Compute Cloud service to greatly decrease analysis times. We describe the new Trans-Proteomic Pipeline cloud service components, compare the relative performance and costs of various Elastic Compute Cloud service instance types, and present on-line tutorials that enable users to learn how to deploy cloud computing technology rapidly with the Trans-Proteomic Pipeline. We provide tools for estimating the necessary computing resources and costs given the scale of a job and demonstrate the use of cloud enabled Trans-Proteomic Pipeline by performing over 1100 tandem mass spectrometry files through four proteomic search engines in 9 h and at a very low cost. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
CPTAC | Office of Cancer Clinical Proteomics Research

Cancer.gov

The National Cancer Institute’s Clinical Proteomic Tumor Analysis Consortium (CPTAC) is a national effort to accelerate the understanding of the molecular basis of cancer through the application of large-scale proteome and genome analysis, or proteogenomics.
Applications of Proteomic Technologies to Toxicology

EPA Science Inventory

Proteomics is the large-scale study of gene expression at the protein level. This cutting edge technology has been extensively applied to toxicology research recently. The up-to-date development of proteomics has presented the toxicology community with an unprecedented opportunit...
HiQuant: Rapid Postquantification Analysis of Large-Scale MS-Generated Proteomics Data.

PubMed

Bryan, Kenneth; Jarboui, Mohamed-Ali; Raso, Cinzia; Bernal-Llinares, Manuel; McCann, Brendan; Rauch, Jens; Boldt, Karsten; Lynn, David J

2016-06-03

Recent advances in mass-spectrometry-based proteomics are now facilitating ambitious large-scale investigations of the spatial and temporal dynamics of the proteome; however, the increasing size and complexity of these data sets is overwhelming current downstream computational methods, specifically those that support the postquantification analysis pipeline. Here we present HiQuant, a novel application that enables the design and execution of a postquantification workflow, including common data-processing steps, such as assay normalization and grouping, and experimental replicate quality control and statistical analysis. HiQuant also enables the interpretation of results generated from large-scale data sets by supporting interactive heatmap analysis and also the direct export to Cytoscape and Gephi, two leading network analysis platforms. HiQuant may be run via a user-friendly graphical interface and also supports complete one-touch automation via a command-line mode. We evaluate HiQuant's performance by analyzing a large-scale, complex interactome mapping data set and demonstrate a 200-fold improvement in the execution time over current methods. We also demonstrate HiQuant's general utility by analyzing proteome-wide quantification data generated from both a large-scale public tyrosine kinase siRNA knock-down study and an in-house investigation into the temporal dynamics of the KSR1 and KSR2 interactomes. Download HiQuant, sample data sets, and supporting documentation at http://hiquant.primesdb.eu.
Trans-Proteomic Pipeline, a standardized data processing pipeline for large-scale reproducible proteomics informatics

PubMed Central

Deutsch, Eric W.; Mendoza, Luis; Shteynberg, David; Slagel, Joseph; Sun, Zhi; Moritz, Robert L.

2015-01-01

Democratization of genomics technologies has enabled the rapid determination of genotypes. More recently the democratization of comprehensive proteomics technologies is enabling the determination of the cellular phenotype and the molecular events that define its dynamic state. Core proteomic technologies include mass spectrometry to define protein sequence, protein:protein interactions, and protein post-translational modifications. Key enabling technologies for proteomics are bioinformatic pipelines to identify, quantitate, and summarize these events. The Trans-Proteomics Pipeline (TPP) is a robust open-source standardized data processing pipeline for large-scale reproducible quantitative mass spectrometry proteomics. It supports all major operating systems and instrument vendors via open data formats. Here we provide a review of the overall proteomics workflow supported by the TPP, its major tools, and how it can be used in its various modes from desktop to cloud computing. We describe new features for the TPP, including data visualization functionality. We conclude by describing some common perils that affect the analysis of tandem mass spectrometry datasets, as well as some major upcoming features. PMID:25631240
Trans-Proteomic Pipeline, a standardized data processing pipeline for large-scale reproducible proteomics informatics.

PubMed

Deutsch, Eric W; Mendoza, Luis; Shteynberg, David; Slagel, Joseph; Sun, Zhi; Moritz, Robert L

2015-08-01

Democratization of genomics technologies has enabled the rapid determination of genotypes. More recently the democratization of comprehensive proteomics technologies is enabling the determination of the cellular phenotype and the molecular events that define its dynamic state. Core proteomic technologies include MS to define protein sequence, protein:protein interactions, and protein PTMs. Key enabling technologies for proteomics are bioinformatic pipelines to identify, quantitate, and summarize these events. The Trans-Proteomics Pipeline (TPP) is a robust open-source standardized data processing pipeline for large-scale reproducible quantitative MS proteomics. It supports all major operating systems and instrument vendors via open data formats. Here, we provide a review of the overall proteomics workflow supported by the TPP, its major tools, and how it can be used in its various modes from desktop to cloud computing. We describe new features for the TPP, including data visualization functionality. We conclude by describing some common perils that affect the analysis of MS/MS datasets, as well as some major upcoming features. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
EvoluCode: Evolutionary Barcodes as a Unifying Framework for Multilevel Evolutionary Data.

PubMed

Linard, Benjamin; Nguyen, Ngoc Hoan; Prosdocimi, Francisco; Poch, Olivier; Thompson, Julie D

2012-01-01

Evolutionary systems biology aims to uncover the general trends and principles governing the evolution of biological networks. An essential part of this process is the reconstruction and analysis of the evolutionary histories of these complex, dynamic networks. Unfortunately, the methodologies for representing and exploiting such complex evolutionary histories in large scale studies are currently limited. Here, we propose a new formalism, called EvoluCode (Evolutionary barCode), which allows the integration of different evolutionary parameters (e.g., sequence conservation, orthology, synteny, …) in a unifying format and facilitates the multilevel analysis and visualization of complex evolutionary histories at the genome scale. The advantages of the approach are demonstrated by constructing barcodes representing the evolution of the complete human proteome. Two large-scale studies are then described: (i) the mapping and visualization of the barcodes on the human chromosomes and (ii) automatic clustering of the barcodes to highlight protein subsets sharing similar evolutionary histories and their functional analysis. The methodologies developed here open the way to the efficient application of other data mining and knowledge extraction techniques in evolutionary systems biology studies. A database containing all EvoluCode data is available at: http://lbgi.igbmc.fr/barcodes.

CPTC and KIST Join Efforts to Solve Complex Proteomic Issues | Office of Cancer Clinical Proteomics Research

Cancer.gov

The National Cancer Institute's (NCI) Clinical Proteomic Technologies for Cancer (CPTC) initiative at the National Institutes of Health has entered into a memorandum of understanding (MOU) with the Korea Institute of Science and Technology (KIST). This MOU promotes proteomic technology optimization and standards implementation in large-scale international programs.
Analyzing large-scale proteomics projects with latent semantic indexing.

PubMed

Klie, Sebastian; Martens, Lennart; Vizcaíno, Juan Antonio; Côté, Richard; Jones, Phil; Apweiler, Rolf; Hinneburg, Alexander; Hermjakob, Henning

2008-01-01

Since the advent of public data repositories for proteomics data, readily accessible results from high-throughput experiments have been accumulating steadily. Several large-scale projects in particular have contributed substantially to the amount of identifications available to the community. Despite the considerable body of information amassed, very few successful analyses have been performed and published on this data, leveling off the ultimate value of these projects far below their potential. A prominent reason published proteomics data is seldom reanalyzed lies in the heterogeneous nature of the original sample collection and the subsequent data recording and processing. To illustrate that at least part of this heterogeneity can be compensated for, we here apply a latent semantic analysis to the data contributed by the Human Proteome Organization's Plasma Proteome Project (HUPO PPP). Interestingly, despite the broad spectrum of instruments and methodologies applied in the HUPO PPP, our analysis reveals several obvious patterns that can be used to formulate concrete recommendations for optimizing proteomics project planning as well as the choice of technologies used in future experiments. It is clear from these results that the analysis of large bodies of publicly available proteomics data by noise-tolerant algorithms such as the latent semantic analysis holds great promise and is currently underexploited.
Intact mass detection, interpretation, and visualization to automate Top-Down proteomics on a large scale

PubMed Central

Durbin, Kenneth R.; Tran, John C.; Zamdborg, Leonid; Sweet, Steve M. M.; Catherman, Adam D.; Lee, Ji Eun; Li, Mingxi; Kellie, John F.; Kelleher, Neil L.

2011-01-01

Applying high-throughput Top-Down MS to an entire proteome requires a yet-to-be-established model for data processing. Since Top-Down is becoming possible on a large scale, we report our latest software pipeline dedicated to capturing the full value of intact protein data in automated fashion. For intact mass detection, we combine algorithms for processing MS1 data from both isotopically resolved (FT) and charge-state resolved (ion trap) LC-MS data, which are then linked to their fragment ions for database searching using ProSight. Automated determination of human keratin and tubulin isoforms is one result. Optimized for the intricacies of whole proteins, new software modules visualize proteome-scale data based on the LC retention time and intensity of intact masses and enable selective detection of PTMs to automatically screen for acetylation, phosphorylation, and methylation. Software functionality was demonstrated using comparative LC-MS data from yeast strains in addition to human cells undergoing chemical stress. We further these advances as a key aspect of realizing Top-Down MS on a proteomic scale. PMID:20848673
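
As a rough illustration of the intact mass step described in this abstract (not the authors' pipeline), the sketch below recovers a neutral mass from an observed m/z and an assigned charge state, then checks whether the difference between two intact masses matches one of the three PTMs named above. The function names and the 0.02 Da tolerance are assumptions made for this example.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton

# Monoisotopic mass shifts (Da) for the three PTMs named in the abstract.
PTM_SHIFTS = {
    "acetylation": 42.010565,
    "phosphorylation": 79.966331,
    "methylation": 14.015650,
}


def neutral_mass(mz, charge):
    """Neutral mass of a positively charged ion [M + zH]^z+."""
    return charge * (mz - PROTON_MASS)


def match_ptm(mass_a, mass_b, tol=0.02):
    """Return the PTM names whose shift explains mass_b - mass_a within tol (Da)."""
    delta = mass_b - mass_a
    return [name for name, shift in PTM_SHIFTS.items()
            if abs(delta - shift) <= tol]
```

A call such as `match_ptm(neutral_mass(mz1, z1), neutral_mass(mz2, z2))` would flag a candidate modified form; real intact-mass data would also need isotope handling, which this sketch leaves out.
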
Proteomics wants cRacker: automated standardized data analysis of LC-MS derived proteomic data.

PubMed

Zauber, Henrik; Schulze, Waltraud X

2012-11-02

The large-scale analysis of thousands of proteins under various experimental conditions or in mutant lines has gained more and more importance in hypothesis-driven scientific research and systems biology in the past years. Quantitative analysis by large scale proteomics using modern mass spectrometry usually results in long lists of peptide ion intensities. The main interest for most researchers, however, is to draw conclusions on the protein level. Postprocessing and combining peptide intensities of a proteomic data set requires expert knowledge, and the often repetitive and standardized manual calculations can be time-consuming. The analysis of complex samples can result in very large data sets (lists with several 1000s to 100,000 entries of different peptides) that cannot easily be analyzed using standard spreadsheet programs. To improve speed and consistency of the data analysis of LC-MS derived proteomic data, we developed cRacker. cRacker is an R-based program for automated downstream proteomic data analysis including data normalization strategies for metabolic labeling and label free quantitation. In addition, cRacker includes basic statistical analysis, such as clustering of data, or ANOVA and t tests for comparison between treatments. Results are presented in editable graphic formats and in list files.
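
The step from peptide ion intensities to protein-level values can be sketched as follows. This is not cRacker's code (cRacker is an R program, and its normalization options are not detailed in the abstract); it is a minimal Python sketch assuming one simple scheme: each peptide intensity is expressed as a fraction of its sample's total intensity, and the fractions are summed per protein. The input layout and the function name are hypothetical.

```python
from collections import defaultdict


def protein_intensities(rows):
    """Aggregate peptide intensities to protein level.

    rows: iterable of (protein, peptide, sample, intensity) tuples
          (a hypothetical input layout chosen for this sketch).
    Returns {protein: {sample: summed fraction of the sample's total intensity}}.
    """
    rows = list(rows)

    # Total intensity per sample, used to normalize away loading differences.
    totals = defaultdict(float)
    for _protein, _peptide, sample, intensity in rows:
        totals[sample] += intensity

    # Sum the normalized peptide intensities for each protein and sample.
    result = defaultdict(lambda: defaultdict(float))
    for protein, _peptide, sample, intensity in rows:
        if totals[sample] > 0:
            result[protein][sample] += intensity / totals[sample]
    return {protein: dict(samples) for protein, samples in result.items()}
```

Summing fractions is only one of several possible choices; averaging or taking the median of peptide values per protein would fit into the same loop.
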
Explore, Visualize, and Analyze Functional Cancer Proteomic Data Using the Cancer Proteome Atlas. | Office of Cancer Genomics

Cancer.gov

Reverse-phase protein arrays (RPPA) represent a powerful functional proteomic approach to elucidate cancer-related molecular mechanisms and to develop novel cancer therapies. To facilitate community-based investigation of the large-scale protein expression data generated by this platform, we have developed a user-friendly, open-access bioinformatic resource, The Cancer Proteome Atlas (TCPA, http://tcpaportal.org), which contains two separate web applications.
A Review: Proteomics in Retinal Artery Occlusion, Retinal Vein Occlusion, Diabetic Retinopathy and Acquired Macular Disorders.

PubMed

Cehofski, Lasse Jørgensen; Honoré, Bent; Vorum, Henrik

2017-04-28

Retinal artery occlusion (RAO), retinal vein occlusion (RVO), diabetic retinopathy (DR) and age-related macular degeneration (AMD) are frequent ocular diseases with potentially sight-threatening outcomes. In the present review we discuss major findings of proteomic studies of RAO, RVO, DR and AMD, including an overview of ocular proteome changes associated with anti-vascular endothelial growth factor (VEGF) treatments. Despite the severe outcomes of RAO, the proteome of the disease remains largely unstudied. There is also limited knowledge about the proteome of RVO, but proteomic studies suggest that RVO is associated with remodeling of the extracellular matrix and adhesion processes. Proteomic studies of DR have resulted in the identification of potential therapeutic targets such as carbonic anhydrase-I. Proliferative diabetic retinopathy is the most intensively studied stage of DR. Proteomic studies have established VEGF, pigment epithelium-derived factor (PEDF) and complement components as key factors associated with AMD. The aim of this review is to highlight the major milestones in proteomics in RAO, RVO, DR and AMD. Through large-scale protein analyses, proteomics is bringing new important insights into these complex pathological conditions.
ApoptoProteomics, an integrated database for analysis of proteomics data obtained from apoptotic cells.

PubMed

Arntzen, Magnus Ø; Thiede, Bernd

2012-02-01

Apoptosis is the most commonly described form of programmed cell death, and dysfunction is implicated in a large number of human diseases. Many quantitative proteome analyses of apoptosis have been performed to gain insight in proteins involved in the process. This resulted in large and complex data sets that are difficult to evaluate. Therefore, we developed the ApoptoProteomics database for storage, browsing, and analysis of the outcome of large scale proteome analyses of apoptosis derived from human, mouse, and rat. The proteomics data of 52 publications were integrated and unified with protein annotations from UniProt-KB, the caspase substrate database homepage (CASBAH), and gene ontology. Currently, more than 2300 records of more than 1500 unique proteins were included, covering a large proportion of the core signaling pathways of apoptosis. Analysis of the data set revealed a high level of agreement between the reported changes in directionality reported in proteomics studies and expected apoptosis-related function and may disclose proteins without a current recognized involvement in apoptosis based on gene ontology. Comparison between induction of apoptosis by the intrinsic and the extrinsic apoptotic signaling pathway revealed slight differences. Furthermore, proteomics has significantly contributed to the field of apoptosis in identifying hundreds of caspase substrates. The database is available at http://apoptoproteomics.uio.no.
ApoptoProteomics, an Integrated Database for Analysis of Proteomics Data Obtained from Apoptotic Cells*

PubMed Central

Arntzen, Magnus Ø.; Thiede, Bernd

2012-01-01

Apoptosis is the most commonly described form of programmed cell death, and dysfunction is implicated in a large number of human diseases. Many quantitative proteome analyses of apoptosis have been performed to gain insight in proteins involved in the process. This resulted in large and complex data sets that are difficult to evaluate. Therefore, we developed the ApoptoProteomics database for storage, browsing, and analysis of the outcome of large scale proteome analyses of apoptosis derived from human, mouse, and rat. The proteomics data of 52 publications were integrated and unified with protein annotations from UniProt-KB, the caspase substrate database homepage (CASBAH), and gene ontology. Currently, more than 2300 records of more than 1500 unique proteins were included, covering a large proportion of the core signaling pathways of apoptosis. Analysis of the data set revealed a high level of agreement between the reported changes in directionality reported in proteomics studies and expected apoptosis-related function and may disclose proteins without a current recognized involvement in apoptosis based on gene ontology. Comparison between induction of apoptosis by the intrinsic and the extrinsic apoptotic signaling pathway revealed slight differences. Furthermore, proteomics has significantly contributed to the field of apoptosis in identifying hundreds of caspase substrates. The database is available at http://apoptoproteomics.uio.no. PMID:22067098
Systems Proteomics for Translational Network Medicine

PubMed Central

Arrell, D. Kent; Terzic, Andre

2012-01-01

Universal principles underlying network science, and their ever-increasing applications in biomedicine, underscore the unprecedented capacity of systems biology based strategies to synthesize and resolve massive high throughput generated datasets. Enabling previously unattainable comprehension of biological complexity, systems approaches have accelerated progress in elucidating disease prediction, progression, and outcome. Applied to the spectrum of states spanning health and disease, network proteomics establishes a collation, integration, and prioritization algorithm to guide mapping and decoding of proteome landscapes from large-scale raw data. Providing unparalleled deconvolution of protein lists into global interactomes, integrative systems proteomics enables objective, multi-modal interpretation at molecular, pathway, and network scales, merging individual molecular components, their plurality of interactions, and functional contributions for systems comprehension. As such, network systems approaches are increasingly exploited for objective interpretation of cardiovascular proteomics studies. Here, we highlight network systems proteomic analysis pipelines for integration and biological interpretation through protein cartography, ontological categorization, pathway and functional enrichment and complex network analysis. PMID:22896016
Affordable proteomics: the two-hybrid systems.

PubMed

Gillespie, Marc

2003-06-01

Numerous proteomic methodologies exist, but most require a heavy investment in expertise and technology. This puts these approaches out of reach for many laboratories and small companies, rarely allowing proteomics to be used as a pilot approach for biomarker or target identification. Two proteomic approaches, 2D gel electrophoresis and the two-hybrid systems, are currently available to most researchers. The two-hybrid systems, though accommodating to large-scale experiments, were originally designed as practical screens, that by comparison to current proteomics tools were small-scale, affordable and technically feasible. The screens rapidly generated data, identifying protein interactions that were previously uncharacterized. The foundation for a two-hybrid proteomic investigation can be purchased as separate kits from a number of companies. The true power of the technique lies not in its affordability, but rather in its portability. The two-hybrid system puts proteomics back into laboratories where the output of the screens can be evaluated by researchers with experience in the particular fields of basic research, cancer biology, toxicology or drug development.
Low Cost, Scalable Proteomics Data Analysis Using Amazon's Cloud Computing Services and Open Source Search Algorithms

PubMed Central

Halligan, Brian D.; Geiger, Joey F.; Vallejos, Andrew K.; Greene, Andrew S.; Twigger, Simon N.

2009-01-01

One of the major difficulties for many laboratories setting up proteomics programs has been obtaining and maintaining the computational infrastructure required for the analysis of the large flow of proteomics data. We describe a system that combines distributed cloud computing and open source software to allow laboratories to set up scalable virtual proteomics analysis clusters without the investment in computational hardware or software licensing fees. Additionally, the pricing structure of distributed computing providers, such as Amazon Web Services, allows laboratories or even individuals to have large-scale computational resources at their disposal at a very low cost per run. We provide detailed step by step instructions on how to implement the virtual proteomics analysis clusters as well as a list of current available preconfigured Amazon machine images containing the OMSSA and X!Tandem search algorithms and sequence databases on the Medical College of Wisconsin Proteomics Center website (http://proteomics.mcw.edu/vipdac). PMID:19358578
Low cost, scalable proteomics data analysis using Amazon's cloud computing services and open source search algorithms.

PubMed

Halligan, Brian D; Geiger, Joey F; Vallejos, Andrew K; Greene, Andrew S; Twigger, Simon N

2009-06-01

One of the major difficulties for many laboratories setting up proteomics programs has been obtaining and maintaining the computational infrastructure required for the analysis of the large flow of proteomics data. We describe a system that combines distributed cloud computing and open source software to allow laboratories to set up scalable virtual proteomics analysis clusters without the investment in computational hardware or software licensing fees. Additionally, the pricing structure of distributed computing providers, such as Amazon Web Services, allows laboratories or even individuals to have large-scale computational resources at their disposal at a very low cost per run. We provide detailed step-by-step instructions on how to implement the virtual proteomics analysis clusters as well as a list of current available preconfigured Amazon machine images containing the OMSSA and X!Tandem search algorithms and sequence databases on the Medical College of Wisconsin Proteomics Center Web site (http://proteomics.mcw.edu/vipdac).
Ribosomal mutations promote the evolution of antibiotic resistance in a multidrug environment.

PubMed

Gomez, James E; Kaufmann-Malaga, Benjamin B; Wivagg, Carl N; Kim, Peter B; Silvis, Melanie R; Renedo, Nikolai; Ioerger, Thomas R; Ahmad, Rushdy; Livny, Jonathan; Fishbein, Skye; Sacchettini, James C; Carr, Steven A; Hung, Deborah T

2017-02-21

Antibiotic resistance arising via chromosomal mutations is typically specific to a particular antibiotic or class of antibiotics. We have identified mutations in genes encoding ribosomal components in Mycobacterium smegmatis that confer resistance to several structurally and mechanistically unrelated classes of antibiotics and enhance survival following heat shock and membrane stress. These mutations affect ribosome assembly and cause large-scale transcriptomic and proteomic changes, including the downregulation of the catalase KatG, an activating enzyme required for isoniazid sensitivity, and upregulation of WhiB7, a transcription factor involved in innate antibiotic resistance. Importantly, while these ribosomal mutations have a fitness cost in antibiotic-free medium, in a multidrug environment they promote the evolution of high-level, target-based resistance. Further, suppressor mutations can then be easily acquired to restore wild-type growth. Thus, ribosomal mutations can serve as stepping-stones in an evolutionary path leading to the emergence of high-level, multidrug resistance.
pGlyco 2.0 enables precision N-glycoproteomics with comprehensive quality control and one-step mass spectrometry for intact glycopeptide identification.

PubMed

Liu, Ming-Qi; Zeng, Wen-Feng; Fang, Pan; Cao, Wei-Qian; Liu, Chao; Yan, Guo-Quan; Zhang, Yang; Peng, Chao; Wu, Jian-Qiang; Zhang, Xiao-Jin; Tu, Hui-Jun; Chi, Hao; Sun, Rui-Xiang; Cao, Yong; Dong, Meng-Qiu; Jiang, Bi-Yun; Huang, Jiang-Ming; Shen, Hua-Li; Wong, Catherine C L; He, Si-Min; Yang, Peng-Yuan

2017-09-05

The precise and large-scale identification of intact glycopeptides is a critical step in glycoproteomics. Owing to the complexity of glycosylation, the current overall throughput, data quality and accessibility of intact glycopeptide identification lag behind those in routine proteomic analyses. Here, we propose a workflow for the precise high-throughput identification of intact N-glycopeptides at the proteome scale using stepped-energy fragmentation and a dedicated search engine. pGlyco 2.0 conducts comprehensive quality control including false discovery rate evaluation at all three levels of matches to glycans, peptides and glycopeptides, improving the current level of accuracy of intact glycopeptide identification. The N-glycoproteome of samples metabolically labeled with 15N/13C was analyzed quantitatively and utilized to validate the glycopeptide identification, which could be used as a novel benchmark pipeline to compare different search engines. Finally, we report a large-scale glycoproteome dataset consisting of 10,009 distinct site-specific N-glycans on 1988 glycosylation sites from 955 glycoproteins in five mouse tissues. Protein glycosylation is a heterogeneous post-translational modification that generates greater proteomic diversity that is difficult to analyze. Here the authors describe pGlyco 2.0, a workflow for the precise one-step identification of intact N-glycopeptides at the proteome scale.
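
The abstract does not say how pGlyco 2.0 computes its false discovery rates, so the following is only a generic target-decoy sketch of the idea, not the tool's method. It assumes each match carries a score and a decoy flag, estimates FDR at a cutoff as decoys divided by targets among matches scoring at or above it, and assumes scores are distinct. The function name and input layout are hypothetical.

```python
def score_threshold_at_fdr(matches, max_fdr=0.01):
    """Find the lowest score cutoff whose estimated FDR stays within max_fdr.

    matches: list of (score, is_decoy) pairs; higher scores are better.
    Returns the cutoff score, or None if no cutoff satisfies max_fdr.
    """
    ordered = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = 0
    decoys = 0
    best = None
    for score, is_decoy in ordered:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among all matches scoring >= this score.
        if targets > 0 and decoys / targets <= max_fdr:
            best = score
    return best
```

In a three-level setting like the one described above, a function of this kind could be applied separately to glycan, peptide and glycopeptide matches, each with its own decoys.
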
Determination of burn patient outcome by large-scale quantitative discovery proteomics

PubMed Central

Finnerty, Celeste C.; Jeschke, Marc G.; Qian, Wei-Jun; Kaushal, Amit; Xiao, Wenzhong; Liu, Tao; Gritsenko, Marina A.; Moore, Ronald J.; Camp, David G.; Moldawer, Lyle L.; Elson, Constance; Schoenfeld, David; Gamelli, Richard; Gibran, Nicole; Klein, Matthew; Arnoldo, Brett; Remick, Daniel; Smith, Richard D.; Davis, Ronald; Tompkins, Ronald G.; Herndon, David N.

2013-01-01

Objective: Emerging proteomics techniques can be used to establish proteomic outcome signatures and to identify candidate biomarkers for survival following traumatic injury. We applied high-resolution liquid chromatography-mass spectrometry (LC-MS) and multiplex cytokine analysis to profile the plasma proteome of survivors and non-survivors of massive burn injury to determine the proteomic survival signature following a major burn injury. Design: Proteomic discovery study. Setting: Five burn hospitals across the U.S. Patients: Thirty-two burn patients (16 non-survivors and 16 survivors), 19–89 years of age, were admitted within 96 h of injury to the participating hospitals with burns covering >20% of the total body surface area and required at least one surgical intervention. Interventions: None. Measurements and Main Results: We found differences in circulating levels of 43 proteins involved in the acute phase response, hepatic signaling, the complement cascade, inflammation, and insulin resistance. Thirty-two of the proteins identified were not previously known to play a role in the response to burn. IL-4, IL-8, GM-CSF, MCP-1, and β2-microglobulin correlated well with survival and may serve as clinical biomarkers. Conclusions: These results demonstrate the utility of these techniques for establishing proteomic survival signatures and for use as a discovery tool to identify candidate biomarkers for survival. This is the first clinical application of a high-throughput, large-scale LC-MS-based quantitative plasma proteomic approach for biomarker discovery for the prediction of patient outcome following burn, trauma or critical illness. PMID:23507713
A Review: Proteomics in Retinal Artery Occlusion, Retinal Vein Occlusion, Diabetic Retinopathy and Acquired Macular Disorders

PubMed Central

Cehofski, Lasse Jørgensen; Honoré, Bent; Vorum, Henrik

2017-01-01

Retinal artery occlusion (RAO), retinal vein occlusion (RVO), diabetic retinopathy (DR) and age-related macular degeneration (AMD) are frequent ocular diseases with potentially sight-threatening outcomes. In the present review we discuss major findings of proteomic studies of RAO, RVO, DR and AMD, including an overview of ocular proteome changes associated with anti-vascular endothelial growth factor (VEGF) treatments. Despite the severe outcomes of RAO, the proteome of the disease remains largely unstudied. There is also limited knowledge about the proteome of RVO, but proteomic studies suggest that RVO is associated with remodeling of the extracellular matrix and adhesion processes. Proteomic studies of DR have resulted in the identification of potential therapeutic targets such as carbonic anhydrase-I. Proliferative diabetic retinopathy is the most intensively studied stage of DR. Proteomic studies have established VEGF, pigment epithelium-derived factor (PEDF) and complement components as key factors associated with AMD. The aim of this review is to highlight the major milestones in proteomics in RAO, RVO, DR and AMD. Through large-scale protein analyses, proteomics is bringing new important insights into these complex pathological conditions. PMID:28452939
Content Is King: Databases Preserve the Collective Information of Science.

PubMed

Yates, John R

2018-04-01

Databases store sequence information experimentally gathered to create resources that further science. In the last 20 years databases have become critical components of fields like proteomics where they provide the basis for large-scale and high-throughput proteomic informatics. Amos Bairoch, winner of the Association of Biomolecular Resource Facilities Frederick Sanger Award, has created some of the important databases proteomic research depends upon for accurate interpretation of data.
Current algorithmic solutions for peptide-based proteomics data generation and identification.

PubMed

Hoopmann, Michael R; Moritz, Robert L

2013-02-01

Peptide-based proteomic data sets are ever increasing in size and complexity. These data sets provide computational challenges when attempting to quickly analyze spectra and obtain correct protein identifications. Database search and de novo algorithms must consider high-resolution MS/MS spectra and alternative fragmentation methods. Protein inference is a tricky problem when analyzing large data sets of degenerate peptide identifications. Combining multiple algorithms for improved peptide identification puts significant strain on computational systems when investigating large data sets. This review highlights some of the recent developments in peptide and protein identification algorithms for analyzing shotgun mass spectrometry data when encountering the aforementioned hurdles. Also explored are the roles that analytical pipelines, public spectral libraries, and cloud computing play in the evolution of peptide-based proteomics. Copyright © 2012 Elsevier Ltd. All rights reserved.
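
Protein inference from degenerate peptides is often framed as a set-cover problem. The sketch below uses a greedy parsimony heuristic purely as an illustration; it is not attributed to the review, and the function name and input layout are assumptions. At each step it picks the protein that explains the most still-unexplained peptides, breaking ties alphabetically.

```python
def greedy_protein_inference(peptide_to_proteins):
    """Pick a small set of proteins that explains all identified peptides.

    peptide_to_proteins: {peptide: set of proteins the peptide maps to}.
    Returns the chosen proteins in the order they were selected.
    """
    # Invert the mapping: protein -> set of peptides it could explain.
    protein_to_peptides = {}
    for peptide, proteins in peptide_to_proteins.items():
        for protein in proteins:
            protein_to_peptides.setdefault(protein, set()).add(peptide)

    # Only peptides that map to at least one protein can be explained.
    unexplained = {pep for pep, prots in peptide_to_proteins.items() if prots}
    chosen = []
    while unexplained:
        # Sorting first makes tie-breaking deterministic (alphabetical).
        best = max(sorted(protein_to_peptides),
                   key=lambda p: len(protein_to_peptides[p] & unexplained))
        chosen.append(best)
        unexplained -= protein_to_peptides[best]
    return chosen
```

A greedy pass of this kind gives a small protein list but not necessarily the minimal one, and it discards proteins supported only by shared peptides, which is one reason the abstract calls inference a tricky problem.
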
Mosaic nature of the mitochondrial proteome: Implications for the origin and evolution of mitochondria.

PubMed

Gray, Michael W

2015-08-18

Comparative studies of the mitochondrial proteome have identified a conserved core of proteins descended from the α-proteobacterial endosymbiont that gave rise to the mitochondrion and was the source of the mitochondrial genome in contemporary eukaryotes. A surprising result of phylogenetic analyses is the relatively small proportion (10-20%) of the mitochondrial proteome displaying a clear α-proteobacterial ancestry. A large fraction of mitochondrial proteins typically has detectable homologs only in other eukaryotes and is presumed to represent proteins that emerged specifically within eukaryotes. A further significant fraction of the mitochondrial proteome consists of proteins with homologs in prokaryotes, but without a robust phylogenetic signal affiliating them with specific prokaryotic lineages. The presumptive evolutionary source of these proteins is quite different in contending models of mitochondrial origin.
A Community Standard Format for the Representation of Protein Affinity Reagents*

PubMed Central

Gloriam, David E.; Orchard, Sandra; Bertinetti, Daniela; Björling, Erik; Bongcam-Rudloff, Erik; Borrebaeck, Carl A. K.; Bourbeillon, Julie; Bradbury, Andrew R. M.; de Daruvar, Antoine; Dübel, Stefan; Frank, Ronald; Gibson, Toby J.; Gold, Larry; Haslam, Niall; Herberg, Friedrich W.; Hiltke, Tara; Hoheisel, Jörg D.; Kerrien, Samuel; Koegl, Manfred; Konthur, Zoltán; Korn, Bernhard; Landegren, Ulf; Montecchi-Palazzi, Luisa; Palcy, Sandrine; Rodriguez, Henry; Schweinsberg, Sonja; Sievert, Volker; Stoevesandt, Oda; Taussig, Michael J.; Ueffing, Marius; Uhlén, Mathias; van der Maarel, Silvère; Wingren, Christer; Woollard, Peter; Sherman, David J.; Hermjakob, Henning

2010-01-01

Protein affinity reagents (PARs), most commonly antibodies, are essential reagents for protein characterization in basic research, biotechnology, and diagnostics as well as the fastest growing class of therapeutics. Large numbers of PARs are available commercially; however, their quality is often uncertain. In addition, currently available PARs cover only a fraction of the human proteome, and their cost is prohibitive for proteome scale applications. This situation has triggered several initiatives involving large scale generation and validation of antibodies, for example the Swedish Human Protein Atlas and the German Antibody Factory. Antibodies targeting specific subproteomes are being pursued by members of Human Proteome Organisation (plasma and liver proteome projects) and the United States National Cancer Institute (cancer-associated antigens). ProteomeBinders, a European consortium, aims to set up a resource of consistently quality-controlled protein-binding reagents for the whole human proteome. An ultimate PAR database resource would allow consumers to visit one on-line warehouse and find all available affinity reagents from different providers together with documentation that facilitates easy comparison of their cost and quality. However, in contrast to, for example, nucleotide databases among which data are synchronized between the major data providers, current PAR producers, quality control centers, and commercial companies all use incompatible formats, hindering data exchange. Here we propose Proteomics Standards Initiative (PSI)-PAR as a global community standard format for the representation and exchange of protein affinity reagent data. The PSI-PAR format is maintained by the Human Proteome Organisation PSI and was developed within the context of ProteomeBinders by building on a mature proteomics standard format, PSI-molecular interaction, which is a widely accepted and established community standard for molecular interaction data. 
Further information and documentation are available on the PSI-PAR web site. PMID:19674966

CPTAC researchers report first large-scale integrated proteomic and genomic analysis of a human cancer | Office of Cancer Clinical Proteomics Research

Cancer.gov

Investigators from the National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium (CPTAC) who comprehensively analyzed 95 human colorectal tumor samples, have determined how gene alterations identified in previous analyses of the same samples are expressed at the protein level. The integration of proteomic and genomic data, or proteogenomics, provides a more comprehensive view of the biological features that drive cancer than genomic analysis alone and may help identify the most important targets for cancer detection and intervention.
PACOM: A Versatile Tool for Integrating, Filtering, Visualizing, and Comparing Multiple Large Mass Spectrometry Proteomics Data Sets.

PubMed

Martínez-Bartolomé, Salvador; Medina-Aunon, J Alberto; López-García, Miguel Ángel; González-Tejedo, Carmen; Prieto, Gorka; Navajas, Rosana; Salazar-Donate, Emilio; Fernández-Costa, Carolina; Yates, John R; Albar, Juan Pablo

2018-04-06

Mass-spectrometry-based proteomics has evolved into a high-throughput technology in which numerous large-scale data sets are generated from diverse analytical platforms. Furthermore, several scientific journals and funding agencies have emphasized the storage of proteomics data in public repositories to facilitate its evaluation, inspection, and reanalysis. As a consequence, public proteomics data repositories are growing rapidly. However, tools are needed to integrate multiple proteomics data sets to compare different experimental features or to perform quality control analysis. Here, we present a new Java stand-alone tool, Proteomics Assay COMparator (PACOM), that is able to import, combine, and simultaneously compare numerous proteomics experiments to check the integrity of the proteomic data as well as verify data quality. With PACOM, the user can detect sources of error that may have been introduced in any step of a proteomics workflow and that influence the final results. Data sets can be easily compared and integrated, and data quality and reproducibility can be visually assessed through a rich set of graphical representations of proteomics data features as well as a wide variety of data filters. Its flexibility and easy-to-use interface make PACOM a unique tool for daily use in a proteomics laboratory. PACOM is available at https://github.com/smdb21/pacom.
Background | Office of Cancer Clinical Proteomics Research

Cancer.gov

The term "proteomics" refers to a large-scale comprehensive study of a specific proteome resulting from its genome, including abundances of proteins, their variations and modifications, and interacting partners and networks in order to understand cellular processes involved. Similarly, “Cancer proteomics” refers to comprehensive analyses of proteins and their derivatives translated from a specific cancer genome using a human biospecimen or a preclinical model (e.g., cultured cell or animal model).
Assembling proteomics data as a prerequisite for the analysis of large scale experiments

PubMed Central

Schmidt, Frank; Schmid, Monika; Thiede, Bernd; Pleißner, Klaus-Peter; Böhme, Martina; Jungblut, Peter R

2009-01-01

Background: Despite the complete determination of the genome sequence of a huge number of bacteria, their proteomes remain relatively poorly defined. Besides new methods to increase the number of identified proteins, new database applications are necessary to store and present results of large-scale proteomics experiments. Results: In the present study, a database concept has been developed to address these issues and to offer complete information via a web interface. In our concept, the Oracle based data repository system SQL-LIMS plays the central role in the proteomics workflow and was applied to the proteomes of Mycobacterium tuberculosis, Helicobacter pylori, Salmonella typhimurium and protein complexes such as 20S proteasome. Technical operations of our proteomics labs were used as the standard for SQL-LIMS template creation. By means of a Java based data parser, post-processed data of different approaches, such as LC/ESI-MS, MALDI-MS and 2-D gel electrophoresis (2-DE), were stored in SQL-LIMS. A minimum set of the proteomics data were transferred in our public 2D-PAGE database using a Java based interface (Data Transfer Tool) with the requirements of the PEDRo standardization. Furthermore, the stored proteomics data were extractable out of SQL-LIMS via XML. Conclusion: The Oracle based data repository system SQL-LIMS played the central role in the proteomics workflow concept. Technical operations of our proteomics labs were used as standards for SQL-LIMS templates. Using a Java based parser, post-processed data of different approaches such as LC/ESI-MS, MALDI-MS and 1-DE and 2-DE were stored in SQL-LIMS. Thus, unique data formats of different instruments were unified and stored in SQL-LIMS tables. Moreover, a unique submission identifier allowed fast access to all experimental data. This was the main advantage compared to multi software solutions, especially if personnel fluctuations are high.
Moreover, large scale and high-throughput experiments must be managed in a comprehensive repository system such as SQL-LIMS, to query results in a systematic manner. On the other hand, these database systems are expensive and require at least one full time administrator and specialized lab manager. Moreover, the high technical dynamics in proteomics may cause problems to adjust new data formats. To summarize, SQL-LIMS met the requirements of proteomics data handling especially in skilled processes such as gel-electrophoresis or mass spectrometry and fulfilled the PSI standardization criteria. The data transfer into a public domain via DTT facilitated validation of proteomics data. Additionally, evaluation of mass spectra by post-processing using MS-Screener improved the reliability of mass analysis and prevented storage of data junk. PMID:19166578
Interlaboratory Study Characterizing a Yeast Performance Standard for Benchmarking LC-MS Platform Performance

PubMed Central

Paulovich, Amanda G.; Billheimer, Dean; Ham, Amy-Joan L.; Vega-Montoto, Lorenzo; Rudnick, Paul A.; Tabb, David L.; Wang, Pei; Blackman, Ronald K.; Bunk, David M.; Cardasis, Helene L.; Clauser, Karl R.; Kinsinger, Christopher R.; Schilling, Birgit; Tegeler, Tony J.; Variyath, Asokan Mulayath; Wang, Mu; Whiteaker, Jeffrey R.; Zimmerman, Lisa J.; Fenyo, David; Carr, Steven A.; Fisher, Susan J.; Gibson, Bradford W.; Mesri, Mehdi; Neubert, Thomas A.; Regnier, Fred E.; Rodriguez, Henry; Spiegelman, Cliff; Stein, Stephen E.; Tempst, Paul; Liebler, Daniel C.

2010-01-01

Optimal performance of LC-MS/MS platforms is critical to generating high quality proteomics data. Although individual laboratories have developed quality control samples, there is no widely available performance standard of biological complexity (and associated reference data sets) for benchmarking of platform performance for analysis of complex biological proteomes across different laboratories in the community. Individual preparations of the yeast Saccharomyces cerevisiae proteome have been used extensively by laboratories in the proteomics community to characterize LC-MS platform performance. The yeast proteome is uniquely attractive as a performance standard because it is the most extensively characterized complex biological proteome and the only one associated with several large scale studies estimating the abundance of all detectable proteins. In this study, we describe a standard operating protocol for large scale production of the yeast performance standard and offer aliquots to the community through the National Institute of Standards and Technology where the yeast proteome is under development as a certified reference material to meet the long term needs of the community. Using a series of metrics that characterize LC-MS performance, we provide a reference data set demonstrating typical performance of commonly used ion trap instrument platforms in expert laboratories; the results provide a basis for laboratories to benchmark their own performance, to improve upon current methods, and to evaluate new technologies. Additionally, we demonstrate how the yeast reference, spiked with human proteins, can be used to benchmark the power of proteomics platforms for detection of differentially expressed proteins at different levels of concentration in a complex matrix, thereby providing a metric to evaluate and minimize preanalytical and analytical variation in comparative proteomics experiments. PMID:19858499
Enrichment and separation techniques for large-scale proteomics analysis of the protein post-translational modifications.

PubMed

Huang, Junfeng; Wang, Fangjun; Ye, Mingliang; Zou, Hanfa

2014-11-06

Comprehensive analysis of the post-translational modifications (PTMs) on proteins at proteome level is crucial to elucidate the regulatory mechanisms of various biological processes. In the past decades, thanks to the development of specific PTM enrichment techniques and efficient multidimensional liquid chromatography (LC) separation strategy, the identification of protein PTMs has made tremendous progress. A huge number of modification sites for some major protein PTMs have been identified by proteomics analysis. In this review, we first introduced the recent progress in PTM enrichment methods for the analysis of several major PTMs including phosphorylation, glycosylation, ubiquitination, acetylation, methylation, and oxidation/reduction status. We then briefly summarized the challenges for PTM enrichment. Finally, we introduced the fractionation and separation techniques for efficient separation of PTM peptides in large-scale PTM analysis. Copyright © 2014 Elsevier B.V. All rights reserved.
Activity-based protein profiling for biochemical pathway discovery in cancer

PubMed Central

Nomura, Daniel K.; Dix, Melissa M.; Cravatt, Benjamin F.

2011-01-01

Large-scale profiling methods have uncovered numerous gene and protein expression changes that correlate with tumorigenesis. However, determining the relevance of these expression changes and which biochemical pathways they affect has been hindered by our incomplete understanding of the proteome and its myriad functions and modes of regulation. Activity-based profiling platforms enable both the discovery of cancer-relevant enzymes and selective pharmacological probes to perturb and characterize these proteins in tumour cells. When integrated with other large-scale profiling methods, activity-based proteomics can provide insight into the metabolic and signalling pathways that support cancer pathogenesis and illuminate new strategies for disease diagnosis and treatment. PMID:20703252
Large-Scale Interaction Profiling of Protein Domains Through Proteomic Peptide-Phage Display Using Custom Peptidomes.

PubMed

Seo, Moon-Hyeong; Nim, Satra; Jeon, Jouhyun; Kim, Philip M

2017-01-01

Protein-protein interactions are essential to cellular functions and signaling pathways. We recently combined bioinformatics and custom oligonucleotide arrays to construct custom-made peptide-phage libraries for screening peptide-protein interactions, an approach we call proteomic peptide-phage display (ProP-PD). In this chapter, we describe protocols for phage display for the identification of natural peptide binders for a given protein. We finally describe deep sequencing for the analysis of the proteomic peptide-phage display.
CPTC and NIST-sponsored Yeast Reference Material Now Publicly Available | Office of Cancer Clinical Proteomics Research

Cancer.gov

The yeast protein extract (RM8323) developed by National Institute of Standards and Technology (NIST) under the auspices of NCI's CPTC initiative is currently available to the public at https://www-s.nist.gov/srmors/view_detail.cfm?srm=8323. The yeast proteome offers researchers a unique biological reference material. RM8323 is the most extensively characterized complex biological proteome and the only one associated with several large-scale studies to estimate protein abundance across a wide concentration range.
Science, marketing and wishful thinking in quantitative proteomics.

PubMed

Hackett, Murray

2008-11-01

In a recent editorial (J. Proteome Res. 2007, 6, 1633) and elsewhere questions have been raised regarding the lack of attention paid to good analytical practice with respect to the reporting of quantitative results in proteomics. Using those comments as a starting point, several issues are discussed that relate to the challenges involved in achieving adequate sampling with MS-based methods in order to generate valid data for large-scale studies. The discussion touches on the relationships that connect sampling depth and the power to detect protein abundance change, conflict of interest, and strategies to overcome bureaucratic obstacles that impede the use of peer-to-peer technologies for transfer and storage of large data files generated in such experiments.
From the genome sequence to the protein inventory of Bacillus subtilis.

PubMed

Becher, Dörte; Büttner, Knut; Moche, Martin; Hessling, Bernd; Hecker, Michael

2011-08-01

Owing to the low number of proteins necessary to render a bacterial cell viable, bacteria are extremely attractive model systems to understand how the genome sequence is translated into actual life processes. One of the most intensively investigated model organisms is Bacillus subtilis. It has attracted world-wide research interest, addressing cell differentiation and adaptation on a molecular scale as well as biotechnological production processes. Meanwhile, we are looking back on more than 25 years of B. subtilis proteomics. A wide range of methods have been developed during this period for the large-scale qualitative and quantitative proteome analysis. Currently, it is possible to identify and quantify more than 50% of the predicted proteome in different cellular subfractions. In this review, we summarize the development of B. subtilis proteomics during the past 25 years. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Proteomics and circadian rhythms: It’s all about signaling!

PubMed Central

Mauvoisin, Daniel; Dayon, Loïc; Gachon, Frédéric; Kussmann, Martin

2014-01-01

Proteomic technologies using mass spectrometry (MS) offer new perspectives in circadian biology, in particular the possibility to study posttranslational modifications (PTMs). To date, only very few studies have been carried out to decipher the rhythmicity of protein expression in mammals with large-scale proteomics. Although signaling has been shown to be of high relevance, comprehensive characterization studies of PTMs are even more rare. This review aims at describing the actual landscape of circadian proteomics and the opportunities and challenges appearing on the horizon. Emphasis was given to signaling processes for their role in metabolic health as regulated by circadian clocks and environmental factors. Those signaling processes are expected to be better and more deeply characterized in the coming years with proteomics. PMID:25103677
ProteinInferencer: Confident protein identification and multiple experiment comparison for large scale proteomics projects.

PubMed

Zhang, Yaoyang; Xu, Tao; Shan, Bing; Hart, Jonathan; Aslanian, Aaron; Han, Xuemei; Zong, Nobel; Li, Haomin; Choi, Howard; Wang, Dong; Acharya, Lipi; Du, Lisa; Vogt, Peter K; Ping, Peipei; Yates, John R

2015-11-03

Shotgun proteomics generates valuable information from large-scale and target protein characterizations, including protein expression, protein quantification, protein post-translational modifications (PTMs), protein localization, and protein-protein interactions. Typically, peptides derived from proteolytic digestion, rather than intact proteins, are analyzed by mass spectrometers because peptides are more readily separated, ionized and fragmented. The amino acid sequences of peptides can be interpreted by matching the observed tandem mass spectra to theoretical spectra derived from a protein sequence database. Identified peptides serve as surrogates for their proteins and are often used to establish what proteins were present in the original mixture and to quantify protein abundance. Two major issues exist for assigning peptides to their originating protein. The first issue is maintaining a desired false discovery rate (FDR) when comparing or combining multiple large datasets generated by shotgun analysis and the second issue is properly assigning peptides to proteins when homologous proteins are present in the database. Herein we demonstrate a new computational tool, ProteinInferencer, which can be used for protein inference with both small- or large-scale data sets to produce a well-controlled protein FDR. In addition, ProteinInferencer introduces confidence scoring for individual proteins, which makes protein identifications evaluable. This article is part of a Special Issue entitled: Computational Proteomics. Copyright © 2015. Published by Elsevier B.V.
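The abstract above does not describe how ProteinInferencer controls the protein FDR, so the following is only a minimal sketch of the widely used target-decoy idea, not the tool's actual algorithm. The function name `fdr_filter`, the record layout (protein ID, score, decoy flag), and the example scores are all assumptions made for illustration.

```python
from typing import List, Tuple

# Each record: (protein_id, score, is_decoy). Higher score = better match.
# This layout is a hypothetical convention for the sketch only.
Record = Tuple[str, float, bool]


def fdr_filter(records: List[Record], max_fdr: float) -> List[str]:
    """Return target protein IDs above the deepest score cutoff at which the
    estimated FDR (decoy count / target count) is still <= max_fdr."""
    ranked = sorted(records, key=lambda r: r[1], reverse=True)
    targets = 0
    decoys = 0
    best_cut = 0  # number of top-ranked records to accept
    for i, (_, _, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimate FDR at this cutoff; skip cutoffs with no targets yet.
        if targets > 0 and decoys / targets <= max_fdr:
            best_cut = i
    return [pid for pid, _, is_decoy in ranked[:best_cut] if not is_decoy]


# Made-up example data: four targets (P*) and three decoys (D*).
example = [
    ("P1", 9.0, False), ("P2", 8.0, False), ("D1", 7.0, True),
    ("P3", 6.0, False), ("P4", 5.0, False), ("D2", 4.0, True),
    ("D3", 3.0, True),
]
accepted = fdr_filter(example, max_fdr=0.25)
print(accepted)
```

The sketch treats each protein independently; handling peptides shared between homologous proteins, the second issue the abstract raises, would need an additional grouping step that is not shown here.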
Mechanism of Arachidonic Acid Accumulation during Aging in Mortierella alpina: A Large-Scale Label-Free Comparative Proteomics Study.

PubMed

Yu, Yadong; Li, Tao; Wu, Na; Ren, Lujing; Jiang, Ling; Ji, Xiaojun; Huang, He

2016-11-30

Arachidonic acid (ARA) is an important polyunsaturated fatty acid having various beneficial physiological effects on the human body. The aging of Mortierella alpina has long been known to significantly improve ARA yield, but the exact mechanism is still elusive. Herein, multiple approaches including large-scale label-free comparative proteomics were employed to systematically investigate the mechanism mentioned above. Upon ultrastructural observation, abnormal mitochondria were found to aggregate around shrunken lipid droplets. Proteomics analysis revealed a total of 171 proteins with significant alterations of expression during aging. Pathway analysis suggested that reactive oxygen species (ROS) were accumulated and stimulated the activation of the malate/pyruvate cycle and isocitrate dehydrogenase, which might provide additional NADPH for ARA synthesis. EC 4.2.1.17-hydratase might be a key player in ARA accumulation during aging. These findings provide a valuable resource for efforts to further improve the ARA content in the oil produced by aging M. alpina.
CPTAC Evaluates Long-Term Reproducibility of Quantitative Proteomics Using Breast Cancer Xenografts | Office of Cancer Clinical Proteomics Research

Cancer.gov

Liquid chromatography tandem-mass spectrometry (LC-MS/MS)-based methods such as isobaric tags for relative and absolute quantification (iTRAQ) and tandem mass tags (TMT) have been shown to provide overall better quantification accuracy and reproducibility over other LC-MS/MS techniques. However, large scale projects like the Clinical Proteomic Tumor Analysis Consortium (CPTAC) require comparisons across many genomically characterized clinical specimens in a single study and often exceed the capability of traditional iTRAQ-based quantification.
Stepwise Evolution of Coral Biomineralization Revealed with Genome-Wide Proteomics and Transcriptomics

PubMed Central

Sawada, Hitoshi; Satoh, Noriyuki

2016-01-01

Despite the importance of stony corals in many research fields related to global issues, such as marine ecology, climate change, paleoclimatology, and metazoan evolution, very little is known about the evolutionary origin of coral skeleton formation. In order to investigate the evolution of coral biomineralization, we have identified skeletal organic matrix proteins (SOMPs) in the skeletal proteome of the scleractinian coral, Acropora digitifera, for which large genomic and transcriptomic datasets are available. Scrupulous gene annotation was conducted based on comparisons of functional domain structures among metazoans. We found that SOMPs include not only coral-specific proteins, but also protein families that are widely conserved among cnidarians and other metazoans. We also identified several conserved transmembrane proteins in the skeletal proteome. Gene expression analysis revealed that expression of these conserved genes continues throughout development. Therefore, these genes are involved not only in skeleton formation, but also in basic cellular functions, such as cell-cell interaction and signaling. On the other hand, genes encoding coral-specific proteins, including extracellular matrix domain-containing proteins, galaxins, and acidic proteins, were prominently expressed in post-settlement stages, indicating their role in skeleton formation. Taken together, the process of coral skeleton formation is hypothesized as: 1) formation of initial extracellular matrix between epithelial cells and substrate, employing pre-existing transmembrane proteins; 2) additional extracellular matrix formation using novel proteins that have emerged by domain shuffling and rapid molecular evolution; and 3) calcification controlled by coral-specific SOMPs. PMID:27253604
Predicting protein-protein interactions on a proteome scale by matching evolutionary and structural similarities at interfaces using PRISM.

PubMed

Tuncbag, Nurcan; Gursoy, Attila; Nussinov, Ruth; Keskin, Ozlem

2011-08-11

Prediction of protein-protein interactions at the structural level on the proteome scale is important because it allows prediction of protein function, helps drug discovery and takes steps toward genome-wide structural systems biology. We provide a protocol (termed PRISM, protein interactions by structural matching) for large-scale prediction of protein-protein interactions and assembly of protein complex structures. The method consists of two components: rigid-body structural comparisons of target proteins to known template protein-protein interfaces and flexible refinement using a docking energy function. The PRISM rationale follows our observation that globally different protein structures can interact via similar architectural motifs. PRISM predicts binding residues by using structural similarity and evolutionary conservation of putative binding residue 'hot spots'. Ultimately, PRISM could help to construct cellular pathways and functional, proteome-scale annotation. PRISM is implemented in Python and runs in a UNIX environment. The program accepts Protein Data Bank-formatted protein structures and is available at http://prism.ccbb.ku.edu.tr/prism_protocol/.
FunRich proteomics software analysis, let the fun begin!

PubMed

Benito-Martin, Alberto; Peinado, Héctor

2015-08-01

Protein MS analysis is the preferred method for unbiased protein identification. It is normally applied to a large number of both small-scale and high-throughput studies. However, user-friendly computational tools for protein analysis are still needed. In this issue, Mathivanan and colleagues (Proteomics 2015, 15, 2597-2601) report the development of FunRich software, an open-access software that facilitates the analysis of proteomics data, providing tools for functional enrichment and interaction network analysis of genes and proteins. FunRich is a reinterpretation of proteomic software, a standalone tool combining ease of use with customizable databases, free access, and graphical representations. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
The Escherichia coli Proteome: Past, Present, and Future Prospects

PubMed Central

Han, Mee-Jung; Lee, Sang Yup

2006-01-01

Proteomics has emerged as an indispensable methodology for large-scale protein analysis in functional genomics. The Escherichia coli proteome has been extensively studied and is well defined in terms of biochemical, biological, and biotechnological data. Even before the entire E. coli proteome was fully elucidated, the largest available data set had been integrated to decipher regulatory circuits and metabolic pathways, providing valuable insights into global cellular physiology and the development of metabolic and cellular engineering strategies. With the recent advent of advanced proteomic technologies, the E. coli proteome has been used for the validation of new technologies and methodologies such as sample prefractionation, protein enrichment, two-dimensional gel electrophoresis, protein detection, mass spectrometry (MS), combinatorial assays with n-dimensional chromatographies and MS, and image analysis software. These important technologies will not only provide a great amount of additional information on the E. coli proteome but also synergistically contribute to other proteomic studies. Here, we review the past development and current status of E. coli proteome research in terms of its biological, biotechnological, and methodological significance and suggest future prospects. PMID:16760308
MAPU: Max-Planck Unified database of organellar, cellular, tissue and body fluid proteomes

PubMed Central

Zhang, Yanling; Zhang, Yong; Adachi, Jun; Olsen, Jesper V.; Shi, Rong; de Souza, Gustavo; Pasini, Erica; Foster, Leonard J.; Macek, Boris; Zougman, Alexandre; Kumar, Chanchal; Wiśniewski, Jacek R.; Jun, Wang; Mann, Matthias

2007-01-01

Mass spectrometry (MS)-based proteomics has become a powerful technology to map the protein composition of organelles, cell types and tissues. In our department, a large-scale effort to map these proteomes is complemented by the Max-Planck Unified (MAPU) proteome database. MAPU contains several body fluid proteomes; including plasma, urine, and cerebrospinal fluid. Cell lines have been mapped to a depth of several thousand proteins and the red blood cell proteome has also been analyzed in depth. The liver proteome is represented with 3200 proteins. By employing high resolution MS and stringent validation criteria, false positive identification rates in MAPU are lower than 1:1000. Thus MAPU datasets can serve as reference proteomes in biomarker discovery. MAPU contains the peptides identifying each protein, measured masses, scores and intensities and is freely available at using a clickable interface of cell or body parts. Proteome data can be queried across proteomes by protein name, accession number, sequence similarity, peptide sequence and annotation information. More than 4500 mouse and 2500 human proteins have already been identified in at least one proteome. Basic annotation information and links to other public databases are provided in MAPU and we plan to add further analysis tools. PMID:17090601

First Large-Scale Proteogenomic Study of Breast Cancer Provides Insight into Potential Therapeutic Targets | Office of Cancer Clinical Proteomics Research

Cancer.gov

News Release: May 25, 2016 — Building on data from The Cancer Genome Atlas (TCGA) project, a multi-institutional team of scientists has completed the first large-scale “proteogenomic” study of breast cancer, linking DNA mutations to protein signaling and helping pinpoint the genes that drive cancer.
Computational Omics Pre-Awardees | Office of Cancer Clinical Proteomics Research

Cancer.gov

The National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium (CPTAC) is pleased to announce the pre-awardees of the Computational Omics solicitation. Working with NVIDIA Foundation's Compute the Cure initiative and Leidos Biomedical Research Inc., the NCI, through this solicitation, seeks to leverage computational efforts to provide tools for the mining and interpretation of large-scale publicly available ‘omics’ datasets.
Comparative evaluation of saliva collection methods for proteome analysis.

PubMed

Golatowski, Claas; Salazar, Manuela Gesell; Dhople, Vishnu Mukund; Hammer, Elke; Kocher, Thomas; Jehmlich, Nico; Völker, Uwe

2013-04-18

Saliva collection devices are widely used for large-scale screening approaches. This study was designed to compare the suitability of three different whole-saliva collection approaches for subsequent proteome analyses. From 9 young healthy volunteers (4 women and 5 men) saliva samples were collected either unstimulated by passive drooling or stimulated using a paraffin gum or Salivette® (cotton swab). Saliva volume, protein concentration and salivary protein patterns were analyzed comparatively. Samples collected using paraffin gum showed the highest saliva volume (4.1±1.5 ml) followed by Salivette® collection (1.8±0.4 ml) and drooling (1.0±0.4 ml). Saliva protein concentrations (average 1145 μg/ml) showed no significant differences between the three sampling schemes. Each collection approach facilitated the identification of about 160 proteins (≥2 distinct peptides) per subject, but collection-method dependent variations in protein composition were observed. Passive drooling, paraffin gum and Salivette® each allows similar coverage of the whole saliva proteome, but the specific proteins observed depended on the collection approach. Thus, only one type of collection device should be used for quantitative proteome analysis in one experiment, especially when performing large-scale cross-sectional or multi-centric studies. Copyright © 2013 Elsevier B.V. All rights reserved.
Alternative Splicing May Not Be the Key to Proteome Complexity.

PubMed

Tress, Michael L; Abascal, Federico; Valencia, Alfonso

2017-02-01

Alternative splicing is commonly believed to be a major source of cellular protein diversity. However, although many thousands of alternatively spliced transcripts are routinely detected in RNA-seq studies, reliable large-scale mass spectrometry-based proteomics analyses identify only a small fraction of annotated alternative isoforms. The clearest finding from proteomics experiments is that most human genes have a single main protein isoform, while those alternative isoforms that are identified tend to be the most biologically plausible: those with the most cross-species conservation and those that do not compromise functional domains. Indeed, most alternative exons do not seem to be under selective pressure, suggesting that a large majority of predicted alternative transcripts may not even be translated into proteins. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
Quantitative proteomics in Giardia duodenalis-Achievements and challenges.

PubMed

Emery, Samantha J; Lacey, Ernest; Haynes, Paul A

2016-08-01

Giardia duodenalis (syn. G. lamblia and G. intestinalis) is a protozoan parasite of vertebrates and a major contributor to the global burden of diarrheal diseases and gastroenteritis. The publication of multiple genome sequences in the G. duodenalis species complex has provided important insights into parasite biology, and made post-genomic technologies, including proteomics, significantly more accessible. The aims of proteomics are to identify and quantify proteins present in a cell, and assign functions to them within the context of dynamic biological systems. In Giardia, proteomics in the post-genomic era has transitioned from reliance on gel-based systems to utilisation of a diverse array of techniques based on bottom-up LC-MS/MS technologies. Together, these have generated crucial foundations for subcellular proteomes, elucidated intra- and inter-assemblage isolate variation, and identified pathways and markers in differentiation, host-parasite interactions and drug resistance. However, in Giardia, proteomics remains an emerging field, with considerable shortcomings evident from the published research. These include a bias towards assemblage A, a lack of emphasis on quantitative analytical techniques, and limited information on post-translational protein modifications. Additionally, there are multiple areas of research for which proteomic data is not available to add value to published transcriptomic data. The challenge of amalgamating data in the systems biology paradigm necessitates the further generation of large, high-quality quantitative datasets to accurately model parasite biology. This review surveys the current proteomic research available for Giardia and evaluates their technical and quantitative approaches, while contextualising their biological insights into parasite pathology, isolate variation and eukaryotic evolution. Finally, we propose areas of priority for the generation of future proteomic data to explore fundamental questions in Giardia, including the analysis of post-translational modifications, and the design of MS-based assays for validation of differentially expressed proteins in large datasets. Copyright © 2016 Elsevier B.V. All rights reserved.
A reference guide for tree analysis and visualization

PubMed Central

2010-01-01

The quantities of data obtained by the new high-throughput technologies, such as microarrays or ChIP-Chip arrays, and the large-scale OMICS-approaches, such as genomics, proteomics and transcriptomics, are becoming vast. Sequencing technologies are becoming cheaper and easier to use and, thus, large-scale evolutionary studies towards the origins of life for all species and their evolution are becoming more and more challenging. Databases holding information about how data are related and how they are hierarchically organized expand rapidly. Clustering analysis is becoming more and more difficult to apply to very large amounts of data since the results of these algorithms cannot be efficiently visualized. Most of the available visualization tools that are able to represent such hierarchies project data in 2D and often lack the necessary user friendliness and interactivity. For example, the current phylogenetic tree visualization tools are not able to display large-scale trees with more than a few thousand nodes in an easy-to-understand way. In this study, we review tools that are currently available for the visualization of biological trees and analysis, mainly developed during the last decade. We describe the uniform and standard computer readable formats to represent tree hierarchies and we comment on the functionality and the limitations of these tools. We also discuss how these tools can be developed further and should become integrated with various data sources. Here we focus on freely available software that offers to the users various tree-representation methodologies for biological data analysis. PMID:20175922
The HUPO proteomics standards initiative--overcoming the fragmentation of proteomics data.

PubMed

Hermjakob, Henning

2006-09-01

Proteomics is a key field of modern biomolecular research, with many small and large scale efforts producing a wealth of proteomics data. However, the vast majority of this data is never exploited to its full potential. Even in publicly funded projects, often the raw data generated in a specific context is analysed, conclusions are drawn and published, but little attention is paid to systematic documentation, archiving, and public access to the data supporting the scientific results. It is often difficult to validate the results stated in a particular publication, and even simple global questions like "In which cellular contexts has my protein of interest been observed?" can currently not be answered with realistic effort, due to a lack of standardised reporting and collection of proteomics data. The Proteomics Standards Initiative (PSI), a work group of the Human Proteome Organisation (HUPO), defines community standards for data representation in proteomics to facilitate systematic data capture, comparison, exchange and verification. In this article we provide an overview of PSI organisational structure, activities, and current results, as well as ways to get involved in the broad-based, open PSI process.
Large scale systematic proteomic quantification from non-metastatic to metastatic colorectal cancer

NASA Astrophysics Data System (ADS)

Yin, Xuefei; Zhang, Yang; Guo, Shaowen; Jin, Hong; Wang, Wenhai; Yang, Pengyuan

2015-07-01

A systematic proteomic quantification of formalin-fixed, paraffin-embedded (FFPE) colorectal cancer tissues from stage I to stage IIIC was performed on a large scale. 1017 proteins were identified, with 338 proteins showing quantitative changes by the label-free method, while 341 proteins were quantified with significant expression changes among 6294 proteins by the iTRAQ method. According to gene ontology (GO) annotation and ingenuity pathway analysis (IPA), we found that expression of migration-related proteins increased and that of binding- and adhesion-related proteins decreased during colorectal cancer development. We focused on integrin alpha 5 (ITA5) of the integrin family, which was consistent with the metastasis-related pathway. The expression level of ITA5 decreased in metastasis tissues and the result has been further verified by Western blotting. Another two cell migration related proteins, vitronectin (VTN) and actin-related protein (ARP3), were also shown to be up-regulated by both mass spectrometry (MS) based quantification results and Western blotting. To date, our results represent one of the largest datasets in colorectal cancer proteomics research. Our strategy reveals a disease-driven omics pattern for metastatic colorectal cancer.
MEERCAT: Multiplexed Efficient Cell Free Expression of Recombinant QconCATs For Large Scale Absolute Proteome Quantification

PubMed Central

Takemori, Nobuaki; Takemori, Ayako; Tanaka, Yuki; Endo, Yaeta; Hurst, Jane L.; Gómez-Baena, Guadalupe; Harman, Victoria M.; Beynon, Robert J.

2017-01-01

A major challenge in proteomics is the absolute accurate quantification of large numbers of proteins. QconCATs, artificial proteins that are concatenations of multiple standard peptides, are well established as an efficient means to generate standards for proteome quantification. Previously, QconCATs have been expressed in bacteria, but we now describe QconCAT expression in a robust, cell-free system. The new expression approach rescues QconCATs that previously were unable to be expressed in bacteria and can reduce the incidence of proteolytic damage to QconCATs. Moreover, it is possible to cosynthesize QconCATs in a highly-multiplexed translation reaction, coexpressing tens or hundreds of QconCATs simultaneously. By obviating bacterial culture and through the gain of high level multiplexing, it is now possible to generate tens of thousands of standard peptides in a matter of weeks, rendering absolute quantification of a complex proteome highly achievable in a reproducible, broadly deployable system. PMID:29055021
Convergent evolution of gene networks by single-gene duplications in higher eukaryotes.

PubMed

Amoutzias, Gregory D; Robertson, David L; Oliver, Stephen G; Bornberg-Bauer, Erich

2004-03-01

By combining phylogenetic, proteomic and structural information, we have elucidated the evolutionary driving forces for the gene-regulatory interaction networks of basic helix-loop-helix transcription factors. We infer that recurrent events of single-gene duplication and domain rearrangement repeatedly gave rise to distinct networks with almost identical hub-based topologies, and multiple activators and repressors. We thus provide the first empirical evidence for scale-free protein networks emerging through single-gene duplications, the dominant importance of molecular modularity in the bottom-up construction of complex biological entities, and the convergent evolution of networks.
Assessment and improvement of statistical tools for comparative proteomics analysis of sparse data sets with few experimental replicates.

PubMed

Schwämmle, Veit; León, Ileana Rodríguez; Jensen, Ole Nørregaard

2013-09-06

Large-scale quantitative analyses of biological systems are often performed with few replicate experiments, leading to multiple nonidentical data sets due to missing values. For example, mass spectrometry driven proteomics experiments are frequently performed with few biological or technical replicates due to sample-scarcity or due to duty-cycle or sensitivity constraints, or limited capacity of the available instrumentation, leading to incomplete results where detection of significant feature changes becomes a challenge. This problem is further exacerbated for the detection of significant changes on the peptide level, for example, in phospho-proteomics experiments. In order to assess the extent of this problem and the implications for large-scale proteome analysis, we investigated and optimized the performance of three statistical approaches by using simulated and experimental data sets with varying numbers of missing values. We applied three tools, including standard t test, moderated t test, also known as limma, and rank products for the detection of significantly changing features in simulated and experimental proteomics data sets with missing values. The rank product method was improved to work with data sets containing missing values. Extensive analysis of simulated and experimental data sets revealed that the performance of the statistical analysis tools depended on simple properties of the data sets. High-confidence results were obtained by using the limma and rank products methods for analyses of triplicate data sets that exhibited more than 1000 features and more than 50% missing values. The maximum number of differentially represented features was identified by using limma and rank products methods in a complementary manner. We therefore recommend combined usage of these methods as a novel and optimal way to detect significantly changing features in these data sets. 
This approach is suitable for large quantitative data sets from stable isotope labeling and mass spectrometry experiments and should be applicable to large data sets of any type. An R script that implements the improved rank products algorithm and the combined analysis is available.
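The abstract above does not say how the rank product method was adapted to missing values, so the following is only a minimal sketch of one plausible approach, not the authors' algorithm or their R script. It assumes that features are ranked within each replicate among the observed values only, that ranks are divided by the number of observed features so replicates with different coverage stay comparable, and that the geometric mean is taken over the replicates in which a feature was actually measured. The function name `rank_products` and the use of `None` for a missing value are illustrative choices.

```python
import math


def rank_products(log_ratios):
    """Rank products for up-regulation, tolerating missing values.

    log_ratios: list of features, each a list of per-replicate values,
    with None marking a missing measurement.
    Returns one value per feature: the geometric mean of its normalised
    ranks over the replicates where it was observed, or None if the
    feature was never observed. Small values suggest consistent
    up-regulation. Ties are broken by feature order (an assumption).
    """
    n_features = len(log_ratios)
    n_reps = len(log_ratios[0]) if log_ratios else 0
    ranks = [[None] * n_reps for _ in range(n_features)]

    for r in range(n_reps):
        # Rank only the features observed in this replicate.
        observed = [i for i in range(n_features) if log_ratios[i][r] is not None]
        observed.sort(key=lambda i: log_ratios[i][r], reverse=True)
        for pos, i in enumerate(observed, start=1):
            # Normalise by the number of observed features.
            ranks[i][r] = pos / len(observed)

    result = []
    for i in range(n_features):
        vals = [x for x in ranks[i] if x is not None]
        if not vals:
            result.append(None)
        else:
            # Geometric mean over the available replicates only.
            result.append(math.exp(sum(math.log(v) for v in vals) / len(vals)))
    return result
```

In practice the significance of a rank product is usually assessed by permutation, which this sketch leaves out.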
Dentistry proteomics: from laboratory development to clinical practice.

PubMed

Rezende, Taia M B; Lima, Stella M F; Petriz, Bernardo A; Silva, Osmar N; Freire, Mirna S; Franco, Octávio L

2013-12-01

Despite all the dental information acquired over centuries and the importance of proteome research, the cross-link between these two areas only emerged around the mid-nineties. Proteomic tools can help dentistry in the identification of risk factors, early diagnosis, prevention, and systematic control that will promote the evolution of treatment in all dentistry specialties. This review mainly focuses on the evolution of dentistry in different specialties based on proteomic research and how these tools can improve knowledge in dentistry. The subjects covered are an overview of proteomics in dentistry, specific information on different fields in dentistry (dental structure, restorative dentistry, endodontics, periodontics, oral pathology, oral surgery, and orthodontics) and future directions. There are many new proteomic technologies that have never been used in dentistry studies and some dentistry areas that have never been explored by proteomic tools. It is expected that a greater integration of these areas will help to understand what is still unknown in oral health and disease. Copyright © 2013 Wiley Periodicals, Inc.
MAPU: Max-Planck Unified database of organellar, cellular, tissue and body fluid proteomes.

PubMed

Zhang, Yanling; Zhang, Yong; Adachi, Jun; Olsen, Jesper V; Shi, Rong; de Souza, Gustavo; Pasini, Erica; Foster, Leonard J; Macek, Boris; Zougman, Alexandre; Kumar, Chanchal; Wisniewski, Jacek R; Jun, Wang; Mann, Matthias

2007-01-01

Mass spectrometry (MS)-based proteomics has become a powerful technology to map the protein composition of organelles, cell types and tissues. In our department, a large-scale effort to map these proteomes is complemented by the Max-Planck Unified (MAPU) proteome database. MAPU contains several body fluid proteomes, including plasma, urine, and cerebrospinal fluid. Cell lines have been mapped to a depth of several thousand proteins and the red blood cell proteome has also been analyzed in depth. The liver proteome is represented with 3200 proteins. By employing high resolution MS and stringent validation criteria, false positive identification rates in MAPU are lower than 1:1000. Thus MAPU datasets can serve as reference proteomes in biomarker discovery. MAPU contains the peptides identifying each protein, measured masses, scores and intensities and is freely available at http://www.mapuproteome.com using a clickable interface of cell or body parts. Proteome data can be queried across proteomes by protein name, accession number, sequence similarity, peptide sequence and annotation information. More than 4500 mouse and 2500 human proteins have already been identified in at least one proteome. Basic annotation information and links to other public databases are provided in MAPU and we plan to add further analysis tools.
Exploring metazoan evolution through dynamic and holistic changes in protein families and domains

PubMed Central

2012-01-01

Background: Proteins convey the majority of biochemical and cellular activities in organisms. Over the course of evolution, proteins undergo normal sequence mutations as well as large scale mutations involving domain duplication and/or domain shuffling. These events result in the generation of new proteins and protein families. Processes that affect proteome evolution drive species diversity and adaptation. Herein, change over the course of metazoan evolution, as defined by birth/death and duplication/deletion events within protein families and domains, was examined using the proteomes of 9 metazoan and two outgroup species. Results: In studying members of the three major metazoan groups, the vertebrates, arthropods, and nematodes, we found that the number of protein families increased at the majority of lineages over the course of metazoan evolution where the magnitude of these increases was greatest at the lineages leading to mammals. In contrast, the number of protein domains decreased at most lineages and at all terminal lineages. This resulted in a weak correlation between protein family birth and domain birth; however, the correlation between domain birth and domain member duplication was quite strong. These data suggest that domain birth and protein family birth occur via different mechanisms, and that domain shuffling plays a role in the formation of protein families. The ratio of protein family birth to protein domain birth (domain shuffling index) suggests that shuffling had a more demonstrable effect on protein families in nematodes and arthropods than in vertebrates. Through the contrast of high and low domain shuffling indices at the lineages of Trichinella spiralis and Gallus gallus, we propose a link between protein redundancy and evolutionary changes controlled by domain shuffling; however, the speed of adaptation among the different lineages was relatively invariant. 
Evaluating the functions of protein families that appeared or disappeared at the last common ancestors (LCAs) of the three metazoan clades supports a correlation with organism adaptation. Furthermore, bursts of new protein families and domains in the LCAs of metazoans and vertebrates are consistent with whole genome duplications. Conclusion: Metazoan speciation and adaptation were explored by birth/death and duplication/deletion events among protein families and domains. Our results provide insights into protein evolution and its bearing on metazoan evolution. PMID:22862991
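The domain shuffling index defined in the abstract above, the ratio of protein family births to protein domain births at a lineage, can be written out as a short sketch. The function name, the input layout, and the choice to return `None` when a lineage has no domain births are assumptions made for illustration; the abstract gives no counts, so any numbers passed in would be the reader's own.

```python
def domain_shuffling_indices(births):
    """Compute the domain shuffling index per lineage.

    births: dict mapping a lineage name to a tuple
            (protein_family_births, protein_domain_births).
    Returns a dict mapping each lineage to
    family_births / domain_births, or None where the ratio is
    undefined because no domain births were counted.
    """
    indices = {}
    for lineage, (family_births, domain_births) in births.items():
        if domain_births == 0:
            indices[lineage] = None  # ratio undefined, flag it explicitly
        else:
            indices[lineage] = family_births / domain_births
    return indices
```

A higher index means more new protein families per new domain, which the abstract reads as a stronger contribution of shuffling existing domains.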
Proteome Characterization of Leaves in Common Bean

PubMed Central

Robison, Faith M.; Heuberger, Adam L.; Brick, Mark A.; Prenni, Jessica E.

2015-01-01

Dry edible bean (Phaseolus vulgaris L.) is a globally relevant food crop. The bean genome was recently sequenced and annotated allowing for proteomics investigations aimed at characterization of leaf phenotypes important to agriculture. The objective of this study was to utilize a shotgun proteomics approach to characterize the leaf proteome and to identify protein abundance differences between two bean lines with known variation in their physiological resistance to biotic stresses. Overall, 640 proteins were confidently identified. Among these are proteins known to be involved in a variety of molecular functions including oxidoreductase activity, binding peroxidase activity, and hydrolase activity. Twenty nine proteins were found to significantly vary in abundance (p-value < 0.05) between the two bean lines, including proteins associated with biotic stress. To our knowledge, this work represents the first large scale shotgun proteomic analysis of beans and our results lay the groundwork for future studies designed to investigate the molecular mechanisms involved in pathogen resistance. PMID:28248269
Automation, parallelism, and robotics for proteomics.

PubMed

Alterovitz, Gil; Liu, Jonathan; Chow, Jijun; Ramoni, Marco F

2006-07-01

The speed of the human genome project (Lander, E. S., Linton, L. M., Birren, B., Nusbaum, C. et al., Nature 2001, 409, 860-921) was made possible, in part, by developments in automation of sequencing technologies. Before these technologies, sequencing was a laborious, expensive, and personnel-intensive task. Similarly, automation and robotics are changing the field of proteomics today. Proteomics is defined as the effort to understand and characterize proteins in the categories of structure, function and interaction (Englbrecht, C. C., Facius, A., Comb. Chem. High Throughput Screen. 2005, 8, 705-715). As such, this field nicely lends itself to automation technologies since these methods often require large economies of scale in order to achieve cost and time-saving benefits. This article describes some of the technologies and methods being applied in proteomics in order to facilitate automation within the field as well as in linking proteomics-based information with other related research areas.
Large-scale inference of protein tissue origin in gram-positive sepsis plasma using quantitative targeted proteomics

PubMed Central

Malmström, Erik; Kilsgård, Ola; Hauri, Simon; Smeds, Emanuel; Herwald, Heiko; Malmström, Lars; Malmström, Johan

2016-01-01

The plasma proteome is highly dynamic and variable, composed of proteins derived from surrounding tissues and cells. To investigate the complex processes that control the composition of the plasma proteome, we developed a mass spectrometry-based proteomics strategy to infer the origin of proteins detected in murine plasma. The strategy relies on the construction of a comprehensive protein tissue atlas from cells and highly vascularized organs using shotgun mass spectrometry. The protein tissue atlas was transformed to a spectral library for highly reproducible quantification of tissue-specific proteins directly in plasma using SWATH-like data-independent mass spectrometry analysis. We show that the method can determine drastic changes of tissue-specific protein profiles in blood plasma from mouse animal models with sepsis. The strategy can be extended to several other species advancing our understanding of the complex processes that contribute to the plasma proteome dynamics. PMID:26732734
A Method for Label-Free, Differential Top-Down Proteomics.

PubMed

Ntai, Ioanna; Toby, Timothy K; LeDuc, Richard D; Kelleher, Neil L

2016-01-01

Biomarker discovery in translational research has heavily relied on labeled and label-free quantitative bottom-up proteomics. Here, we describe a new approach to biomarker studies that utilizes high-throughput top-down proteomics and is the first to offer whole protein characterization and relative quantitation within the same experiment. Using yeast as a model, we report procedures for a label-free approach to quantify the relative abundance of intact proteins ranging from 0 to 30 kDa in two different states. In this chapter, we describe the integrated methodology for the large-scale profiling and quantitation of the intact proteome by liquid chromatography-mass spectrometry (LC-MS) without the need for metabolic or chemical labeling. This recent advance for quantitative top-down proteomics is best implemented with a robust and highly controlled sample preparation workflow before data acquisition on a high-resolution mass spectrometer, and the application of a hierarchical linear statistical model to account for the multiple levels of variance contained in quantitative proteomic comparisons of samples for basic and clinical research.
Proteomics technique opens new frontiers in mobilome research.

PubMed

Davidson, Andrew D; Matthews, David A; Maringer, Kevin

2017-01-01

A large proportion of the genome of most eukaryotic organisms consists of highly repetitive mobile genetic elements. The sum of these elements is called the "mobilome," which in eukaryotes is made up mostly of transposons. Transposable elements contribute to disease, evolution, and normal physiology by mediating genetic rearrangement, and through the "domestication" of transposon proteins for cellular functions. Although 'omics studies of mobilome genomes and transcriptomes are common, technical challenges have hampered high-throughput global proteomics analyses of transposons. In a recent paper, we overcame these technical hurdles using a technique called "proteomics informed by transcriptomics" (PIT), and thus published the first unbiased global mobilome-derived proteome for any organism (using cell lines derived from the mosquito Aedes aegypti). In this commentary, we describe our methods in more detail, and summarise our major findings. We also use new genome sequencing data to show that, in many cases, the specific genomic element expressing a given protein can be identified using PIT. This proteomic technique therefore represents an important technological advance that will open new avenues of research into the role that proteins derived from transposons and other repetitive and sequence diverse genetic elements, such as endogenous retroviruses, play in health and disease.
Recent advances in stable isotope labeling based techniques for proteome relative quantification.

PubMed

Zhou, Yuan; Shan, Yichu; Zhang, Lihua; Zhang, Yukui

2014-10-24

The large-scale relative quantification of all proteins expressed in biological samples under different states is of great importance for discovering proteins with important biological functions, as well as for screening disease-related biomarkers and drug targets. Therefore, the accurate quantification of proteins at the proteome level has become one of the key issues in protein science. Herein, recent advances in stable isotope labeling based techniques for proteome relative quantification are reviewed, covering metabolic labeling, chemical labeling and enzyme-catalyzed labeling. Furthermore, future research directions in this field are discussed. Copyright © 2014 Elsevier B.V. All rights reserved.

High throughput profile-profile based fold recognition for the entire human proteome.

PubMed

McGuffin, Liam J; Smith, Richard T; Bryson, Kevin; Sørensen, Søren-Aksel; Jones, David T

2006-06-07

In order to maintain the most comprehensive structural annotation databases we must carry out regular updates for each proteome using the latest profile-profile fold recognition methods. The ability to carry out these updates on demand is necessary to keep pace with the regular updates of sequence and structure databases. Providing the highest quality structural models requires the most intensive profile-profile fold recognition methods running with the very latest available sequence databases and fold libraries. However, running these methods on such a regular basis for every sequenced proteome requires large amounts of processing power. In this paper we describe and benchmark the JYDE (Job Yield Distribution Environment) system, which is a meta-scheduler designed to work above cluster schedulers, such as Sun Grid Engine (SGE) or Condor. We demonstrate the ability of JYDE to distribute the load of genomic-scale fold recognition across multiple independent Grid domains. We use the most recent profile-profile version of our mGenTHREADER software in order to annotate the latest version of the Human proteome against the latest sequence and structure databases in as short a time as possible. We show that our JYDE system is able to scale to large numbers of intensive fold recognition jobs running across several independent computer clusters. Using our JYDE system we have been able to annotate 99.9% of the protein sequences within the Human proteome in less than 24 hours, by harnessing over 500 CPUs from 3 independent Grid domains. This study clearly demonstrates the feasibility of carrying out on demand high quality structural annotations for the proteomes of major eukaryotic organisms. Specifically, we have shown that it is now possible to provide complete regular updates of profile-profile based fold recognition models for entire eukaryotic proteomes, through the use of Grid middleware such as JYDE.
Spermatogenesis in mammals: proteomic insights.

PubMed

Chocu, Sophie; Calvel, Pierre; Rolland, Antoine D; Pineau, Charles

2012-08-01

Spermatogenesis is a highly sophisticated process involved in the transmission of genetic heritage. It includes halving ploidy, repackaging of the chromatin for transport, and the equipment of developing spermatids and eventually spermatozoa with the advanced apparatus (e.g., tightly packed mitochondrial sheath in the mid piece, elongation of the tail, reduction of cytoplasmic volume) to elicit motility once they reach the epididymis. Mammalian spermatogenesis is divided into three phases. In the first the primitive germ cells or spermatogonia undergo a series of mitotic divisions. In the second the spermatocytes undergo two consecutive divisions in meiosis to produce haploid spermatids. In the third the spermatids differentiate into spermatozoa in a process called spermiogenesis. Paracrine, autocrine, juxtacrine, and endocrine pathways all contribute to the regulation of the process. The array of structural elements and chemical factors modulating somatic and germ cell activity is such that the network linking the various cellular activities during spermatogenesis is unimaginably complex. Over the past two decades, advances in genomics have greatly improved our knowledge of spermatogenesis, by identifying numerous genes essential for the development of functional male gametes. Large-scale analyses of testicular function have deepened our insight into normal and pathological spermatogenesis. Progress in genome sequencing and microarray technology has been exploited for genome-wide expression studies, leading to the identification of hundreds of genes differentially expressed within the testis. However, although proteomics has now come of age, the proteomics-based investigation of spermatogenesis remains in its infancy. Here, we review the state-of-the-art of large-scale proteomic analyses of spermatogenesis, from germ cell development during sex determination to spermatogenesis in the adult. 
Indeed, a few laboratories have undertaken differential protein profiling expression studies and/or systematic analyses of testicular proteomes in entire organs or isolated cells from various species. We consider the pros and cons of proteomics for studying the testicular germ cell gene expression program. Finally, we address the use of protein datasets, through integrative genomics (i.e., combining genomics, transcriptomics, and proteomics), bioinformatics, and modelling.
Next-Generation Proteomics and Its Application to Clinical Breast Cancer Research.

PubMed

Mardamshina, Mariya; Geiger, Tamar

2017-10-01

Proteomics technology aims to map the protein landscapes of biological samples, and it can be applied to a variety of samples, including cells, tissues, and body fluids. Because the proteins are the main functional molecules in the cells, their levels reflect much more accurately the cellular phenotype and the regulatory processes within them than gene levels, mutations, and even mRNA levels. With the advancement in the technology, it is possible now to obtain comprehensive views of the biological systems and to study large patient cohorts in a streamlined manner. In this review we discuss the technological advancements in mass spectrometry-based proteomics, which allow analysis of breast cancer tissue samples, leading to the first large-scale breast cancer proteomics studies. Furthermore, we discuss the technological developments in blood-based biomarker discovery, which provide the basis for future development of assays for routine clinical use. Although these are only the first steps in implementation of proteomics into the clinic, extensive collaborative work between these worlds will undoubtedly lead to major discoveries and advances in clinical practice. Copyright © 2017 American Society for Investigative Pathology. Published by Elsevier Inc. All rights reserved.
The online Tabloid Proteome: an annotated database of protein associations

PubMed Central

Turan, Demet; Tavernier, Jan

2018-01-01

A complete knowledge of the proteome can only be attained by determining the associations between proteins, along with the nature of these associations (e.g. physical contact in protein–protein interactions, participation in complex formation or different roles in the same pathway). Despite extensive efforts in elucidating direct protein interactions, our knowledge of the complete spectrum of protein associations remains limited. We therefore developed a new approach that detects protein associations from identifications obtained after re-processing of large-scale, public mass spectrometry-based proteomics data. Our approach infers protein association based on the co-occurrence of proteins across many different proteomics experiments, and provides information that is almost completely complementary to traditional direct protein interaction studies. We here present a web interface to query and explore the associations derived from this method, called the online Tabloid Proteome. The online Tabloid Proteome also integrates biological knowledge from several existing resources to annotate our derived protein associations. The online Tabloid Proteome is freely available through a user-friendly web interface, which provides intuitive navigation and data exploration options for the user at http://iomics.ugent.be/tabloidproteome. PMID:29040688
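The idea of inferring associations from co-occurrence across experiments can be illustrated with a small sketch. The abstract does not state which score the Tabloid Proteome uses, so the Jaccard index below is a stand-in chosen for simplicity, and the function name, the `min_shared` threshold, and the input layout are all hypothetical.

```python
from itertools import combinations


def cooccurrence_scores(experiments, min_shared=2):
    """Score protein pairs by how often they are identified together.

    experiments: dict mapping an experiment ID to the set of protein
                 IDs identified in that experiment.
    min_shared:  minimum number of experiments a pair must share to be
                 reported (an illustrative cut-off).
    Returns a dict {(protein_a, protein_b): jaccard}, where jaccard is
    |shared experiments| / |experiments containing either protein|.
    """
    # Invert the mapping: protein -> experiments in which it was seen.
    seen_in = {}
    for exp_id, proteins in experiments.items():
        for protein in proteins:
            seen_in.setdefault(protein, set()).add(exp_id)

    scores = {}
    for a, b in combinations(sorted(seen_in), 2):
        shared = seen_in[a] & seen_in[b]
        if len(shared) >= min_shared:
            union = seen_in[a] | seen_in[b]
            scores[(a, b)] = len(shared) / len(union)
    return scores
```

The all-pairs loop is quadratic in the number of proteins, so a resource built on many public datasets would need a more scalable formulation than this one.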
Developmental and Subcellular Organization of Single-Cell C₄ Photosynthesis in Bienertia sinuspersici Determined by Large-Scale Proteomics and cDNA Assembly from 454 DNA Sequencing.

PubMed

Offermann, Sascha; Friso, Giulia; Doroshenk, Kelly A; Sun, Qi; Sharpe, Richard M; Okita, Thomas W; Wimmer, Diana; Edwards, Gerald E; van Wijk, Klaas J

2015-05-01

Kranz C4 species strictly depend on separation of primary and secondary carbon fixation reactions in different cell types. In contrast, the single-cell C4 (SCC4) species Bienertia sinuspersici utilizes intracellular compartmentation including two physiologically and biochemically different chloroplast types; however, information on identity, localization, and induction of proteins required for this SCC4 system is currently very limited. In this study, we determined the distribution of photosynthesis-related proteins and the induction of the C4 system during development by label-free proteomics of subcellular fractions and leaves of different developmental stages. This was enabled by inferring a protein sequence database from 454 sequencing of Bienertia cDNAs. Large-scale proteome rearrangements were observed as C4 photosynthesis developed during leaf maturation. The proteomes of the two chloroplasts are different with differential accumulation of linear and cyclic electron transport components, primary and secondary carbon fixation reactions, and a triose-phosphate shuttle that is shared between the two chloroplast types. This differential protein distribution pattern suggests the presence of a mRNA or protein-sorting mechanism for nuclear-encoded, chloroplast-targeted proteins in SCC4 species. The combined information was used to provide a comprehensive model for NAD-ME type carbon fixation in SCC4 species.
Evolution of Clinical Proteomics and its Role in Medicine | Office of Cancer Clinical Proteomics Research

Cancer.gov

NCI's Office of Cancer Clinical Proteomics Research authored a review of the current state of clinical proteomics in the peer-reviewed Journal of Proteome Research. The review highlights outcomes from the CPTC program and also provides a thorough overview of the different technologies that have pushed the field forward. Additionally, the review provides a vision for moving the field forward through linking advances in genomic and proteomic analysis to develop new, molecularly targeted interventions.
Comparative bioinformatics analyses and profiling of lysosome-related organelle proteomes

NASA Astrophysics Data System (ADS)

Hu, Zhang-Zhi; Valencia, Julio C.; Huang, Hongzhan; Chi, An; Shabanowitz, Jeffrey; Hearing, Vincent J.; Appella, Ettore; Wu, Cathy

2007-01-01

Complete and accurate profiling of cellular organelle proteomes, while challenging, is important for the understanding of detailed cellular processes at the organelle level. Mass spectrometry technologies coupled with bioinformatics analysis provide an effective approach for protein identification and functional interpretation of organelle proteomes. In this study, we have compiled human organelle reference datasets from large-scale proteomic studies and protein databases for seven lysosome-related organelles (LROs), as well as the endoplasmic reticulum and mitochondria, for comparative organelle proteome analysis. Heterogeneous sources of human organelle proteins and rodent homologs are mapped to human UniProtKB protein entries based on ID and/or peptide mappings, followed by functional annotation and categorization using the iProXpress proteomic expression analysis system. Cataloging organelle proteomes allows close examination of both shared and unique proteins among various LROs and reveals their functional relevance. The proteomic comparisons show that LROs are a closely related family of organelles. The shared proteins indicate the dynamic and hybrid nature of LROs, while the unique transmembrane proteins may represent additional candidate marker proteins for LROs. This comparative analysis, therefore, provides a basis for hypothesis formulation and experimental validation of organelle proteins and their functional roles.
Frequently Asked Questions about Genetic and Genomic Science

MedlinePlus

... of the new genetic and genomic techniques and technologies? Proteomics The suffix "-ome" comes from the Greek ... pharmacogenomics is one of the large-scale "omic" technologies, it can examine the entirety of the genome, ...
Cell-free protein synthesis: applications in proteomics and biotechnology.

PubMed

He, Mingyue

2008-01-01

Protein production is one of the key steps in biotechnology and functional proteomics. Expression of proteins in heterologous hosts (such as in E. coli) is generally lengthy and costly. Cell-free protein synthesis is thus emerging as an attractive alternative. In addition to the simplicity and speed for protein production, cell-free expression allows generation of functional proteins that are difficult to produce by in vivo systems. Recent exploitation of cell-free systems enables novel development of technologies for rapid discovery of proteins with desirable properties from very large libraries. This article reviews the recent development in cell-free systems and their application in the large scale protein analysis.
Computer aided manual validation of mass spectrometry-based proteomic data.

PubMed

Curran, Timothy G; Bryson, Bryan D; Reigelhaupt, Michael; Johnson, Hannah; White, Forest M

2013-06-15

Advances in mass spectrometry-based proteomic technologies have increased the speed of analysis and the depth provided by a single analysis. Computational tools to evaluate the accuracy of peptide identifications from these high-throughput analyses have not kept pace with technological advances; currently the most common quality evaluation methods are based on statistical analysis of the likelihood of false positive identifications in large-scale data sets. While helpful, these calculations do not consider the accuracy of each identification, thus creating a precarious situation for biologists relying on the data to inform experimental design. Manual validation is the gold standard approach to confirm accuracy of database identifications, but is extremely time-intensive. To palliate the increasing time required to manually validate large proteomic datasets, we provide computer aided manual validation software (CAMV) to expedite the process. Relevant spectra are collected, catalogued, and pre-labeled, allowing users to efficiently judge the quality of each identification and summarize applicable quantitative information. CAMV significantly reduces the burden associated with manual validation and will hopefully encourage broader adoption of manual validation in mass spectrometry-based proteomics. Copyright © 2013 Elsevier Inc. All rights reserved.
Application of proteomics to ecology and population biology.

PubMed

Karr, T L

2008-02-01

Proteomics is a relatively new scientific discipline that merges protein biochemistry, genome biology and bioinformatics to determine the spatial and temporal expression of proteins in cells, tissues and whole organisms. Proteomics has seen very little application in the fields of behavioral genetics, evolution, ecology and population dynamics, and has only recently been effectively applied to the closely allied fields of molecular evolution and genetics. However, there exists considerable potential for proteomics to have an impact in areas related to functional ecology; this review will introduce the general concepts and methodologies that define the field of proteomics and compare and contrast its advantages and disadvantages with those of other methods. Examples of how proteomics can aid, complement and indeed extend the study of functional ecology will be discussed, including the main tool of ecological studies, population genetics, with an emphasis on metapopulation structure analysis. Because proteomic analyses provide a direct measure of gene expression, they obviate some of the limitations associated with other genomic approaches, such as microarray and EST analyses. Likewise, in conjunction with associated bioinformatics and molecular evolutionary tools, proteomics can provide the foundation of a systems-level integration approach that can enhance ecological studies. It can be envisioned that proteomics will provide important new information on issues specific to metapopulation biology and adaptive processes in nature. A specific example of the application of proteomics to sperm ageing is provided to illustrate the potential utility of the approach.
Ubiquitinated Proteome: Ready for Global?

PubMed Central

Shi, Yi; Xu, Ping; Qin, Jun

2011-01-01

Ubiquitin (Ub) is a small and highly conserved protein that can covalently modify protein substrates. Ubiquitination is one of the major post-translational modifications that regulate a broad spectrum of cellular functions. The advancement of mass spectrometers as well as the development of new affinity purification tools has greatly expedited proteome-wide analysis of several post-translational modifications (e.g. phosphorylation, glycosylation, and acetylation). In contrast, large-scale profiling of lysine ubiquitination remains a challenge. Most recently, new Ub affinity reagents such as Ub remnant antibody and tandem Ub binding domains have been developed, allowing for relatively large-scale detection of several hundreds of lysine ubiquitination events in human cells. Here we review different strategies for the identification of ubiquitination sites and discuss several issues associated with data analysis. We suggest that careful interpretation and orthogonal confirmation of MS spectra are necessary to minimize false positive assignments by automatic searching algorithms. PMID:21339389
Tools for phospho- and glycoproteomics of plasma membranes.

PubMed

Wiśniewski, Jacek R

2011-07-01

Analysis of plasma membrane proteins and their posttranslational modifications is considered important for the identification of disease markers and targets for drug treatment. Due to their insolubility in water, studying plasma membrane proteins by mass spectrometry was difficult for a long time. Recent technological developments in sample preparation together with important improvements in mass spectrometric analysis have facilitated analysis of these proteins and their posttranslational modifications. Now, large scale proteomic analyses allow identification of thousands of membrane proteins from minute amounts of sample. Optimized protocols for affinity enrichment of phosphorylated and glycosylated peptides have set new dimensions in the depth of characterization of these posttranslational modifications of plasma membrane proteins. Here, I summarize recent advances in proteomic technology for the characterization of the cell surface proteins and their modifications. The focus is on approaches allowing large scale mapping rather than on analytical methods suitable for studying individual proteins or non-complex mixtures.
A Scalable Approach for Protein False Discovery Rate Estimation in Large Proteomic Data Sets.

PubMed

Savitski, Mikhail M; Wilhelm, Mathias; Hahne, Hannes; Kuster, Bernhard; Bantscheff, Marcus

2015-09-01

Calculating the number of confidently identified proteins and estimating false discovery rate (FDR) is a challenge when analyzing very large proteomic data sets such as entire human proteomes. Biological and technical heterogeneity in proteomic experiments further add to the challenge and there are strong differences in opinion regarding the conceptual validity of a protein FDR and no consensus regarding the methodology for protein FDR determination. There are also limitations inherent to the widely used classic target-decoy strategy that particularly show when analyzing very large data sets and that lead to a strong over-representation of decoy identifications. In this study, we investigated the merits of the classic, as well as a novel target-decoy-based protein FDR estimation approach, taking advantage of a heterogeneous data collection comprised of ∼19,000 LC-MS/MS runs deposited in ProteomicsDB (https://www.proteomicsdb.org). The "picked" protein FDR approach treats target and decoy sequences of the same protein as a pair rather than as individual entities and chooses either the target or the decoy sequence depending on which receives the highest score. We investigated the performance of this approach in combination with q-value based peptide scoring to normalize sample-, instrument-, and search engine-specific differences. The "picked" target-decoy strategy performed best when protein scoring was based on the best peptide q-value for each protein yielding a stable number of true positive protein identifications over a wide range of q-value thresholds. We show that this simple and unbiased strategy eliminates a conceptual issue in the commonly used "classic" protein FDR approach that causes overprediction of false-positive protein identification in large data sets. 
The approach scales from small to very large data sets without losing performance, consistently increases the number of true-positive protein identifications and is readily implemented in proteomics analysis software. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
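The "picked" strategy described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name, the input dictionaries, the tie-breaking rule and the toy scores are all assumptions made for illustration. Each protein is assumed to carry one target score and one decoy score, with higher meaning better (for example, a transformed best-peptide q-value).

```python
from typing import Dict, List, Tuple

Row = Tuple[str, bool, float, float]  # (protein, is_decoy, score, q_value)


def picked_protein_fdr(target_scores: Dict[str, float],
                       decoy_scores: Dict[str, float]) -> List[Row]:
    """Hypothetical helper: picked target-decoy protein FDR estimation."""
    picked = []
    for protein in set(target_scores) | set(decoy_scores):
        t = target_scores.get(protein)
        d = decoy_scores.get(protein)
        # Treat target and decoy as a pair and keep only the better one.
        # Ties go to the target, which is an arbitrary choice for this sketch.
        if d is None or (t is not None and t >= d):
            picked.append((protein, False, t))
        else:
            picked.append((protein, True, d))

    # Best score first; on equal scores, targets before decoys, then by name.
    picked.sort(key=lambda row: (-row[2], row[1], row[0]))

    # Running FDR estimate: decoys accepted so far / targets accepted so far.
    fdrs = []
    targets = decoys = 0
    for _, is_decoy, _ in picked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at this score threshold or any looser one.
    q_values = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q_values.append(running_min)
    q_values.reverse()

    return [(p, is_decoy, s, q)
            for (p, is_decoy, s), q in zip(picked, q_values)]


# Toy scores, invented purely for illustration.
toy_targets = {"P1": 9.0, "P2": 7.5, "P3": 2.0, "P4": 1.0}
toy_decoys = {"P1": 1.5, "P2": 0.5, "P3": 3.0, "P4": 0.2}
toy_rows = picked_protein_fdr(toy_targets, toy_decoys)
```

In the toy data, the decoy of P3 outscores its target, so P3 enters the ranked list as a decoy and its target is discarded. In the classic strategy, by contrast, both members of every pair would be ranked independently.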
A Scalable Approach for Protein False Discovery Rate Estimation in Large Proteomic Data Sets

PubMed Central

Savitski, Mikhail M.; Wilhelm, Mathias; Hahne, Hannes; Kuster, Bernhard; Bantscheff, Marcus

2015-01-01

Calculating the number of confidently identified proteins and estimating false discovery rate (FDR) is a challenge when analyzing very large proteomic data sets such as entire human proteomes. Biological and technical heterogeneity in proteomic experiments further add to the challenge and there are strong differences in opinion regarding the conceptual validity of a protein FDR and no consensus regarding the methodology for protein FDR determination. There are also limitations inherent to the widely used classic target–decoy strategy that particularly show when analyzing very large data sets and that lead to a strong over-representation of decoy identifications. In this study, we investigated the merits of the classic, as well as a novel target–decoy-based protein FDR estimation approach, taking advantage of a heterogeneous data collection comprised of ∼19,000 LC-MS/MS runs deposited in ProteomicsDB (https://www.proteomicsdb.org). The “picked” protein FDR approach treats target and decoy sequences of the same protein as a pair rather than as individual entities and chooses either the target or the decoy sequence depending on which receives the highest score. We investigated the performance of this approach in combination with q-value based peptide scoring to normalize sample-, instrument-, and search engine-specific differences. The “picked” target–decoy strategy performed best when protein scoring was based on the best peptide q-value for each protein yielding a stable number of true positive protein identifications over a wide range of q-value thresholds. We show that this simple and unbiased strategy eliminates a conceptual issue in the commonly used “classic” protein FDR approach that causes overprediction of false-positive protein identification in large data sets. 
The approach scales from small to very large data sets without losing performance, consistently increases the number of true-positive protein identifications and is readily implemented in proteomics analysis software. PMID:25987413
Proteomics technique opens new frontiers in mobilome research

PubMed Central

Davidson, Andrew D.; Matthews, David A.

2017-01-01

ABSTRACT A large proportion of the genome of most eukaryotic organisms consists of highly repetitive mobile genetic elements. The sum of these elements is called the “mobilome,” which in eukaryotes is made up mostly of transposons. Transposable elements contribute to disease, evolution, and normal physiology by mediating genetic rearrangement, and through the “domestication” of transposon proteins for cellular functions. Although ‘omics studies of mobilome genomes and transcriptomes are common, technical challenges have hampered high-throughput global proteomics analyses of transposons. In a recent paper, we overcame these technical hurdles using a technique called “proteomics informed by transcriptomics” (PIT), and thus published the first unbiased global mobilome-derived proteome for any organism (using cell lines derived from the mosquito Aedes aegypti). In this commentary, we describe our methods in more detail, and summarise our major findings. We also use new genome sequencing data to show that, in many cases, the specific genomic element expressing a given protein can be identified using PIT. This proteomic technique therefore represents an important technological advance that will open new avenues of research into the role that proteins derived from transposons and other repetitive and sequence diverse genetic elements, such as endogenous retroviruses, play in health and disease. PMID:28932623
Current trends in quantitative proteomics - an update.

PubMed

Li, H; Han, J; Pan, J; Liu, T; Parker, C E; Borchers, C H

2017-05-01

Proteins can provide insights into biological processes at the functional level, so they are very promising biomarker candidates. The quantification of proteins in biological samples has been routinely used for the diagnosis of diseases and monitoring the treatment. Although large-scale protein quantification in complex samples is still a challenging task, a great amount of effort has been made to advance the technologies that enable quantitative proteomics. Seven years ago, in 2009, we wrote an article about the current trends in quantitative proteomics. In writing this current paper, we realized that, today, we have an even wider selection of potential tools for quantitative proteomics. These tools include new derivatization reagents, novel sampling formats, new types of analyzers and scanning techniques, and recently developed software to assist in assay development and data analysis. In this review article, we will discuss these innovative methods, and their current and potential applications in proteomics. Copyright © 2017 John Wiley & Sons, Ltd.
A multi-protease, multi-dissociation, bottom-up-to-top-down proteomic view of the Loxosceles intermedia venom

PubMed Central

Trevisan-Silva, Dilza; Bednaski, Aline V.; Fischer, Juliana S.G.; Veiga, Silvio S.; Bandeira, Nuno; Guthals, Adrian; Marchini, Fabricio K.; Leprevost, Felipe V.; Barbosa, Valmir C.; Senff-Ribeiro, Andrea; Carvalho, Paulo C.

2017-01-01

Venoms are a rich source for the discovery of molecules with biotechnological applications, but their analysis is challenging even for state-of-the-art proteomics. Here we report on a large-scale proteomic assessment of the venom of Loxosceles intermedia, the so-called brown spider. Venom was extracted from 200 spiders and fractionated into two aliquots relative to a 10 kDa cutoff mass. Each of these was further fractionated and digested with trypsin (4 h), trypsin (18 h), pepsin (18 h), and chymotrypsin (18 h), then analyzed by MudPIT on an LTQ-Orbitrap XL ETD mass spectrometer fragmenting precursors by CID, HCD, and ETD. Aliquots of undigested samples were also analyzed. Our experimental design allowed us to apply spectral networks, thus enabling us to obtain meta-contig assemblies, and consequently de novo sequencing of practically complete proteins, culminating in a deep proteome assessment of the venom. Data are available via ProteomeXchange, with identifier PXD005523. PMID:28696408
Comparing Simplification Strategies for the Skeletal Muscle Proteome

PubMed Central

Geary, Bethany; Young, Iain S.; Cash, Phillip; Whitfield, Phillip D.; Doherty, Mary K.

2016-01-01

Skeletal muscle is a complex tissue that is dominated by the presence of a few abundant proteins. This wide dynamic range can mask the presence of lower abundance proteins, which can be a confounding factor in large-scale proteomic experiments. In this study, we have investigated a number of pre-fractionation methods, at both the protein and peptide level, for the characterization of the skeletal muscle proteome. The analyses revealed that the use of OFFGEL isoelectric focusing yielded the largest number of protein identifications (>750) compared to alternative gel-based and protein equalization strategies. Further, OFFGEL led to a substantial enrichment of a different sub-population of the proteome. Filter-aided sample preparation (FASP), coupled to peptide-level OFFGEL provided more confidence in the results due to a substantial increase in the number of peptides assigned to each protein. The findings presented here support the use of a multiplexed approach to proteome characterization of skeletal muscle, which has a recognized imbalance in the dynamic range of its protein complement. PMID:28248220
Proteomic Profiling of Mitochondrial Enzymes during Skeletal Muscle Aging.

PubMed

Staunton, Lisa; O'Connell, Kathleen; Ohlendieck, Kay

2011-03-07

Mitochondria are of central importance for energy generation in skeletal muscles. Expression changes or functional alterations in mitochondrial enzymes play a key role during myogenesis, fibre maturation, and various neuromuscular pathologies, as well as natural fibre aging. Mass spectrometry-based proteomics suggests itself as a convenient large-scale and high-throughput approach to catalogue the mitochondrial protein complement and determine global changes during health and disease. This paper gives a brief overview of the relatively new field of mitochondrial proteomics and discusses the findings from recent proteomic surveys of mitochondrial elements in aged skeletal muscles. Changes in the abundance, biochemical activity, subcellular localization, and/or posttranslational modifications in key mitochondrial enzymes might be useful as novel biomarkers of aging. In the long term, this may advance diagnostic procedures, improve the monitoring of disease progression, help in the testing of side effects due to new drug regimes, and enhance our molecular understanding of age-related muscle degeneration.

Mass spectrometry-based biomarker discovery: toward a global proteome index of individuality.

PubMed

Hawkridge, Adam M; Muddiman, David C

2009-01-01

Biomarker discovery and proteomics have become synonymous with mass spectrometry in recent years. Although this conflation is an injustice to the many essential biomolecular techniques widely used in biomarker-discovery platforms, it underscores the power and potential of contemporary mass spectrometry. Numerous novel and powerful technologies have been developed around mass spectrometry, proteomics, and biomarker discovery over the past 20 years to globally study complex proteomes (e.g., plasma). However, very few large-scale longitudinal studies have been carried out using these platforms to establish the analytical variability relative to true biological variability. The purpose of this review is not to cover exhaustively the applications of mass spectrometry to biomarker discovery, but rather to discuss the analytical methods and strategies that have been developed for mass spectrometry-based biomarker-discovery platforms and to place them in the context of the many challenges and opportunities yet to be addressed.
Investigating the Role of Large-Scale Domain Dynamics in Protein-Protein Interactions.

PubMed

Delaforge, Elise; Milles, Sigrid; Huang, Jie-Rong; Bouvier, Denis; Jensen, Malene Ringkjøbing; Sattler, Michael; Hart, Darren J; Blackledge, Martin

2016-01-01

Intrinsically disordered linkers provide multi-domain proteins with degrees of conformational freedom that are often essential for function. These highly dynamic assemblies represent a significant fraction of all proteomes, and deciphering the physical basis of their interactions represents a considerable challenge. Here we describe the difficulties associated with mapping the large-scale domain dynamics and describe two recent examples where solution state methods, in particular NMR spectroscopy, are used to investigate conformational exchange on very different timescales.
Investigating the Role of Large-Scale Domain Dynamics in Protein-Protein Interactions

PubMed Central

Delaforge, Elise; Milles, Sigrid; Huang, Jie-rong; Bouvier, Denis; Jensen, Malene Ringkjøbing; Sattler, Michael; Hart, Darren J.; Blackledge, Martin

2016-01-01

Intrinsically disordered linkers provide multi-domain proteins with degrees of conformational freedom that are often essential for function. These highly dynamic assemblies represent a significant fraction of all proteomes, and deciphering the physical basis of their interactions represents a considerable challenge. Here we describe the difficulties associated with mapping the large-scale domain dynamics and describe two recent examples where solution state methods, in particular NMR spectroscopy, are used to investigate conformational exchange on very different timescales. PMID:27679800
An in-depth snake venom proteopeptidome characterization: Benchmarking Bothrops jararaca.

PubMed

Nicolau, Carolina A; Carvalho, Paulo C; Junqueira-de-Azevedo, Inácio L M; Teixeira-Ferreira, André; Junqueira, Magno; Perales, Jonas; Neves-Ferreira, Ana Gisele C; Valente, Richard H

2017-01-16

A large-scale proteomic approach was devised to advance the understanding of venom composition. Bothrops jararaca venom was fractionated by OFFGEL followed by chromatography, generating peptidic and proteic fractions. The latter was submitted to trypsin digestion. Both fractions were separately analyzed by reversed-phase nanochromatography coupled to high resolution mass spectrometry. This strategy allowed deeper and joint characterizations of the peptidome and proteome (proteopeptidome) of this venom. Our results lead to the identification of 46 protein classes (with several uniquely assigned proteins per class) comprising eight high-abundance bona fide venom components, and 38 additional classes in smaller quantities. This last category included previously described B. jararaca venom proteins, common Elapidae venom constituents (cobra venom factor and three-finger toxin), and proteins typically encountered in lysosomes, cellular membranes and blood plasma. Furthermore, this report is the most complete snake venom peptidome described so far, both in number of peptides and in variety of unique proteins that could have originated them. It is hypothesized that such diversity could enclose cryptides, whose bioactivities would contribute to envenomation in yet undetermined ways. Finally, we propose that the broad range screening of B. jararaca peptidome will facilitate the discovery of bioactive molecules, eventually leading to valuable therapeutical agents. Our proteopeptidomic strategy yielded unprecedented insights into the remarkable diversity of B. jararaca venom composition, both at the peptide and protein levels. These results bring a substantial contribution to the actual pursuit of large-scale protein-level assignment in snake venomics. 
The detection of typical elapidic venom components in a Viperidae venom reinforces our view that the use of this approach (hand-in-hand with transcriptomic and genomic data) for venom proteomic analysis, at the specimen-level, can greatly contribute to venom toxin evolution studies. Furthermore, data were generated in support of a previous hypothesis that venom gland secretory vesicles are specialized forms of lysosomes. Two testable hypotheses also emerge from the results of this work. The first is that a nucleobindin-2-derived protein could lead to prey disorientation during envenomation, aiding in its capture by the snake. The other is that the venom's peptidome might contain a population of cryptides, whose biological activities could lead to the development of new therapeutical agents. Copyright © 2016 Elsevier B.V. All rights reserved.
Linking the proteins--elucidation of proteome-scale networks using mass spectrometry.

PubMed

Pflieger, Delphine; Gonnet, Florence; de la Fuente van Bentem, Sergio; Hirt, Heribert; de la Fuente, Alberto

2011-01-01

Proteomes are intricate. Typically, thousands of proteins interact through physical association and post-translational modifications (PTMs) to give rise to the emergent functions of cells. Understanding these functions requires one to study proteomes as "systems" rather than collections of individual protein molecules. The abstraction of the interacting proteome to "protein networks" has recently gained much attention, as networks are effective representations that lose specific molecular details but provide the ability to see the proteome as a whole. Mostly two aspects of the proteome have been represented by network models: proteome-wide physical protein-protein-binding interactions organized into Protein Interaction Networks (PINs), and proteome-wide PTM relations organized into Protein Signaling Networks (PSNs). Mass spectrometry (MS) techniques have been shown to be essential to reveal both of these aspects on a proteome-wide scale. Techniques such as affinity purification followed by MS have been used to elucidate protein-protein interactions, and MS-based quantitative phosphoproteomics is critical to understand the structure and dynamics of signaling through the proteome. We here review the current state-of-the-art MS-based analytical pipelines for characterizing proteome-scale networks. Copyright © 2010 Wiley Periodicals, Inc.
multiplierz v2.0: A Python-based ecosystem for shared access and analysis of native mass spectrometry data.

PubMed

Alexander, William M; Ficarro, Scott B; Adelmant, Guillaume; Marto, Jarrod A

2017-08-01

The continued evolution of modern mass spectrometry instrumentation and associated methods represents a critical component in efforts to decipher the molecular mechanisms which underlie normal physiology and understand how dysregulation of biological pathways contributes to human disease. The increasing scale of these experiments combined with the technological diversity of mass spectrometers presents several challenges for community-wide data access, analysis, and distribution. Here we detail a redesigned version of multiplierz, our Python software library which leverages our common application programming interface (mzAPI) for analysis and distribution of proteomic data. New features include support for a wider range of native mass spectrometry file types, interfaces to additional database search engines, compatibility with new reporting formats, and high-level tools to perform post-search proteomic analyses. A GUI desktop environment, mzDesktop, provides access to multiplierz functionality through a user friendly interface. multiplierz is available for download from: https://github.com/BlaisProteomics/multiplierz; and mzDesktop is available for download from: https://sourceforge.net/projects/multiplierz/. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Automated image alignment for 2D gel electrophoresis in a high-throughput proteomics pipeline.

PubMed

Dowsey, Andrew W; Dunn, Michael J; Yang, Guang-Zhong

2008-04-01

The quest for high-throughput proteomics has revealed a number of challenges in recent years. Whilst substantial improvements in automated protein separation with liquid chromatography and mass spectrometry (LC/MS), aka 'shotgun' proteomics, have been achieved, large-scale open initiatives such as the Human Proteome Organization (HUPO) Brain Proteome Project have shown that maximal proteome coverage is only possible when LC/MS is complemented by 2D gel electrophoresis (2-DE) studies. Moreover, both separation methods require automated alignment and differential analysis to relieve the bioinformatics bottleneck and so make high-throughput protein biomarker discovery a reality. The purpose of this article is to describe a fully automatic image alignment framework for the integration of 2-DE into a high-throughput differential expression proteomics pipeline. The proposed method is based on robust automated image normalization (RAIN) to circumvent the drawbacks of traditional approaches. These use symbolic representation at the very early stages of the analysis, which introduces persistent errors due to inaccuracies in modelling and alignment. In RAIN, a third-order volume-invariant B-spline model is incorporated into a multi-resolution schema to correct for geometric and expression inhomogeneity at multiple scales. The normalized images can then be compared directly in the image domain for quantitative differential analysis. Through evaluation against an existing state-of-the-art method on real and synthetically warped 2D gels, the proposed analysis framework demonstrates substantial improvements in matching accuracy and differential sensitivity. High-throughput analysis is established through an accelerated GPGPU (general purpose computation on graphics cards) implementation. Supplementary material, software and images used in the validation are available at http://www.proteomegrid.org/rain/.
Systematic Evaluation of the Use of Human Plasma and Serum for Mass-Spectrometry-Based Shotgun Proteomics.

PubMed

Lan, Jiayi; Núñez Galindo, Antonio; Doecke, James; Fowler, Christopher; Martins, Ralph N; Rainey-Smith, Stephanie R; Cominetti, Ornella; Dayon, Loïc

2018-04-06

Over the last two decades, EDTA-plasma has been used as the preferred sample matrix for human blood proteomic profiling. Serum has also been employed widely. Only a few studies have assessed the difference and relevance of the proteome profiles obtained from plasma samples, such as EDTA-plasma or lithium-heparin-plasma, and serum. A more complete evaluation of the use of EDTA-plasma, heparin-plasma, and serum would greatly expand the comprehensiveness of shotgun proteomics of blood samples. In this study, we evaluated the use of heparin-plasma with respect to EDTA-plasma and serum to profile blood proteomes using a scalable automated proteomic pipeline (ASAP²). The use of plasma and serum for mass-spectrometry-based shotgun proteomics was first tested with commercial pooled samples. The proteome coverage consistency and the quantitative performance were compared. Furthermore, protein measurements in EDTA-plasma and heparin-plasma samples were comparatively studied using matched sample pairs from 20 individuals from the Australian Imaging, Biomarkers and Lifestyle (AIBL) Study. We identified 442 proteins in common between EDTA-plasma and heparin-plasma samples. Overall agreement of the relative protein quantification between the sample pairs demonstrated that shotgun proteomics using workflows such as the ASAP² is suitable in analyzing heparin-plasma and that such sample type may be considered in large-scale clinical research studies. Moreover, the partial proteome coverage overlaps (e.g., ∼70%) showed that measures from heparin-plasma could be complementary to those obtained from EDTA-plasma.
Proteome-wide search for functional motifs altered in tumors: Prediction of nuclear export signals inactivated by cancer-related mutations

PubMed Central

Prieto, Gorka; Fullaondo, Asier; Rodríguez, Jose A.

2016-01-01

Large-scale sequencing projects are uncovering a growing number of missense mutations in human tumors. Understanding the phenotypic consequences of these alterations represents a formidable challenge. In silico prediction of functionally relevant amino acid motifs disrupted by cancer mutations could provide insight into the potential impact of a mutation, and guide functional tests. We have previously described Wregex, a tool for the identification of potential functional motifs, such as nuclear export signals (NESs), in proteins. Here, we present an improved version that allows motif prediction to be combined with data from large repositories, such as the Catalogue of Somatic Mutations in Cancer (COSMIC), and to be applied to a whole proteome scale. As an example, we have searched the human proteome for candidate NES motifs that could be altered by cancer-related mutations included in the COSMIC database. A subset of the candidate NESs identified was experimentally tested using an in vivo nuclear export assay. A significant proportion of the selected motifs exhibited nuclear export activity, which was abrogated by the COSMIC mutations. In addition, our search identified a cancer mutation that inactivates the NES of the human deubiquitinase USP21, and leads to the aberrant accumulation of this protein in the nucleus. PMID:27174732
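The idea of combining motif prediction with mutation data, as described in this abstract, can be sketched in a few lines of Python. This is not Wregex code: the regular expression is a deliberately simplified NES-like consensus (four hydrophobic residues separated by short spacers) chosen for illustration, and the function names, the toy sequence and the mutation list are assumptions rather than COSMIC data.

```python
import re
from typing import List, Tuple

# Simplified, illustrative NES-like pattern (an assumption, not the motif
# definition used by Wregex): hydrophobic residues with 2-3, 2-3 and 1
# residue spacers between them.
MOTIF = r"[LIVFM].{2,3}[LIVFM].{2,3}[LIVFM].[LIVFM]"
SCAN = re.compile("(?=(" + MOTIF + "))")  # lookahead allows overlapping hits
FULL = re.compile(MOTIF)


def find_motifs(sequence: str) -> List[Tuple[int, int, str]]:
    """Return (start, end, motif) with 1-based inclusive coordinates."""
    return [(m.start(1) + 1, m.end(1), m.group(1))
            for m in SCAN.finditer(sequence)]


def motifs_hit_by_mutations(
    sequence: str,
    mutations: List[Tuple[int, str, str]],
) -> List[Tuple[int, int, str, str, bool]]:
    """For each motif and each missense mutation (position, ref, alt) inside
    it, report whether the mutated window no longer matches the pattern."""
    hits = []
    for start, end, motif in find_motifs(sequence):
        for pos, ref, alt in mutations:
            # Skip mutations outside the motif or with a mismatched reference.
            if not (start <= pos <= end) or sequence[pos - 1] != ref:
                continue
            offset = pos - start
            mutated = motif[:offset] + alt + motif[offset + 1:]
            # Simplification: only the original window is re-checked.
            disrupted = FULL.fullmatch(mutated) is None
            hits.append((start, end, motif, f"{ref}{pos}{alt}", disrupted))
    return hits


# Toy protein fragment and toy mutations, invented for illustration only.
toy_sequence = "GSNELALKLAGLDINK"
toy_mutations = [(12, "L", "A"), (10, "A", "S"), (2, "S", "T")]
toy_hits = motifs_hit_by_mutations(toy_sequence, toy_mutations)
```

In the toy example, replacing a hydrophobic anchor residue breaks the match, whereas a substitution in a spacer position leaves the candidate motif intact. A sequence-level prediction of this kind only flags candidates. The abstract describes testing a subset of them experimentally with a nuclear export assay.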
Large-scale label-free quantitative proteomics of the pea aphid-Buchnera symbiosis.

PubMed

Poliakov, Anton; Russell, Calum W; Ponnala, Lalit; Hoops, Harold J; Sun, Qi; Douglas, Angela E; van Wijk, Klaas J

2011-06-01

Many insects are nutritionally dependent on symbiotic microorganisms that have tiny genomes and are housed in specialized host cells called bacteriocytes. The obligate symbiosis between the pea aphid Acyrthosiphon pisum and the γ-proteobacterium Buchnera aphidicola (only 584 predicted proteins) is particularly amenable to molecular analysis because the genomes of both partners have been sequenced. To better define the symbiotic relationship between this aphid and Buchnera, we used large-scale, high accuracy tandem mass spectrometry (nanoLC-LTQ-Orbitrap) to identify aphid and Buchnera proteins in the whole aphid body, purified bacteriocytes, isolated Buchnera cells and the residual bacteriocyte fraction. More than 1900 aphid and 400 Buchnera proteins were identified. All enzymes in amino acid metabolism annotated in the Buchnera genome were detected, reflecting the high (68%) coverage of the proteome and supporting the core function of Buchnera in the aphid symbiosis. Transporters mediating the transport of predicted metabolites were present in the bacteriocyte. Label-free spectral counting combined with hierarchical clustering allowed us to define the quantitative distribution of a subset of these proteins across both symbiotic partners, yielding no evidence for the selective transfer of protein among the partners in either direction. This is the first quantitative proteome analysis of bacteriocyte symbiosis, providing a wealth of information about molecular function of both the host cell and bacterial symbiont.
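As a rough illustration of label-free spectral counting, the Python sketch below counts identified spectra per protein in each sample and normalizes by the sample total. It is an assumption-laden simplification, not the authors' pipeline: the function name, the input format, the sample labels and the protein names are hypothetical, and the hierarchical clustering step mentioned in the abstract is omitted.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def normalized_spectral_counts(
    psms: Iterable[Tuple[str, str]],
) -> Dict[str, Dict[str, float]]:
    """psms holds one (sample, protein) pair per identified spectrum.

    Returns sample -> protein -> fraction of that sample's spectra, so that
    samples with different total numbers of spectra can be compared.
    """
    counts: Dict[str, Counter] = {}
    for sample, protein in psms:
        counts.setdefault(sample, Counter())[protein] += 1

    normalized = {}
    for sample, per_protein in counts.items():
        total = sum(per_protein.values())
        normalized[sample] = {p: n / total for p, n in per_protein.items()}
    return normalized


# Toy peptide-spectrum matches, invented for illustration only.
toy_psms = [
    ("whole_body", "protA"),
    ("whole_body", "protA"),
    ("whole_body", "protB"),
    ("bacteriocyte", "protB"),
]
toy_abundance = normalized_spectral_counts(toy_psms)
```

The resulting per-sample profiles are the kind of matrix that a clustering step could then group by similarity across fractions.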
The HUPO PSI's molecular interaction format--a community standard for the representation of protein interaction data.

PubMed

Hermjakob, Henning; Montecchi-Palazzi, Luisa; Bader, Gary; Wojcik, Jérôme; Salwinski, Lukasz; Ceol, Arnaud; Moore, Susan; Orchard, Sandra; Sarkans, Ugis; von Mering, Christian; Roechert, Bernd; Poux, Sylvain; Jung, Eva; Mersch, Henning; Kersey, Paul; Lappe, Michael; Li, Yixue; Zeng, Rong; Rana, Debashis; Nikolski, Macha; Husi, Holger; Brun, Christine; Shanker, K; Grant, Seth G N; Sander, Chris; Bork, Peer; Zhu, Weimin; Pandey, Akhilesh; Brazma, Alvis; Jacq, Bernard; Vidal, Marc; Sherman, David; Legrain, Pierre; Cesareni, Gianni; Xenarios, Ioannis; Eisenberg, David; Steipe, Boris; Hogue, Chris; Apweiler, Rolf

2004-02-01

A major goal of proteomics is the complete description of the protein interaction network underlying cell physiology. A large number of small scale and, more recently, large-scale experiments have contributed to expanding our understanding of the nature of the interaction network. However, the necessary data integration across experiments is currently hampered by the fragmentation of publicly available protein interaction data, which exists in different formats in databases, on authors' websites or sometimes only in print publications. Here, we propose a community standard data model for the representation and exchange of protein interaction data. This data model has been jointly developed by members of the Proteomics Standards Initiative (PSI), a work group of the Human Proteome Organization (HUPO), and is supported by major protein interaction data providers, in particular the Biomolecular Interaction Network Database (BIND), Cellzome (Heidelberg, Germany), the Database of Interacting Proteins (DIP), Dana Farber Cancer Institute (Boston, MA, USA), the Human Protein Reference Database (HPRD), Hybrigenics (Paris, France), the European Bioinformatics Institute's (EMBL-EBI, Hinxton, UK) IntAct, the Molecular Interactions (MINT, Rome, Italy) database, the Protein-Protein Interaction Database (PPID, Edinburgh, UK) and the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING, EMBL, Heidelberg, Germany).
Substrate-Mediated Laser Ablation under Ambient Conditions for Spatially-Resolved Tissue Proteomics

PubMed Central

Fatou, Benoit; Wisztorski, Maxence; Focsa, Cristian; Salzet, Michel; Ziskind, Michael; Fournier, Isabelle

2015-01-01

Numerous applications of ambient Mass Spectrometry (MS) have been demonstrated over the past decade. They promoted the emergence of various micro-sampling techniques such as Laser Ablation/Droplet Capture (LADC). LADC consists in the ablation of analytes from a surface and their subsequent capture in a solvent droplet which can then be analyzed by MS. LADC is thus generally performed in the UV or IR range, using a wavelength at which analytes or the matrix absorb. In this work, we explore the potential of visible range LADC (532 nm) as a micro-sampling technology for large-scale proteomics analyses. We demonstrate that biomolecule analyses using 532 nm LADC are possible, despite the low absorbance of biomolecules at this wavelength. This is due to the preponderance of an indirect substrate-mediated ablation mechanism at low laser energy which contrasts with the conventional direct ablation driven by sample absorption. Using our custom LADC system and taking advantage of this substrate-mediated ablation mechanism, we were able to perform large-scale proteomic analyses of micro-sampled tissue sections and demonstrated the possible identification of proteins with relevant biological functions. Consequently, the 532 nm LADC technique offers a new tool for biological and clinical applications. PMID:26674367
Large Scale Proteomic Data and Network-Based Systems Biology Approaches to Explore the Plant World.

PubMed

Di Silvestre, Dario; Bergamaschi, Andrea; Bellini, Edoardo; Mauri, PierLuigi

2018-06-03

The investigation of plant organisms by means of data-derived systems biology approaches based on network modeling is mainly characterized by genomic data, while the potential of proteomics is largely unexplored. This delay is mainly caused by the paucity of plant genomic/proteomic sequences and annotations which are fundamental to perform mass-spectrometry (MS) data interpretation. However, Next Generation Sequencing (NGS) techniques are contributing to filling this gap and an increasing number of studies are focusing on plant proteome profiling and protein-protein interactions (PPIs) identification. Interesting results were obtained by evaluating the topology of PPI networks in the context of organ-associated biological processes as well as plant-pathogen relationships. These examples foreshadow well the benefits that these approaches may provide to plant research. Thus, in addition to providing an overview of the main-omic technologies recently used on plant organisms, we will focus on studies that rely on concepts of module, hub and shortest path, and how they can contribute to the plant discovery processes. In this scenario, we will also consider gene co-expression networks, and some examples of integration with metabolomic data and genome-wide association studies (GWAS) to select candidate genes will be mentioned.
Highly multiplexed targeted proteomics using precise control of peptide retention time.

PubMed

Gallien, Sebastien; Peterman, Scott; Kiyonami, Reiko; Souady, Jamal; Duriez, Elodie; Schoen, Alan; Domon, Bruno

2012-04-01

Large-scale proteomics applications using SRM analysis on triple quadrupole mass spectrometers present new challenges to LC-MS/MS experimental design. Despite the automation of building large-scale LC-SRM methods, the increased numbers of targeted peptides can compromise the balance between sensitivity and selectivity. To facilitate large target numbers, time-scheduled SRM transition acquisition is performed. Previously published results have demonstrated that incorporation of a well-characterized set of synthetic peptides enabled chromatographic characterization of the elution profile for most endogenous peptides. We have extended this application of peptide trainer kits not only to build SRM methods but also to facilitate real-time elution profile characterization that enables automated adjustment of the scheduled detection windows. Incorporation of dynamic retention time adjustments better facilitates targeted assays lasting several days without the need for constant supervision. This paper provides an overview of how the dynamic retention correction approach identifies and corrects for commonly observed LC variations. This adjustment dramatically improves robustness in targeted discovery experiments as well as routine quantification experiments. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
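The idea of correcting scheduled detection windows from reference peptides can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simple linear drift between expected and observed retention times, and the function names, the peptide name and the window half-width are all hypothetical.

```python
from typing import Dict, List, Tuple


def fit_rt_mapping(expected: List[float], observed: List[float]) -> Tuple[float, float]:
    """Least-squares line: observed ~= slope * expected + intercept.

    Assumption: LC drift is well described by a straight line fitted to
    the reference (trainer) peptides. Real drift may be non-linear.
    """
    n = len(expected)
    if n < 2 or n != len(observed):
        raise ValueError("need at least two paired reference peptides")
    mean_x = sum(expected) / n
    mean_y = sum(observed) / n
    sxx = sum((x - mean_x) ** 2 for x in expected)
    if sxx == 0:
        raise ValueError("reference retention times must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(expected, observed))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def adjust_windows(
    scheduled: Dict[str, float],
    slope: float,
    intercept: float,
    half_width: float,
) -> Dict[str, Tuple[float, float]]:
    """Re-center each target's detection window on its corrected retention time."""
    adjusted = {}
    for peptide, rt in scheduled.items():
        center = slope * rt + intercept
        adjusted[peptide] = (center - half_width, center + half_width)
    return adjusted
```

For example, fitting the mapping on reference peptides that all elute one minute late and then calling `adjust_windows({"PEPTIDEK": 25.0}, slope, intercept, 2.0)` shifts the window for the hypothetical target to be centered near 26 minutes.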
Automated selected reaction monitoring software for accurate label-free protein quantification.

PubMed

Teleman, Johan; Karlsson, Christofer; Waldemarson, Sofia; Hansson, Karin; James, Peter; Malmström, Johan; Levander, Fredrik

2012-07-06

Selected reaction monitoring (SRM) is a mass spectrometry method with documented ability to quantify proteins accurately and reproducibly using labeled reference peptides. However, the use of labeled reference peptides becomes impractical if large numbers of peptides are targeted and when high flexibility is desired when selecting peptides. We have developed a label-free quantitative SRM workflow that relies on a new automated algorithm, Anubis, for accurate peak detection. Anubis efficiently removes interfering signals from contaminating peptides to estimate the true signal of the targeted peptides. We evaluated the algorithm on a published multisite data set and achieved results in line with manual data analysis. In complex peptide mixtures from whole proteome digests of Streptococcus pyogenes we achieved a technical variability across the entire proteome abundance range of 6.5-19.2%, which was considerably below the total variation across biological samples. Our results show that the label-free SRM workflow with automated data analysis is feasible for large-scale biological studies, opening up new possibilities for quantitative proteomics and systems biology.
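The abstract reports technical variability as a percentage but does not say how it was computed. A common choice, assumed here purely for illustration, is the coefficient of variation (sample standard deviation divided by the mean) across replicate measurements of the same peptide. The function name is hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Return the coefficient of variation of replicate intensities, in percent.

    Assumption: variability is expressed as sample standard deviation
    relative to the mean; the source does not confirm this definition.
    """
    if len(values) < 2:
        raise ValueError("need at least two replicate measurements")
    average = mean(values)
    if average == 0:
        raise ValueError("mean intensity is zero; CV is undefined")
    return 100.0 * stdev(values) / average
```

With replicate intensities of 100, 110 and 90, the function returns 10.0, that is, a technical variability of 10%.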
MaRaCluster: A Fragment Rarity Metric for Clustering Fragment Spectra in Shotgun Proteomics.

PubMed

The, Matthew; Käll, Lukas

2016-03-04

Shotgun proteomics experiments generate large amounts of fragment spectra as primary data, normally with high redundancy between and within experiments. Here, we have devised a clustering technique to identify fragment spectra stemming from the same species of peptide. This is a powerful alternative method to traditional search engines for analyzing spectra, specifically useful for larger scale mass spectrometry studies. As an aid in this process, we propose a distance calculation relying on the rarity of experimental fragment peaks, following the intuition that peaks shared by only a few spectra offer more evidence than peaks shared by a large number of spectra. We used this distance calculation and a complete-linkage scheme to cluster data from a recent large-scale mass spectrometry-based study. The clusterings produced by our method have up to 40% more identified peptides for their consensus spectra compared to those produced by the previous state-of-the-art method. We see that our method would advance the construction of spectral libraries as well as serve as a tool for mining large sets of fragment spectra. The source code and Ubuntu binary packages are available at https://github.com/statisticalbiotechnology/maracluster (under an Apache 2.0 license).
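The intuition that rarely shared peaks carry more evidence, combined with complete-linkage clustering, can be sketched as follows. This is not the MaRaCluster metric itself: the weighting (a smoothed log inverse frequency), the weighted-Jaccard distance, the binned integer peaks and all function names are assumptions made for illustration.

```python
import math
from itertools import combinations
from typing import Dict, List, Set


def peak_weights(spectra: List[Set[int]]) -> Dict[int, float]:
    """Weight each binned peak by how rarely it occurs across all spectra."""
    n = len(spectra)
    counts: Dict[int, int] = {}
    for spectrum in spectra:
        for peak in spectrum:
            counts[peak] = counts.get(peak, 0) + 1
    # The +1 keeps weights positive even for peaks present in every spectrum.
    return {peak: math.log((n + 1) / c) for peak, c in counts.items()}


def rarity_distance(a: Set[int], b: Set[int], weights: Dict[int, float]) -> float:
    """1 minus the rarity-weighted Jaccard similarity of two peak sets."""
    union = a | b
    if not union:
        return 1.0
    shared = sum(weights[p] for p in a & b)
    total = sum(weights[p] for p in union)
    return 1.0 - shared / total


def complete_linkage(spectra: List[Set[int]], threshold: float) -> List[List[int]]:
    """Merge clusters while the largest pairwise distance stays within threshold."""
    weights = peak_weights(spectra)
    clusters = [[i] for i in range(len(spectra))]

    def linkage(c1: List[int], c2: List[int]) -> float:
        # Complete linkage: distance between clusters is the maximum pairwise distance.
        return max(rarity_distance(spectra[i], spectra[j], weights) for i in c1 for j in c2)

    while len(clusters) > 1:
        (i, j), best = min(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters
```

The naive loop recomputes all linkages at every merge, which is adequate for a handful of spectra but would not scale to the dataset sizes the abstract describes.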
Understanding protein evolution: from protein physics to Darwinian selection.

PubMed

Zeldovich, Konstantin B; Shakhnovich, Eugene I

2008-01-01

Efforts in whole-genome sequencing and structural proteomics start to provide a global view of the protein universe, the set of existing protein structures and sequences. However, approaches based on the selection of individual sequences have not been entirely successful at the quantitative description of the distribution of structures and sequences in the protein universe because evolutionary pressure acts on the entire organism, rather than on a particular molecule. In parallel to this line of study, studies in population genetics and phenomenological molecular evolution established a mathematical framework to describe the changes in genome sequences in populations of organisms over time. Here, we review both microscopic (physics-based) and macroscopic (organism-level) models of protein-sequence evolution and demonstrate that bridging the two scales provides the most complete description of the protein universe starting from clearly defined, testable, and physiologically relevant assumptions.
ProteoSign: an end-user online differential proteomics statistical analysis platform.

PubMed

Efstathiou, Georgios; Antonakis, Andreas N; Pavlopoulos, Georgios A; Theodosiou, Theodosios; Divanach, Peter; Trudgian, David C; Thomas, Benjamin; Papanikolaou, Nikolas; Aivaliotis, Michalis; Acuto, Oreste; Iliopoulos, Ioannis

2017-07-03

Profiling of proteome dynamics is crucial for understanding cellular behavior in response to intrinsic and extrinsic stimuli and maintenance of homeostasis. Over the last 20 years, mass spectrometry (MS) has emerged as the most powerful tool for large-scale identification and characterization of proteins. Bottom-up proteomics, the most common MS-based proteomics approach, has always been challenging in terms of data management, processing, analysis and visualization, with modern instruments capable of producing several gigabytes of data out of a single experiment. Here, we present ProteoSign, a freely available web application dedicated to allowing users to perform proteomics differential expression/abundance analysis in a user-friendly and self-explanatory way. Although several non-commercial standalone tools have been developed for post-quantification statistical analysis of proteomics data, most of them are not appealing to end users, as they often require very stringent installation of programming environments, third-party software packages and sometimes further scripting or computer programming. To avoid this bottleneck, we have developed a user-friendly software platform accessible via a web interface in order to enable proteomics laboratories and core facilities to statistically analyse quantitative proteomics data sets in a resource-efficient manner. ProteoSign is available at http://bioinformatics.med.uoc.gr/ProteoSign and the source code at https://github.com/yorgodillo/ProteoSign. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Transformative Impact of Proteomics on Cardiovascular Health and Disease: A Scientific Statement From the American Heart Association.

PubMed

Lindsey, Merry L; Mayr, Manuel; Gomes, Aldrin V; Delles, Christian; Arrell, D Kent; Murphy, Anne M; Lange, Richard A; Costello, Catherine E; Jin, Yu-Fang; Laskowitz, Daniel T; Sam, Flora; Terzic, Andre; Van Eyk, Jennifer; Srinivas, Pothur R

2015-09-01

The year 2014 marked the 20th anniversary of the coining of the term proteomics. The purpose of this scientific statement is to summarize advances over this period that have catalyzed our capacity to address the experimental, translational, and clinical implications of proteomics as applied to cardiovascular health and disease and to evaluate the current status of the field. Key successes that have energized the field are delineated; opportunities for proteomics to drive basic science research, facilitate clinical translation, and establish diagnostic and therapeutic healthcare algorithms are discussed; and challenges that remain to be solved before proteomic technologies can be readily translated from scientific discoveries to meaningful advances in cardiovascular care are addressed. Proteomics is the result of disruptive technologies, namely, mass spectrometry and database searching, which drove protein analysis from 1 protein at a time to protein mixture analyses that enable large-scale analysis of proteins and facilitate paradigm shifts in biological concepts that address important clinical questions. Over the past 20 years, the field of proteomics has matured, yet it is still developing rapidly. The scope of this statement will extend beyond the reaches of a typical review article and offer guidance on the use of next-generation proteomics for future scientific discovery in the basic research laboratory and clinical settings. © 2015 American Heart Association, Inc.
A Combination of Histological, Physiological, and Proteomic Approaches Shed Light on Seed Desiccation Tolerance of the Basal Angiosperm Amborella trichopoda.

PubMed

Villegente, Matthieu; Marmey, Philippe; Job, Claudette; Galland, Marc; Cueff, Gwendal; Godin, Béatrice; Rajjou, Loïc; Balliau, Thierry; Zivy, Michel; Fogliani, Bruno; Sarramegna-Burtet, Valérie; Job, Dominique

2017-07-28

Desiccation tolerance allows plant seeds to remain viable in a dry state for years and even centuries. To reveal potential evolutionary processes of this trait, we have conducted a shotgun proteomic analysis of isolated embryo and endosperm from mature seeds of Amborella trichopoda, an understory shrub endemic to New Caledonia that is considered to be the basal extant angiosperm. The present analysis led to the characterization of 415 and 69 proteins from the isolated embryo and endosperm tissues, respectively. The role of these proteins is discussed in terms of protein evolution and physiological properties of the rudimentary, underdeveloped, Amborella embryos, notably considering that the acquisition of desiccation tolerance corresponds to the final developmental stage of mature seeds possessing large embryos.

A Combination of Histological, Physiological, and Proteomic Approaches Shed Light on Seed Desiccation Tolerance of the Basal Angiosperm Amborella trichopoda

PubMed Central

Villegente, Matthieu; Marmey, Philippe; Job, Claudette; Galland, Marc; Cueff, Gwendal; Godin, Béatrice; Rajjou, Loïc; Balliau, Thierry; Zivy, Michel; Sarramegna-Burtet, Valérie; Job, Dominique

2017-01-01

Desiccation tolerance allows plant seeds to remain viable in a dry state for years and even centuries. To reveal potential evolutionary processes of this trait, we have conducted a shotgun proteomic analysis of isolated embryo and endosperm from mature seeds of Amborella trichopoda, an understory shrub endemic to New Caledonia that is considered to be the basal extant angiosperm. The present analysis led to the characterization of 415 and 69 proteins from the isolated embryo and endosperm tissues, respectively. The role of these proteins is discussed in terms of protein evolution and physiological properties of the rudimentary, underdeveloped, Amborella embryos, notably considering that the acquisition of desiccation tolerance corresponds to the final developmental stage of mature seeds possessing large embryos. PMID:28788068
Giant viruses coexisted with the cellular ancestors and represent a distinct supergroup along with superkingdoms Archaea, Bacteria and Eukarya

PubMed Central

2012-01-01

Background: The discovery of giant viruses with genome and physical size comparable to cellular organisms, remnants of protein translation machinery and virus-specific parasites (virophages) have raised intriguing questions about their origin. Evidence advocates for their inclusion into global phylogenomic studies and their consideration as a distinct and ancient form of life. Results: Here we reconstruct phylogenies describing the evolution of proteomes and protein domain structures of cellular organisms and double-stranded DNA viruses with medium-to-very-large proteomes (giant viruses). Trees of proteomes define viruses as a ‘fourth supergroup’ along with superkingdoms Archaea, Bacteria, and Eukarya. Trees of domains indicate they have evolved via massive and primordial reductive evolutionary processes. The distribution of domain structures suggests giant viruses harbor a significant number of protein domains including those with no cellular representation. The genomic and structural diversity embedded in the viral proteomes is comparable to the cellular proteomes of organisms with parasitic lifestyles. Since viral domains are widespread among cellular species, we propose that viruses mediate gene transfer between cells and crucially enhance biodiversity. Conclusions: Results call for a change in the way viruses are perceived. They likely represent a distinct form of life that either predated or coexisted with the last universal common ancestor (LUCA) and constitute a very crucial part of our planet’s biosphere. PMID:22920653
freeQuant: A Mass Spectrometry Label-Free Quantification Software Tool for Complex Proteome Analysis.

PubMed

Deng, Ning; Li, Zhenye; Pan, Chao; Duan, Huilong

2015-01-01

The study of complex proteomes places higher demands on mass spectrometry-based quantification methods. In this paper, we present a mass spectrometry label-free quantification tool for complex proteomes, called freeQuant, which integrates quantification with functional analysis effectively. freeQuant consists of two well-integrated modules: label-free quantification and functional analysis with biomedical knowledge. freeQuant supports label-free quantitative analysis which makes full use of tandem mass spectrometry (MS/MS) spectral count, protein sequence length, shared peptides, and ion intensity. It adopts spectral count for quantitative analysis and builds a new method for shared peptides to accurately evaluate the abundance of isoforms. For proteins with low abundance, MS/MS total ion count coupled with spectral count is included to ensure accurate protein quantification. Furthermore, freeQuant supports large-scale functional annotation of complex proteomes. Mitochondrial proteomes from the mouse heart, the mouse liver, and the human heart were used to evaluate the usability and performance of freeQuant. The evaluation showed that the quantitative algorithms implemented in freeQuant can improve the accuracy of quantification with a better dynamic range.
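How spectral count and protein sequence length can be combined is easiest to see with a standard measure, the normalized spectral abundance factor (NSAF). The sketch below uses NSAF as a stand-in; it is not freeQuant's algorithm, which per the abstract also accounts for shared peptides and ion intensity. Function and protein names are hypothetical.

```python
from typing import Dict


def nsaf(spectral_counts: Dict[str, int], lengths: Dict[str, int]) -> Dict[str, float]:
    """Normalized spectral abundance factor per protein.

    NSAF_i = (SpC_i / L_i) / sum_j (SpC_j / L_j), where SpC is the spectral
    count and L the sequence length. Illustrative stand-in only; shared
    peptides and ion intensity are ignored here.
    """
    saf = {}
    for protein, count in spectral_counts.items():
        length = lengths[protein]
        if length <= 0:
            raise ValueError(f"non-positive length for {protein}")
        saf[protein] = count / length  # longer proteins yield more spectra
    total = sum(saf.values())
    if total == 0:
        raise ValueError("no spectra counted; NSAF is undefined")
    return {protein: value / total for protein, value in saf.items()}
```

Two proteins with the same spectral count of 10 but lengths of 100 and 300 residues receive NSAF values of 0.75 and 0.25, showing how dividing by length avoids overestimating long proteins.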
Brucella proteomes--a review.

PubMed

DelVecchio, Vito G; Wagner, Mary Ann; Eschenbrenner, Michel; Horn, Troy A; Kraycer, Jo Ann; Estock, Frank; Elzer, Phil; Mujer, Cesar V

2002-12-20

The proteomes of selected Brucella spp. have been extensively analyzed by utilizing current proteomic technology involving 2-DE and MALDI-MS. In Brucella melitensis, more than 500 proteins were identified. The rapid and large-scale identification of proteins in this organism was accomplished by using the annotated B. melitensis genome which is now available in the GenBank. Coupled with new and powerful tools for data analysis, differentially expressed proteins were identified and categorized into several classes. A global overview of protein expression patterns emerged, thereby facilitating the simultaneous analysis of different metabolic pathways in B. melitensis. Such a global characterization would not have been possible by using time consuming and traditional biochemical approaches. The era of post-genomic technology offers new and exciting opportunities to understand the complete biology of different Brucella species.
Solid-Phase Extraction Strategies to Surmount Body Fluid Sample Complexity in High-Throughput Mass Spectrometry-Based Proteomics

PubMed Central

Bladergroen, Marco R.; van der Burgt, Yuri E. M.

2015-01-01

For large-scale and standardized applications in mass spectrometry- (MS-) based proteomics, automation of each step is essential. Here we present high-throughput sample preparation solutions for balancing the speed of current MS acquisitions and the time needed for analytical workup of body fluids. The discussed workflows reduce body fluid sample complexity and apply to both bottom-up proteomics experiments and top-down protein characterization approaches. Various sample preparation methods that involve solid-phase extraction (SPE), including affinity enrichment strategies, have been automated. The resulting peptide and protein fractions can be mass analyzed by direct infusion into an electrospray ionization (ESI) source or by means of matrix-assisted laser desorption ionization (MALDI) without further need of time-consuming liquid chromatography (LC) separations. PMID:25692071
pyGeno: A Python package for precision medicine and proteogenomics.

PubMed

Daouda, Tariq; Perreault, Claude; Lemieux, Sébastien

2016-01-01

pyGeno is a Python package mainly intended for precision medicine applications that revolve around genomics and proteomics. It integrates reference sequences and annotations from Ensembl, genomic polymorphisms from the dbSNP database and data from next-gen sequencing into an easy to use, memory-efficient and fast framework, therefore allowing the user to easily explore subject-specific genomes and proteomes. Compared to a standalone program, pyGeno gives the user access to the complete expressivity of Python, a general programming language. Its range of application therefore encompasses both short scripts and large scale genome-wide studies.
pyGeno: A Python package for precision medicine and proteogenomics

PubMed Central

Daouda, Tariq; Perreault, Claude; Lemieux, Sébastien

2016-01-01

pyGeno is a Python package mainly intended for precision medicine applications that revolve around genomics and proteomics. It integrates reference sequences and annotations from Ensembl, genomic polymorphisms from the dbSNP database and data from next-gen sequencing into an easy to use, memory-efficient and fast framework, therefore allowing the user to easily explore subject-specific genomes and proteomes. Compared to a standalone program, pyGeno gives the user access to the complete expressivity of Python, a general programming language. Its range of application therefore encompasses both short scripts and large scale genome-wide studies. PMID:27785359
Stable isotope dimethyl labelling for quantitative proteomics and beyond

PubMed Central

Hsu, Jue-Liang; Chen, Shu-Hui

2016-01-01

Stable-isotope reductive dimethylation, a cost-effective, simple, robust, reliable and easy-to-multiplex labelling method, is widely applied to quantitative proteomics using liquid chromatography-mass spectrometry. This review focuses on biological applications of stable-isotope dimethyl labelling for a large-scale comparative analysis of protein expression and post-translational modifications based on its unique properties of the labelling chemistry. Some other applications of the labelling method for sample preparation and mass spectrometry-based protein identification and characterization are also summarized. This article is part of the themed issue ‘Quantitative mass spectrometry’. PMID:27644970
Identification of Maturation-Specific Proteins by Single-Cell Proteomics of Human Oocytes

PubMed Central

Virant-Klun, Irma; Leicht, Stefan; Hughes, Christopher; Krijgsveld, Jeroen

2016-01-01

Oocytes undergo a range of complex processes via oogenesis, maturation, fertilization, and early embryonic development, eventually giving rise to a fully functioning organism. To understand proteome composition and diversity during maturation of human oocytes, here we have addressed crucial aspects of oocyte collection and proteome analysis, resulting in the first proteome and secretome maps of human oocytes. Starting from 100 oocytes collected via a novel serum-free hanging drop culture system, we identified 2,154 proteins, whose functions indicate that oocytes are largely resting cells with a proteome that is tailored for homeostasis, cellular attachment, and interaction with its environment via secretory factors. In addition, we have identified 158 oocyte-enriched proteins (such as ECAT1, PIWIL3, NLRP7) not observed in high-coverage proteomics studies of other human cell lines or tissues. Exploiting SP3, a novel technology for proteomic sample preparation using magnetic beads, we scaled down proteome analysis to single cells. Despite the low protein content of only ∼100 ng per cell, we consistently identified ∼450 proteins from individual oocytes. When comparing individual oocytes at the germinal vesicle (GV) and metaphase II (MII) stage, we found that the Tudor and KH domain-containing protein (TDRKH) is preferentially expressed in immature oocytes, while Wee2, PCNA, and DNMT1 were enriched in mature cells, collectively indicating that maintenance of genome integrity is crucial during oocyte maturation. This study demonstrates that an innovative proteomics workflow facilitates analysis of single human oocytes to investigate human oocyte biology and preimplantation development. The approach presented here paves the way for quantitative proteomics in other quantity-limited tissues and cell types. Data associated with this study are available via ProteomeXchange with identifier PXD004142. PMID:27215607
Identification of Maturation-Specific Proteins by Single-Cell Proteomics of Human Oocytes.

PubMed

Virant-Klun, Irma; Leicht, Stefan; Hughes, Christopher; Krijgsveld, Jeroen

2016-08-01

Oocytes undergo a range of complex processes via oogenesis, maturation, fertilization, and early embryonic development, eventually giving rise to a fully functioning organism. To understand proteome composition and diversity during maturation of human oocytes, here we have addressed crucial aspects of oocyte collection and proteome analysis, resulting in the first proteome and secretome maps of human oocytes. Starting from 100 oocytes collected via a novel serum-free hanging drop culture system, we identified 2,154 proteins, whose functions indicate that oocytes are largely resting cells with a proteome that is tailored for homeostasis, cellular attachment, and interaction with its environment via secretory factors. In addition, we have identified 158 oocyte-enriched proteins (such as ECAT1, PIWIL3, NLRP7) not observed in high-coverage proteomics studies of other human cell lines or tissues. Exploiting SP3, a novel technology for proteomic sample preparation using magnetic beads, we scaled down proteome analysis to single cells. Despite the low protein content of only ∼100 ng per cell, we consistently identified ∼450 proteins from individual oocytes. When comparing individual oocytes at the germinal vesicle (GV) and metaphase II (MII) stage, we found that the Tudor and KH domain-containing protein (TDRKH) is preferentially expressed in immature oocytes, while Wee2, PCNA, and DNMT1 were enriched in mature cells, collectively indicating that maintenance of genome integrity is crucial during oocyte maturation. This study demonstrates that an innovative proteomics workflow facilitates analysis of single human oocytes to investigate human oocyte biology and preimplantation development. The approach presented here paves the way for quantitative proteomics in other quantity-limited tissues and cell types. Data associated with this study are available via ProteomeXchange with identifier PXD004142. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
Recognizing millions of consistently unidentified spectra across hundreds of shotgun proteomics datasets

PubMed Central

Griss, Johannes; Perez-Riverol, Yasset; Lewis, Steve; Tabb, David L.; Dianes, José A.; del-Toro, Noemi; Rurik, Marc; Walzer, Mathias W.; Kohlbacher, Oliver; Hermjakob, Henning; Wang, Rui; Vizcaíno, Juan Antonio

2016-01-01

Mass spectrometry (MS) is the main technology used in proteomics approaches. However, on average 75% of spectra analysed in an MS experiment remain unidentified. We propose to use spectrum clustering at a large scale to shed light on these unidentified spectra. PRoteomics IDEntifications database (PRIDE) Archive is one of the largest MS proteomics public data repositories worldwide. By clustering all tandem MS spectra publicly available in PRIDE Archive, coming from hundreds of datasets, we were able to consistently characterize three distinct groups of spectra: 1) incorrectly identified spectra, 2) spectra correctly identified but below the set scoring threshold, and 3) truly unidentified spectra. Using a multitude of complementary analysis approaches, we were able to identify less than 20% of the consistently unidentified spectra. The complete spectrum clustering results are available through the new version of the PRIDE Cluster resource (http://www.ebi.ac.uk/pride/cluster). This resource is intended, among other aims, to encourage and simplify further investigation into these unidentified spectra. PMID:27493588
Recognizing millions of consistently unidentified spectra across hundreds of shotgun proteomics datasets.

PubMed

Griss, Johannes; Perez-Riverol, Yasset; Lewis, Steve; Tabb, David L; Dianes, José A; Del-Toro, Noemi; Rurik, Marc; Walzer, Mathias W; Kohlbacher, Oliver; Hermjakob, Henning; Wang, Rui; Vizcaíno, Juan Antonio

2016-08-01

Mass spectrometry (MS) is the main technology used in proteomics approaches. However, on average 75% of spectra analysed in an MS experiment remain unidentified. We propose to use spectrum clustering at a large scale to shed light on these unidentified spectra. PRoteomics IDEntifications database (PRIDE) Archive is one of the largest MS proteomics public data repositories worldwide. By clustering all tandem MS spectra publicly available in PRIDE Archive, coming from hundreds of datasets, we were able to consistently characterize three distinct groups of spectra: 1) incorrectly identified spectra, 2) spectra correctly identified but below the set scoring threshold, and 3) truly unidentified spectra. Using a multitude of complementary analysis approaches, we were able to identify less than 20% of the consistently unidentified spectra. The complete spectrum clustering results are available through the new version of the PRIDE Cluster resource (http://www.ebi.ac.uk/pride/cluster). This resource is intended, among other aims, to encourage and simplify further investigation into these unidentified spectra.
Top-Down Characterization of the Post-Translationally Modified Intact Periplasmic Proteome from the Bacterium Novosphingobium aromaticivorans

DOE PAGES

Wu, Si; Brown, Roslyn N.; Payne, Samuel H.; ...

2013-01-01

The periplasm of Gram-negative bacteria is a dynamic and physiologically important subcellular compartment where the constant exposure to potential environmental insults amplifies the need for proper protein folding and modifications. Top-down proteomics analysis of the periplasmic fraction at the intact protein level provides unrestricted characterization and annotation of the periplasmic proteome, including the post-translational modifications (PTMs) on these proteins. Here, we used single-dimension ultra-high pressure liquid chromatography coupled with the Fourier transform mass spectrometry (FTMS) to investigate the intact periplasmic proteome of Novosphingobium aromaticivorans. Our top-down analysis provided the confident identification of 55 proteins in the periplasm and characterized their PTMs including signal peptide removal, N-terminal methionine excision, acetylation, glutathionylation, pyroglutamate, and disulfide bond formation. This study provides the first experimental evidence for the expression and periplasmic localization of many hypothetical and uncharacterized proteins and the first unrestrictive, large-scale data on PTMs in the bacterial periplasm.
Large-scale identification of target proteins of a glycosyltransferase isozyme by Lectin-IGOT-LC/MS, an LC/MS-based glycoproteomic approach

PubMed Central

Sugahara, Daisuke; Kaji, Hiroyuki; Sugihara, Kazushi; Asano, Masahide; Narimatsu, Hisashi

2012-01-01

Model organisms carrying a deletion or mutation in a glycosyltransferase gene exhibit various physiological abnormalities, suggesting that specific glycan motifs on certain proteins play important roles in vivo. Identification of the target proteins of glycosyltransferase isozymes is key to understanding the roles of glycans. Here, we demonstrated the proteome-scale identification of the target proteins specific for a glycosyltransferase isozyme, β1,4-galactosyltransferase-I (β4GalT-I). Although β4GalT-I is the best-characterized glycosyltransferase, its distinctive contribution to β1,4-galactosylation has hardly been described so far. We identified a large number of candidates for the target proteins specific to β4GalT-I by comparative analysis of β4GalT-I-deleted and wild-type mice using the LC/MS-based technique with the isotope-coded glycosylation site-specific tagging (IGOT) of lectin-captured N-glycopeptides. Our approach to identifying the target proteins on a proteome scale reveals common features and trends in the target proteins, which facilitate understanding of the mechanism that controls assembly of a particular glycan motif on specific proteins. PMID:23002422
A Novel Proteomics Approach to Identify SUMOylated Proteins and Their Modification Sites in Human Cells*

PubMed Central

Galisson, Frederic; Mahrouche, Louiza; Courcelles, Mathieu; Bonneil, Eric; Meloche, Sylvain; Chelbi-Alix, Mounira K.; Thibault, Pierre

2011-01-01

The small ubiquitin-related modifier (SUMO) is a small group of proteins that are reversibly attached to protein substrates to modify their functions. The large scale identification of protein SUMOylation and their modification sites in mammalian cells represents a significant challenge because of the relatively small number of in vivo substrates and the dynamic nature of this modification. We report here a novel proteomics approach to selectively enrich and identify SUMO conjugates from human cells. We stably expressed different SUMO paralogs in HEK293 cells, each containing a His6 tag and a strategically located tryptic cleavage site at the C terminus to facilitate the recovery and identification of SUMOylated peptides by affinity enrichment and mass spectrometry. Tryptic peptides with short SUMO remnants offer significant advantages in large scale SUMOylome experiments including the generation of paralog-specific fragment ions following CID and ETD activation, and the identification of modified peptides using conventional database search engines such as Mascot. We identified 205 unique protein substrates together with 17 precise SUMOylation sites present in 12 SUMO protein conjugates including three new sites (Lys-380, Lys-400, and Lys-497) on the protein promyelocytic leukemia. Label-free quantitative proteomics analyses on purified nuclear extracts from untreated and arsenic trioxide-treated cells revealed that all identified SUMOylated sites of promyelocytic leukemia were differentially SUMOylated upon stimulation. PMID:21098080
Coronal hole evolution by sudden large scale changes

NASA Technical Reports Server (NTRS)

Nolte, J. T.; Gerassimenko, M.; Krieger, A. S.; Solodyna, C. V.

1978-01-01

Sudden shifts in coronal-hole boundaries observed by the S-054 X-ray telescope on Skylab between May and November, 1973, within 1 day of CMP of the holes, at latitudes not exceeding 40 deg, are compared with the long-term evolution of coronal-hole area. It is found that large-scale shifts in boundary locations can account for most if not all of the evolution of coronal holes. The temporal and spatial scales of these large-scale changes imply that they are the results of a physical process occurring in the corona. It is concluded that coronal holes evolve by magnetic-field lines' opening when the holes are growing, and by fields' closing as the holes shrink.
Development of proteome-wide binding reagents for research and diagnostics.

PubMed

Taussig, Michael J; Schmidt, Ronny; Cook, Elizabeth A; Stoevesandt, Oda

2013-12-01

Alongside MS, antibodies and other specific protein-binding molecules have a special place in proteomics as affinity reagents in a toolbox of applications for determining protein location, quantitative distribution and function (affinity proteomics). The realisation that the range of research antibodies available, while apparently vast is nevertheless still very incomplete and frequently of uncertain quality, has stimulated projects with an objective of raising comprehensive, proteome-wide sets of protein binders. With progress in automation and throughput, a remarkable number of recent publications refer to the practical possibility of selecting binders to every protein encoded in the genome. Here we review the requirements of a pipeline of production of protein binders for the human proteome, including target prioritisation, antigen design, 'next generation' methods, databases and the approaches taken by ongoing projects in Europe and the USA. While the task of generating affinity reagents for all human proteins is complex and demanding, the benefits of well-characterised and quality-controlled pan-proteome binder resources for biomedical research, industry and life sciences in general would be enormous and justify the effort. Given the technical, personnel and financial resources needed to fulfil this aim, expansion of current efforts may best be addressed through large-scale international collaboration. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
The emergence of top-down proteomics in clinical research

PubMed Central

2013-01-01

Proteomic technology has advanced steadily since the development of 'soft-ionization' techniques for mass-spectrometry-based molecular identification more than two decades ago. Now, the large-scale analysis of proteins (proteomics) is a mainstay of biological research and clinical translation, with researchers seeking molecular diagnostics, as well as protein-based markers for personalized medicine. Proteomic strategies using the protease trypsin (known as bottom-up proteomics) were the first to be developed and optimized and form the dominant approach at present. However, researchers are now beginning to understand the limitations of bottom-up techniques, namely the inability to characterize and quantify intact protein molecules from a complex mixture of digested peptides. To overcome these limitations, several laboratories are taking a whole-protein-based approach, in which intact protein molecules are the analytical targets for characterization and quantification. We discuss these top-down techniques and how they have been applied to clinical research and are likely to be applied in the near future. Given the recent improvements in mass-spectrometry-based proteomics and stronger cooperation between researchers, clinicians and statisticians, both peptide-based (bottom-up) strategies and whole-protein-based (top-down) strategies are set to complement each other and help researchers and clinicians better understand and detect complex disease phenotypes. PMID:23806018
Microbial Interactions in Plants: Perspectives and Applications of Proteomics.

PubMed

Imam, Jahangir; Shukla, Pratyoosh; Mandal, Nimai Prasad; Variar, Mukund

2017-01-01

The structure and function of proteins involved in plant-microbe interactions are investigated in complex biological samples through large-scale proteomics technology. Since whole genome sequences are now available for several plant species and microbes, proteomics studies have become easier and more accurate, and huge amounts of data can be generated and analyzed during plant-microbe interactions. Proteomics approaches are highly relevant in many studies and have shown that genomics approaches alone are not sufficient: much significant information is lost because proteins, not the genes coding for them, are the final products responsible for the observed phenotype. Novel approaches in proteomics are continuously being developed, enabling the study of various aspects of the arrangement and configuration of proteins and their functions. With the advancement of new technologies, their application to plant-microbe interactions is becoming more common. They are increasingly used to characterize the cellular and extracellular virulence and pathogenicity factors produced by pathogens. This helps identify protein-level changes in host plants when they interact with pathogens and beneficial partners. This review provides a brief overview of the proteomics technologies currently available, followed by their use in studying plant-microbe interactions.
Expressing the human proteome for affinity proteomics: optimising expression of soluble protein domains and in vivo biotinylation.

PubMed

Keates, Tracy; Cooper, Christopher D O; Savitsky, Pavel; Allerston, Charles K; Phillips, Claire; Hammarström, Martin; Daga, Neha; Berridge, Georgina; Mahajan, Pravin; Burgess-Brown, Nicola A; Müller, Susanne; Gräslund, Susanne; Gileadi, Opher

2012-06-15

The generation of affinity reagents to large numbers of human proteins depends on the ability to express the target proteins as high-quality antigens. The Structural Genomics Consortium (SGC) focuses on the production and structure determination of human proteins. In a 7-year period, the SGC has deposited crystal structures of >800 human protein domains, and has additionally expressed and purified a similar number of protein domains that have not yet been crystallised. The targets include a diversity of protein domains, with an attempt to provide high coverage of protein families. The family approach provides an excellent basis for characterising the selectivity of affinity reagents. We present a summary of the approaches used to generate purified human proteins or protein domains, a test case demonstrating the ability to rapidly generate new proteins, and an optimisation study on the modification of >70 proteins by biotinylation in vivo. These results provide a unique synergy between large-scale structural projects and the recent efforts to produce a wide coverage of affinity reagents to the human proteome. Copyright © 2011 Elsevier B.V. All rights reserved.

Expressing the human proteome for affinity proteomics: optimising expression of soluble protein domains and in vivo biotinylation

PubMed Central

Keates, Tracy; Cooper, Christopher D.O.; Savitsky, Pavel; Allerston, Charles K.; Phillips, Claire; Hammarström, Martin; Daga, Neha; Berridge, Georgina; Mahajan, Pravin; Burgess-Brown, Nicola A.; Müller, Susanne; Gräslund, Susanne; Gileadi, Opher

2012-01-01

The generation of affinity reagents to large numbers of human proteins depends on the ability to express the target proteins as high-quality antigens. The Structural Genomics Consortium (SGC) focuses on the production and structure determination of human proteins. In a 7-year period, the SGC has deposited crystal structures of >800 human protein domains, and has additionally expressed and purified a similar number of protein domains that have not yet been crystallised. The targets include a diversity of protein domains, with an attempt to provide high coverage of protein families. The family approach provides an excellent basis for characterising the selectivity of affinity reagents. We present a summary of the approaches used to generate purified human proteins or protein domains, a test case demonstrating the ability to rapidly generate new proteins, and an optimisation study on the modification of >70 proteins by biotinylation in vivo. These results provide a unique synergy between large-scale structural projects and the recent efforts to produce a wide coverage of affinity reagents to the human proteome. PMID:22027370
BIG: a large-scale data integration tool for renal physiology.

PubMed

Zhao, Yue; Yang, Chin-Rang; Raghuram, Viswanathan; Parulekar, Jaya; Knepper, Mark A

2016-10-01

Due to recent advances in high-throughput techniques, we and others have generated multiple proteomic and transcriptomic databases to describe and quantify gene expression, protein abundance, or cellular signaling on the scale of the whole genome/proteome in kidney cells. The existence of so much data from diverse sources raises the following question: "How can researchers find information efficiently for a given gene product over all of these data sets without searching each data set individually?" This is the type of problem that has motivated the "Big-Data" revolution in Data Science, which has driven progress in fields such as marketing. Here we present an online Big-Data tool called BIG (Biological Information Gatherer) that allows users to submit a single online query to obtain all relevant information from all indexed databases. BIG is accessible at http://big.nhlbi.nih.gov/.
A Universal Trend among Proteomes Indicates an Oily Last Common Ancestor

PubMed Central

Mannige, Ranjan V.; Brooks, Charles L.; Shakhnovich, Eugene I.

2012-01-01

Despite progress in ancestral protein sequence reconstruction, much needs to be unraveled about the nature of the putative last common ancestral proteome that served as the prototype of all extant lifeforms. Here, we present data that indicate a steady decline (oil escape) in proteome hydrophobicity over species evolvedness (node number) evident in 272 diverse proteomes, which indicates a highly hydrophobic (oily) last common ancestor (LCA). This trend, obtained from simple considerations (free from sequence reconstruction methods), was corroborated by regression studies within homologous and orthologous protein clusters as well as phylogenetic estimates of the ancestral oil content. While indicating an inherent irreversibility in molecular evolution, oil escape also serves as a rare and universal reaction-coordinate for evolution (reinforcing Darwin's principle of Common Descent), and may prove important in matters such as (i) explaining the emergence of intrinsically disordered proteins, (ii) developing composition- and speciation-based “global” molecular clocks, and (iii) improving the statistical methods for ancestral sequence reconstruction. PMID:23300421
Aptamer-based multiplexed proteomic technology for biomarker discovery.

PubMed

Gold, Larry; Ayers, Deborah; Bertino, Jennifer; Bock, Christopher; Bock, Ashley; Brody, Edward N; Carter, Jeff; Dalby, Andrew B; Eaton, Bruce E; Fitzwater, Tim; Flather, Dylan; Forbes, Ashley; Foreman, Trudi; Fowler, Cate; Gawande, Bharat; Goss, Meredith; Gunn, Magda; Gupta, Shashi; Halladay, Dennis; Heil, Jim; Heilig, Joe; Hicke, Brian; Husar, Gregory; Janjic, Nebojsa; Jarvis, Thale; Jennings, Susan; Katilius, Evaldas; Keeney, Tracy R; Kim, Nancy; Koch, Tad H; Kraemer, Stephan; Kroiss, Luke; Le, Ngan; Levine, Daniel; Lindsey, Wes; Lollo, Bridget; Mayfield, Wes; Mehan, Mike; Mehler, Robert; Nelson, Sally K; Nelson, Michele; Nieuwlandt, Dan; Nikrad, Malti; Ochsner, Urs; Ostroff, Rachel M; Otis, Matt; Parker, Thomas; Pietrasiewicz, Steve; Resnicow, Daniel I; Rohloff, John; Sanders, Glenn; Sattin, Sarah; Schneider, Daniel; Singer, Britta; Stanton, Martin; Sterkel, Alana; Stewart, Alex; Stratford, Suzanne; Vaught, Jonathan D; Vrkljan, Mike; Walker, Jeffrey J; Watrobka, Mike; Waugh, Sheela; Weiss, Allison; Wilcox, Sheri K; Wolfson, Alexey; Wolk, Steven K; Zhang, Chi; Zichi, Dom

2010-12-07

The interrogation of proteomes ("proteomics") in a highly multiplexed and efficient manner remains a coveted and challenging goal in biology and medicine. We present a new aptamer-based proteomic technology for biomarker discovery capable of simultaneously measuring thousands of proteins from small sample volumes (15 µL of serum or plasma). Our current assay measures 813 proteins with low limits of detection (1 pM median), 7 logs of overall dynamic range (~100 fM-1 µM), and 5% median coefficient of variation. This technology is enabled by a new generation of aptamers that contain chemically modified nucleotides, which greatly expand the physicochemical diversity of the large randomized nucleic acid libraries from which the aptamers are selected. Proteins in complex matrices such as plasma are measured with a process that transforms a signature of protein concentrations into a corresponding signature of DNA aptamer concentrations, which is quantified on a DNA microarray. Our assay takes advantage of the dual nature of aptamers as both folded protein-binding entities with defined shapes and unique nucleotide sequences recognizable by specific hybridization probes. To demonstrate the utility of our proteomics biomarker discovery technology, we applied it to a clinical study of chronic kidney disease (CKD). We identified two well known CKD biomarkers as well as an additional 58 potential CKD biomarkers. These results demonstrate the potential utility of our technology to rapidly discover unique protein signatures characteristic of various disease states. We describe a versatile and powerful tool that allows large-scale comparison of proteome profiles among discrete populations. This unbiased and highly multiplexed search engine will enable the discovery of novel biomarkers in a manner that is unencumbered by our incomplete knowledge of biology, thereby helping to advance the next generation of evidence-based medicine.
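The "7 logs of overall dynamic range" quoted above follows directly from the stated concentration limits (~100 fM to 1 µM). The short check below spells out that arithmetic; the function name is a hypothetical helper and the inputs are simply the two endpoints given in the abstract, expressed in mol/L.

```python
import math


def dynamic_range_logs(low_molar, high_molar):
    """Number of orders of magnitude between two concentrations (mol/L)."""
    return math.log10(high_molar / low_molar)


LOW = 100e-15   # ~100 fM, lower end quoted in the abstract
HIGH = 1e-6     # 1 µM, upper end quoted in the abstract

logs = dynamic_range_logs(LOW, HIGH)
print(f"{logs:.1f} logs of dynamic range")
```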
PTMscape: an open source tool to predict generic post-translational modifications and map modification crosstalk in protein domains and biological processes.

PubMed

Li, Ginny X H; Vogel, Christine; Choi, Hyungwon

2018-06-07

While tandem mass spectrometry can detect post-translational modifications (PTM) at the proteome scale, reported PTM sites are often incomplete and include false positives. Computational approaches can complement these datasets by additional predictions, but most available tools use prediction models pre-trained for a single PTM type by the developers and it remains a difficult task to perform large-scale batch prediction for multiple PTMs with flexible user control, including the choice of training data. We developed an R package called PTMscape which predicts PTM sites across the proteome based on a unified and comprehensive set of descriptors of the physico-chemical microenvironment of modified sites, with additional downstream analysis modules to test enrichment of individual or pairs of PTMs in protein domains. PTMscape is flexible in the ability to process any major modifications, such as phosphorylation and ubiquitination, while achieving sensitivity and specificity comparable to single-PTM methods and outperforming other multi-PTM tools. Applying this framework, we expanded proteome-wide coverage of five major PTMs affecting different residues by prediction, especially for lysine and arginine modifications. Using a combination of experimentally acquired sites (PSP) and newly predicted sites, we discovered that crosstalk among multiple PTMs occurs more frequently than by random chance in key protein domains such as histone, protein kinase, and RNA recognition motifs, spanning various biological processes such as RNA processing, DNA damage response, signal transduction, and regulation of cell cycle. These results provide a proteome-scale analysis of crosstalk among major PTMs and can be easily extended to other types of PTM.
Evolution of intrinsic disorder in eukaryotic proteins.

PubMed

Ahrens, Joseph B; Nunez-Castilla, Janelle; Siltberg-Liberles, Jessica

2017-09-01

Conformational flexibility conferred through regions of intrinsic structural disorder allows proteins to behave as dynamic molecules. While it is well-known that intrinsically disordered regions can undergo disorder-to-order transitions in real-time as part of their function, we also are beginning to learn more about the dynamics of disorder-to-order transitions along evolutionary time-scales. Intrinsically disordered regions endow proteins with functional promiscuity, which is further enhanced by the ability of some of these regions to undergo real-time disorder-to-order transitions. Disorder content affects gene retention after whole genome duplication, but it is not necessarily conserved. Altered patterns of disorder resulting from evolutionary disorder-to-order transitions indicate that disorder evolves to modify function through refining stability, regulation, and interactions. Here, we review the evolution of intrinsically disordered regions in eukaryotic proteins. We discuss the interplay between secondary structure and disorder on evolutionary time-scales, the importance of disorder for eukaryotic proteome expansion and functional divergence, and the evolutionary dynamics of disorder.
Punctuated Emergences of Genetic and Phenotypic Innovations in Eumetazoan, Bilaterian, Euteleostome, and Hominidae Ancestors

PubMed Central

Wenger, Yvan; Galliot, Brigitte

2013-01-01

Phenotypic traits derive from the selective recruitment of genetic materials over macroevolutionary times, and protein-coding genes constitute an essential component of these materials. We took advantage of the recent production of genomic scale data from sponges and cnidarians, sister groups of eumetazoans and bilaterians, respectively, to date the emergence of human proteins and to infer the timing of acquisition of novel traits through metazoan evolution. Comparing the proteomes of 23 eukaryotes, we find that 33% of human proteins have an ortholog in nonmetazoan species. This premetazoan proteome associates with 43% of all annotated human biological processes. Subsequently, four major waves of innovations can be inferred in the last common ancestors of eumetazoans, bilaterians, euteleostomi (bony vertebrates), and hominidae, largely specific to each epoch, whereas early branching deuterostome and chordate phyla show very few innovations. Interestingly, groups of proteins that act together in their modern human functions often originated concomitantly, although the corresponding human phenotypes frequently emerged later. For example, the three cnidarians Acropora, Nematostella, and Hydra express a highly similar protein inventory, and their protein innovations can be affiliated either to traits shared by all eumetazoans (gut differentiation, neurogenesis); or to bilaterian traits present in only some cnidarians (eyes, striated muscle); or to traits not identified yet in this phylum (mesodermal layer, endocrine glands). The variable correspondence between phenotypes predicted from protein enrichments and observed phenotypes suggests that a parallel mechanism repeatedly produces similar phenotypes, thanks to novel regulatory events that independently tie preexisting conserved genetic modules. PMID:24065732
Large-scale database searching using tandem mass spectra: looking up the answer in the back of the book.

PubMed

Sadygov, Rovshan G; Cociorva, Daniel; Yates, John R

2004-12-01

Database searching is an essential element of large-scale proteomics. Because these methods are widely used, it is important to understand the rationale of the algorithms. Most algorithms are based on concepts first developed in SEQUEST and PeptideSearch. Four basic approaches are used to determine a match between a spectrum and sequence: descriptive, interpretative, stochastic and probability-based matching. We review the basic concepts used by most search algorithms, the computational modeling of peptide identification and current challenges and limitations of this approach for protein identification.
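As a rough illustration of what "determining a match between a spectrum and sequence" involves, the sketch below scores candidate peptides by counting theoretical b- and y-ion m/z values that appear in an observed peak list. This shared-peak count is a deliberately simple stand-in and is not the scoring used by SEQUEST or PeptideSearch; the function names, the 0.02 m/z tolerance, and the candidate list are hypothetical. The residue masses are standard monoisotopic values, and only singly charged fragments are considered.

```python
# Monoisotopic residue masses (Da) for the few amino acids used in this toy.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "L": 113.08406, "K": 128.09496, "E": 129.04259,
}
PROTON = 1.007276
WATER = 18.010565


def fragment_mzs(peptide):
    """Singly charged b- and y-ion m/z values for every backbone cleavage."""
    masses = [RESIDUE_MASS[residue] for residue in peptide]
    ions = []
    for i in range(1, len(peptide)):
        ions.append(sum(masses[:i]) + PROTON)           # b ion (N-terminal part)
        ions.append(sum(masses[i:]) + WATER + PROTON)   # y ion (C-terminal part)
    return ions


def shared_peak_count(observed_mzs, peptide, tol=0.02):
    """Count theoretical fragments with an observed peak within tol m/z."""
    return sum(
        any(abs(observed - theoretical) <= tol for observed in observed_mzs)
        for theoretical in fragment_mzs(peptide)
    )


def best_match(observed_mzs, candidates):
    """Return the candidate sequence with the highest shared-peak count."""
    return max(candidates, key=lambda pep: shared_peak_count(observed_mzs, pep))


# Simulate an idealized spectrum of the peptide PEAK, then search a tiny "database".
observed = fragment_mzs("PEAK")
candidates = ["PEAK", "PKAE", "GLVS"]
best = best_match(observed, candidates)
print(best, [shared_peak_count(observed, pep) for pep in candidates])
```

Real search engines also restrict candidates by precursor mass, model peak intensities, and estimate how likely a score is to arise by chance, which is where the stochastic and probability-based approaches named in the abstract come in.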
A phylogenomic data-driven exploration of viral origins and evolution

PubMed Central

Nasir, Arshan; Caetano-Anollés, Gustavo

2015-01-01

The origin of viruses remains mysterious because of their diverse and patchy molecular and functional makeup. Although numerous hypotheses have attempted to explain viral origins, none is backed by substantive data. We take full advantage of the wealth of available protein structural and functional data to explore the evolution of the proteomic makeup of thousands of cells and viruses. Despite the extremely reduced nature of viral proteomes, we established an ancient origin of the “viral supergroup” and the existence of widespread episodes of horizontal transfer of genetic information. Viruses harboring different replicon types and infecting distantly related hosts shared many metabolic and informational protein structural domains of ancient origin that were also widespread in cellular proteomes. Phylogenomic analysis uncovered a universal tree of life and revealed that modern viruses reduced from multiple ancient cells that harbored segmented RNA genomes and coexisted with the ancestors of modern cells. The model for the origin and evolution of viruses and cells is backed by strong genomic and structural evidence and can be reconciled with existing models of viral evolution if one considers viruses to have originated from ancient cells and not from modern counterparts. PMID:26601271
TUBEs-Mass Spectrometry for Identification and Analysis of the Ubiquitin-Proteome.

PubMed

Azkargorta, Mikel; Escobes, Iraide; Elortza, Felix; Matthiesen, Rune; Rodríguez, Manuel S

2016-01-01

Mass spectrometry (MS) has become the method of choice for the large-scale analysis of protein ubiquitylation. There exist a number of proposed methods for mapping ubiquitin sites, each with different pros and cons. We present here a protocol for the MS analysis of the ubiquitin-proteome captured by TUBEs and subsequent data analysis. Using dedicated software and algorithms, specific information on the presence of ubiquitylated peptides can be obtained from the MS search results. In addition, a quantitative and functional analysis of the ubiquitylated proteins and their interacting partners helps to unravel the biological and molecular processes they are involved in.
Software Tools | Office of Cancer Clinical Proteomics Research

Cancer.gov

The CPTAC program develops new approaches to elucidate aspects of the molecular complexity of cancer made from large-scale proteogenomic datasets, and advance them toward precision medicine. Part of the CPTAC mission is to make data and tools available and accessible to the greater research community to accelerate the discovery process.
Quality Assessments of Long-Term Quantitative Proteomic Analysis of Breast Cancer Xenograft Tissues

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhou, Jian-Ying; Chen, Lijun; Zhang, Bai

The identification of protein biomarkers requires large-scale analysis of human specimens to achieve statistical significance. In this study, we evaluated the long-term reproducibility of an iTRAQ (isobaric tags for relative and absolute quantification) based quantitative proteomics strategy using one channel for universal normalization across all samples. A total of 307 liquid chromatography tandem mass spectrometric (LC-MS/MS) analyses were completed, generating 107 one-dimensional (1D) LC-MS/MS datasets and 8 offline two-dimensional (2D) LC-MS/MS datasets (25 fractions for each set) for human-in-mouse breast cancer xenograft tissues representative of basal and luminal subtypes. Such large-scale studies require the implementation of robust metrics to assess the contributions of technical and biological variability in the qualitative and quantitative data. Accordingly, we developed a quantification confidence score based on the quality of each peptide-spectrum match (PSM) to remove quantification outliers from each analysis. After combining confidence score filtering and statistical analysis, reproducible protein identification and quantitative results were achieved from LC-MS/MS datasets collected over a 16 month period.
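Two points in this abstract lend themselves to a small worked example: how the 307 analyses break down, and how score-based filtering can remove a quantification outlier. The sketch below is illustrative only. The abstract does not define the confidence score, so the per-PSM scores, the 0.5 cutoff, and the use of a median ratio are assumptions made for this example rather than the authors' method.

```python
from statistics import median

# Run count from the abstract: 107 1D datasets plus 8 2D datasets of 25 fractions each.
ONE_D_DATASETS = 107
TWO_D_DATASETS = 8
FRACTIONS_PER_2D_DATASET = 25
total_runs = ONE_D_DATASETS + TWO_D_DATASETS * FRACTIONS_PER_2D_DATASET


def protein_ratio(psms, min_confidence=0.5):
    """Median reporter-ion ratio over PSMs whose confidence passes the cutoff.

    psms is a list of (ratio, confidence) pairs; both values are hypothetical.
    """
    kept = [ratio for ratio, confidence in psms if confidence >= min_confidence]
    return median(kept) if kept else None


# Hypothetical PSMs for one protein; the last one is a low-confidence outlier.
psms = [(1.10, 0.9), (0.95, 0.8), (1.05, 0.7), (4.80, 0.1)]
filtered = protein_ratio(psms)
unfiltered = protein_ratio(psms, min_confidence=0.0)
print(total_runs, filtered, unfiltered)
```

Dropping the low-confidence PSM moves the protein-level ratio back toward the values supported by the well-matched spectra, which is the effect the filtering step is meant to have.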
A proteome view of structural, functional, and taxonomic characteristics of major protein domain clusters.

PubMed

Sun, Chia-Tsen; Chiang, Austin W T; Hwang, Ming-Jing

2017-10-27

Proteome-scale bioinformatics research is increasingly conducted as the number of completely sequenced genomes increases, but analysis of protein domains (PDs) usually relies on similarity in their amino acid sequences and/or three-dimensional structures. Here, we present results from a bi-clustering analysis on presence/absence data for 6,580 unique PDs in 2,134 species with a sequenced genome, thus covering a complete set of proteins, for the three superkingdoms of life, Bacteria, Archaea, and Eukarya. Our analysis revealed eight distinctive PD clusters, which, following an analysis of enrichment of Gene Ontology functions and CATH classification of protein structures, were shown to exhibit structural and functional properties that are taxa-characteristic. For example, the largest cluster is ubiquitous in all three superkingdoms, constituting a set of 1,472 persistent domains created early in evolution and retained in living organisms and characterized by basic cellular functions and ancient structural architectures, while an Archaea and Eukarya bi-superkingdom cluster suggests its PDs may have existed in the ancestor of the two superkingdoms, and others are single superkingdom- or taxa (e.g. Fungi)-specific. These results increase our appreciation of PD diversity and our knowledge of how PDs are used in species, with implications for species evolution.
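The kind of presence/absence data described above can be pictured with a very small example. The sketch below groups domains by the exact set of superkingdoms in which they occur; this is a much simpler operation than the bi-clustering used in the study and is shown only to make the data shape concrete. The domain names and the grouping function are hypothetical.

```python
from collections import defaultdict


def group_by_profile(presence):
    """Group domains that share an identical presence profile across taxa."""
    groups = defaultdict(list)
    for domain, taxa in presence.items():
        groups[frozenset(taxa)].append(domain)
    return {profile: sorted(domains) for profile, domains in groups.items()}


# Hypothetical presence data: domain name -> superkingdoms where it is found.
presence = {
    "domain_A": {"Bacteria", "Archaea", "Eukarya"},
    "domain_B": {"Archaea", "Eukarya"},
    "domain_C": {"Bacteria", "Archaea", "Eukarya"},
    "domain_D": {"Eukarya"},
}

groups = group_by_profile(presence)
universal = groups[frozenset({"Bacteria", "Archaea", "Eukarya"})]
print(universal)
```

A real bi-clustering works on the species-level matrix and tolerates imperfect patterns, so domains that are missing from a few genomes can still fall into the same cluster.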
Radial variations of large-scale magnetohydrodynamic fluctuations in the solar wind

NASA Technical Reports Server (NTRS)

Burlaga, L. F.; Goldstein, M. L.

1983-01-01

Two time periods are studied for which comprehensive data coverage is available at both 1 AU using IMP-8 and ISEE-3 and beyond using Voyager 1. One of these periods is characterized by the predominance of corotating stream interactions. Relatively small scale transient flows characterize the second period. The evolution of these flows with heliocentric distance is studied using power spectral techniques. The evolution of the transient dominated period is consistent with the hypothesis of turbulent evolution including an inverse cascade of large scales. The evolution of the corotating period is consistent with the entrainment of slow streams by faster streams in a deterministic model.
Evolution of complexity in the zebrafish synapse proteome

PubMed Central

Bayés, Àlex; Collins, Mark O.; Reig-Viader, Rita; Gou, Gemma; Goulding, David; Izquierdo, Abril; Choudhary, Jyoti S.; Emes, Richard D.; Grant, Seth G. N.

2017-01-01

The proteome of human brain synapses is highly complex and is mutated in over 130 diseases. This complexity arose from two whole-genome duplications early in the vertebrate lineage. Zebrafish are used in modelling human diseases; however, their synapse proteome is uncharacterized, and whether the teleost-specific genome duplication (TSGD) influenced complexity is unknown. We report the characterization of the proteomes and ultrastructure of central synapses in zebrafish and analyse the importance of the TSGD. While the TSGD increases overall synapse proteome complexity, the postsynaptic density (PSD) proteome of zebrafish has lower complexity than that of mammals. A highly conserved set of ∼1,000 proteins is shared across vertebrates. PSD ultrastructural features are also conserved. Lineage-specific proteome differences indicate that vertebrate species evolved distinct synapse types and functions. The data sets are a resource for a wide range of studies and have important implications for the use of zebrafish in modelling human synaptic diseases. PMID:28252024
The dynamics and evolution of clusters of galaxies

NASA Technical Reports Server (NTRS)

Geller, Margaret; Huchra, John P.

1987-01-01

Research was undertaken to produce a coherent picture of the formation and evolution of large-scale structures in the universe. The program is divided into projects which examine four areas: the relationship between individual galaxies and their environment; the structure and evolution of individual rich clusters of galaxies; the nature of superclusters; and the large-scale distribution of individual galaxies. A brief review of results in each area is provided.
Advancing Cell Biology Through Proteomics in Space and Time (PROSPECTS)*

PubMed Central

Lamond, Angus I.; Uhlen, Mathias; Horning, Stevan; Makarov, Alexander; Robinson, Carol V.; Serrano, Luis; Hartl, F. Ulrich; Baumeister, Wolfgang; Werenskiold, Anne Katrin; Andersen, Jens S.; Vorm, Ole; Linial, Michal; Aebersold, Ruedi; Mann, Matthias

2012-01-01

The term “proteomics” encompasses the large-scale detection and analysis of proteins and their post-translational modifications. Driven by major improvements in mass spectrometric instrumentation, methodology, and data analysis, the proteomics field has burgeoned in recent years. It now provides a range of sensitive and quantitative approaches for measuring protein structures and dynamics that promise to revolutionize our understanding of cell biology and molecular mechanisms in both human cells and model organisms. The Proteomics Specification in Time and Space (PROSPECTS) Network is a unique EU-funded project that brings together leading European research groups, spanning from instrumentation to biomedicine, in a collaborative five-year initiative to develop new methods and applications for the functional analysis of cellular proteins. This special issue of Molecular and Cellular Proteomics presents 16 research papers reporting major recent progress by the PROSPECTS groups, including improvements to the resolution and sensitivity of the Orbitrap family of mass spectrometers, systematic detection of proteins using highly characterized antibody collections, and new methods for absolute as well as relative quantification of protein levels. Manuscripts in this issue exemplify approaches for performing quantitative measurements of cell proteomes and for studying their dynamic responses to perturbation, both during normal cellular responses and in disease mechanisms. Here we present a perspective on how the proteomics field is moving beyond simply identifying proteins with high sensitivity toward providing a powerful and versatile set of assay systems for characterizing proteome dynamics and thereby creating a new “third generation” proteomics strategy that offers an indispensable tool for cell biology and molecular medicine. PMID:22311636
MaxReport: An Enhanced Proteomic Result Reporting Tool for MaxQuant.

PubMed

Zhou, Tao; Li, Chuyu; Zhao, Wene; Wang, Xinru; Wang, Fuqiang; Sha, Jiahao

2016-01-01

MaxQuant is a proteomics software package widely used for large-scale tandem mass spectrometry data. We have designed and developed an enhanced result reporting tool for MaxQuant, named MaxReport. This tool can optimize the results of MaxQuant and provide additional functions for result interpretation. MaxReport can generate report tables for protein N-terminal modifications. It also supports isobaric labelling based relative quantification at the protein, peptide or site level. To obtain an overview of the results, MaxReport performs general descriptive statistical analyses for both identification and quantification results. The output results of MaxReport are well organized and therefore helpful for proteomic users to better understand and share their data. The script of MaxReport, which is freely available at http://websdoor.net/bioinfo/maxreport/, is developed using Python code and is compatible across multiple systems including Windows and Linux.
Systematically Ranking the Tightness of Membrane Association for Peripheral Membrane Proteins (PMPs)*

PubMed Central

Gao, Liyan; Ge, Haitao; Huang, Xiahe; Liu, Kehui; Zhang, Yuanya; Xu, Wu; Wang, Yingchun

2015-01-01

Large-scale quantitative evaluation of the tightness of membrane association for nontransmembrane proteins is important for identifying true peripheral membrane proteins with functional significance. Herein, we simultaneously ranked more than 1000 proteins of the photosynthetic model organism Synechocystis sp. PCC 6803 for their relative tightness of membrane association using a proteomic approach. Using multiple precisely ranked and experimentally verified peripheral subunits of photosynthetic protein complexes as the landmarks, we found that proteins involved in two-component signal transduction systems and transporters are overall tightly associated with the membranes, whereas the associations of ribosomal proteins are much weaker. Moreover, we found that hypothetical proteins containing the same domains generally have similar tightness. This work provided a global view of the structural organization of the membrane proteome with respect to divergent functions, and built the foundation for future investigation of the dynamic membrane proteome reorganization in response to different environmental or internal stimuli. PMID:25505158
MALDI versus ESI: The Impact of the Ion Source on Peptide Identification.

PubMed

Nadler, Wiebke Maria; Waidelich, Dietmar; Kerner, Alexander; Hanke, Sabrina; Berg, Regina; Trumpp, Andreas; Rösli, Christoph

2017-03-03

For mass spectrometry-based proteomic analyses, electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI) are the commonly used ionization techniques. To investigate the influence of the ion source on peptide detection in large-scale proteomics, an optimized GeLC/MS workflow was developed and applied either with ESI/MS or with MALDI/MS for the proteomic analysis of different human cell lines of pancreatic origin. Statistical analysis of the resulting data set with more than 72 000 peptides emphasized the complementary character of the two methods, as the percentage of peptides identified with both approaches was as low as 39%. Significant differences between the resulting peptide sets were observed with respect to amino acid composition, charge-related parameters, hydrophobicity, and modifications of the detected peptides and could be linked to factors governing the respective ion yields in ESI and MALDI.

Use of proteomic methods in the analysis of human body fluids in Alzheimer research.

PubMed

Zürbig, Petra; Jahn, Holger

2012-12-01

Proteomics is the study of the entire population of proteins and peptides in an organism or a part of it, such as a cell, tissue, or fluids like cerebrospinal fluid, plasma, serum, urine, or saliva. It is widely assumed that changes in the composition of the proteome may reflect disease states and provide clues to its origin, eventually leading to targets for new treatments. The ability to perform large-scale proteomic studies now is based jointly on recent advances in our analytical methods. Separation techniques like CE and 2DE have developed and matured. Detection methods like MS have also improved greatly in the last 5 years. These developments have also driven the fields of bioinformatics, needed to deal with the increased data production and systems biology. All these developing methods offer specific advantages but also come with certain limitations. This review describes the different proteomic methods used in the field, their limitations, and their possible pitfalls. Based on a literature search in PubMed, we identified 112 studies that applied proteomic techniques to identify biomarkers for Alzheimer disease. This review describes the results of these studies on proteome changes in human body fluids of Alzheimer patients reviewing the most important studies. We extracted a list of 366 proteins and peptides that were identified by these studies as potential targets in Alzheimer research. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
TRIC: an automated alignment strategy for reproducible protein quantification in targeted proteomics

PubMed Central

Röst, Hannes L.; Liu, Yansheng; D’Agostino, Giuseppe; Zanella, Matteo; Navarro, Pedro; Rosenberger, George; Collins, Ben C.; Gillet, Ludovic; Testa, Giuseppe; Malmström, Lars; Aebersold, Ruedi

2016-01-01

Large scale, quantitative proteomic studies have become essential for the analysis of clinical cohorts, large perturbation experiments and systems biology studies. While next-generation mass spectrometric techniques such as SWATH-MS have substantially increased throughput and reproducibility, ensuring consistent quantification of thousands of peptide analytes across multiple LC-MS/MS runs remains a challenging and laborious manual process. To produce highly consistent and quantitatively accurate proteomics data matrices in an automated fashion, we have developed the TRIC software which utilizes fragment ion data to perform cross-run alignment, consistent peak-picking and quantification for high throughput targeted proteomics. TRIC uses a graph-based alignment strategy based on non-linear retention time correction to integrate peak elution information from all LC-MS/MS runs acquired in a study. When compared to state-of-the-art SWATH-MS data analysis, the algorithm was able to reduce the identification error by more than 3-fold at constant recall, while correcting for highly non-linear chromatographic effects. On a pulsed-SILAC experiment performed on human induced pluripotent stem (iPS) cells, TRIC was able to automatically align and quantify thousands of light and heavy isotopic peak groups and substantially increased the quantitative completeness and biological information in the data, providing insights into protein dynamics of iPS cells. Overall, this study demonstrates the importance of consistent quantification in highly challenging experimental setups, and proposes an algorithm to automate this task, constituting the last missing piece in a pipeline for automated analysis of massively parallel targeted proteomics datasets. PMID:27479329
Using Proteomics to Understand How Leishmania Parasites Survive inside the Host and Establish Infection

PubMed Central

Veras, Patrícia Sampaio Tavares; Bezerra de Menezes, Juliana Perrone

2016-01-01

Leishmania is a protozoan parasite that causes a wide range of different clinical manifestations in mammalian hosts. It is a major public health risk on different continents and represents one of the most important neglected diseases. Due to the high toxicity of the drugs currently used, and in the light of increasing drug resistance, there is a critical need to develop new drugs and vaccines to control Leishmania infection. Over the past few years, proteomics has become an important tool to understand the underlying biology of Leishmania parasites and host interaction. The large-scale study of proteins, both in parasites and within the host in response to infection, can accelerate the discovery of new therapeutic targets. By studying the proteomes of host cells and tissues infected with Leishmania, as well as changes in protein profiles among promastigotes and amastigotes, scientists hope to better understand the biology involved in the parasite survival and the host-parasite interaction. This review demonstrates the feasibility of proteomics as an approach to identify new proteins involved in Leishmania differentiation and intracellular survival. PMID:27548150
Using Proteomics to Understand How Leishmania Parasites Survive inside the Host and Establish Infection.

PubMed

Veras, Patrícia Sampaio Tavares; Bezerra de Menezes, Juliana Perrone

2016-08-19

Leishmania is a protozoan parasite that causes a wide range of different clinical manifestations in mammalian hosts. It is a major public health risk on different continents and represents one of the most important neglected diseases. Due to the high toxicity of the drugs currently used, and in the light of increasing drug resistance, there is a critical need to develop new drugs and vaccines to control Leishmania infection. Over the past few years, proteomics has become an important tool to understand the underlying biology of Leishmania parasites and host interaction. The large-scale study of proteins, both in parasites and within the host in response to infection, can accelerate the discovery of new therapeutic targets. By studying the proteomes of host cells and tissues infected with Leishmania, as well as changes in protein profiles among promastigotes and amastigotes, scientists hope to better understand the biology involved in the parasite survival and the host-parasite interaction. This review demonstrates the feasibility of proteomics as an approach to identify new proteins involved in Leishmania differentiation and intracellular survival.
Mass spectrometry-based proteomics: from cancer biology to protein biomarkers, drug targets, and clinical applications.

PubMed

Jimenez, Connie R; Verheul, Henk M W

2014-01-01

Proteomics is optimally suited to bridge the gap between genomic information on the one hand and biologic functions and disease phenotypes on the other, since it studies the expression and/or post-translational modification (especially phosphorylation) of proteins--the major cellular players bringing about cellular functions--at a global level in biologic specimens. Mass spectrometry technology and (bio)informatic tools have matured to the extent that they can provide high-throughput, comprehensive, and quantitative protein inventories of cells, tissues, and biofluids in clinical samples at low level. In this article, we focus on next-generation proteomics employing nanoliquid chromatography coupled to high-resolution tandem mass spectrometry for in-depth (phospho)protein profiling of tumor tissues and (proximal) biofluids, with a focus on studies employing clinical material. In addition, we highlight emerging proteogenomic approaches for the identification of tumor-specific protein variants, and targeted multiplex mass spectrometry strategies for large-scale biomarker validation. Below we provide a discussion of recent progress, some research highlights, and challenges that remain for clinical translation of proteomic discoveries.
Automation of nanoflow liquid chromatography-tandem mass spectrometry for proteome analysis by using a strong cation exchange trap column.

PubMed

Jiang, Xiaogang; Feng, Shun; Tian, Ruijun; Han, Guanghui; Jiang, Xinning; Ye, Mingliang; Zou, Hanfa

2007-02-01

An approach was developed to automate sample introduction for nanoflow LC-MS/MS (μLC-MS/MS) analysis using a strong cation exchange (SCX) trap column. The system consisted of a 100 μm i.d. × 2 cm SCX trap column and a 75 μm i.d. × 12 cm C18 RP analytical column. During the sample loading step, the flow passing through the SCX trap column was directed to waste for loading a large volume of sample at high flow rate. Then the peptides bound on the SCX trap column were eluted onto the RP analytical column by a high salt buffer followed by RP chromatographic separation of the peptides at nanoliter flow rate. It was observed that higher performance of separation could be achieved with the system using SCX trap column than with the system using C18 trap column. The high proteomic coverage using this approach was demonstrated in the analysis of tryptic digest of BSA and yeast cell lysate. In addition, this system was also applied to two-dimensional separation of tryptic digest of human hepatocellular carcinoma cell line SMMC-7721 for large scale proteome analysis. This system was fully automated and required minimum changes on current μLC-MS/MS system. This system represented a promising platform for routine proteome analysis.
From protein-protein interactions to protein co-expression networks: a new perspective to evaluate large-scale proteomic data.

PubMed

Vella, Danila; Zoppis, Italo; Mauri, Giancarlo; Mauri, Pierluigi; Di Silvestre, Dario

2017-12-01

The reductionist approach of dissecting biological systems into their constituents was successful in the first stage of molecular biology in elucidating the chemical basis of several biological processes. This knowledge helped biologists to understand the complexity of biological systems, showing that most biological functions do not arise from individual molecules and that the emergent properties of biological systems cannot be explained or predicted by investigating individual molecules without taking their relations into consideration. Thanks to improvements in current -omics technologies and an increasing understanding of molecular relationships, ever more studies are evaluating biological systems through approaches based on graph theory. Genomic and proteomic data are often combined with protein-protein interaction (PPI) networks whose structure is routinely analyzed by algorithms and tools to characterize hubs/bottlenecks and topological, functional, and disease modules. On the other hand, co-expression networks represent a complementary procedure that gives the opportunity to evaluate data at the system level, including for organisms that lack information on PPIs. Based on these premises, we introduce the reader to PPI and co-expression networks, including aspects of reconstruction and analysis. In particular, the new idea of evaluating large-scale proteomic data by means of co-expression networks will be discussed, with some examples of application. Their use to infer biological knowledge will be shown, and special attention will be devoted to topological and module analysis.
Metaproteomics as a Complementary Approach to Gut Microbiota in Health and Disease

NASA Astrophysics Data System (ADS)

Petriz, Bernardo A.; Franco, Octávio L.

2017-01-01

Classic studies on phylotype profiling are limited to the identification of microbial constituents, where information is lacking about the molecular interaction of these bacterial communities with the host genome and the possible outcomes in host biology. A range of OMICs approaches have provided great progress linking the microbiota to health and disease. However, the investigation of this context through proteomic mass spectrometry-based tools is still being improved. Therefore, metaproteomics or community proteogenomics has emerged as a complementary approach to metagenomic data, as a field in proteomics aiming to perform large-scale characterization of proteins from environmental microbiota such as the human gut. The advances in molecular separation methods coupled with mass spectrometry (e.g. LC-MS/MS) and proteome bioinformatics have been fundamental in these novel large-scale metaproteomic studies, which have further been performed in a wide range of samples including soil, plant and human environments. Metaproteomic studies will make major progress if a comprehensive database covering the genes and expressed proteins from all gut microbial species is developed. To this end, we here present some of the main limitations of metaproteomic studies in complex microbiota environments such as the gut, also addressing the up-to-date pipelines in sample preparation prior to fractionation/separation and mass spectrometry analysis. In addition, a novel approach to the limitations of metagenomic databases is also discussed. Finally, prospects are addressed regarding the application of metaproteomic analysis using a unified host-microbiome gene database and other meta-OMICs platforms.
Advances in targeted proteomics and applications to biomedical research

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shi, Tujin; Song, Ehwang; Nie, Song

Targeted proteomics has emerged as a powerful protein quantification tool in systems biology, biomedical research, and increasingly in clinical applications. The most widely used targeted proteomics approach, selected reaction monitoring (SRM), also known as multiple reaction monitoring (MRM), can be used for quantification of cellular signaling networks and preclinical verification of candidate protein biomarkers. As an extension to our previous review on advances in SRM sensitivity (Shi et al., Proteomics, 12, 1074–1092, 2012), herein we review recent advances in the method and technology for further enhancing SRM sensitivity (from 2012 to present), and highlight its broad biomedical applications in human bodily fluids, tissue and cell lines. Furthermore, we also review two recently introduced targeted proteomics approaches, parallel reaction monitoring (PRM) and data-independent acquisition (DIA) with targeted data extraction on fast scanning high-resolution accurate-mass (HR/AM) instruments. Such HR/AM targeted quantification with monitoring all target product ions addresses SRM limitations effectively in specificity and multiplexing; whereas when compared to SRM, PRM and DIA are still in their infancy with a limited number of applications. Thus, for HR/AM targeted quantification we focus our discussion on method development, data processing and analysis, and its advantages and limitations in targeted proteomics. Finally, general perspectives on the potential of achieving both high sensitivity and high sample throughput for large-scale quantification of hundreds of target proteins are discussed.
Potential for proteomic approaches in determining efficacy biomarkers following administration of fish oils rich in omega-3 fatty acids: application in pancreatic cancers.

PubMed

Runau, Franscois; Arshad, Ali; Isherwood, John; Norris, Leonie; Howells, Lynne; Metcalfe, Matthew; Dennison, Ashley

2015-06-01

Pancreatic cancer is a disease with a significantly poor prognosis. Despite modern advances in other medical, surgical, and oncologic therapy, the outcome from pancreatic cancer has improved little over the last 40 years. To improve the management of this difficult disease, trials investigating the use of dietary and parenteral fish oils rich in omega-3 (ω-3) fatty acids, exhibiting proven anti-inflammatory and anticarcinogenic properties, have revealed favorable results in pancreatic cancers. Proteomics is the large-scale study of proteins that attempts to characterize the complete set of proteins encoded by the genome of an organism and that, with the use of sensitive mass spectrometric-based techniques, has allowed high-throughput analysis of the proteome to aid identification of putative biomarkers pertinent to given disease states. These biomarkers provide useful insight into potentially discovering new markers for early detection or elucidating the efficacy of treatment on pancreatic cancers. Here, our review identifies potential proteomic-based biomarkers in pancreatic cancer relating to apoptosis, cell proliferation, angiogenesis, and metabolic regulation in clinical studies. We also reviewed proteomic biomarkers from the administration of ω-3 fatty acids that act on similar anticarcinogenic pathways as above and reflect that proteomic studies on the effect of ω-3 fatty acids in pancreatic cancer will yield favorable results. © 2015 American Society for Parenteral and Enteral Nutrition.
Proteomic insights into floral biology.

PubMed

Li, Xiaobai; Jackson, Aaron; Xie, Ming; Wu, Dianxing; Tsai, Wen-Chieh; Zhang, Sheng

2016-08-01

The flower is the most important biological structure for ensuring angiosperms' reproductive success. Not only does the flower contain critical reproductive organs, but the wide variation in morphology, color, and scent has evolved to entice specialized pollinators, and arguably mankind in many cases, to ensure the successful propagation of its species. Recent proteomic approaches have identified protein candidates related to these flower traits, which has shed light on a number of previously unknown mechanisms underlying these traits. This review article provides a comprehensive overview of the latest advances in proteomic research in floral biology according to the order of flower structure, from corolla to male and female reproductive organs. It summarizes mainstream proteomic methods for plant research and recent improvements on two dimensional gel electrophoresis and gel-free workflows for both peptide level and protein level analysis. The recent advances in sequencing technologies provide a new paradigm for the ever-increasing genome and transcriptome information on many organisms. It is now possible to integrate genomic and transcriptomic data with proteomic results for large-scale protein characterization, so that a global understanding of the complex molecular networks in flower biology can be readily achieved. This article is part of a Special Issue entitled: Plant Proteomics--a bridge between fundamental processes and crop production, edited by Dr. Hans-Peter Mock. Copyright © 2016 Elsevier B.V. All rights reserved.
BIG: a large-scale data integration tool for renal physiology

PubMed Central

Zhao, Yue; Yang, Chin-Rang; Raghuram, Viswanathan; Parulekar, Jaya

2016-01-01

Due to recent advances in high-throughput techniques, we and others have generated multiple proteomic and transcriptomic databases to describe and quantify gene expression, protein abundance, or cellular signaling on the scale of the whole genome/proteome in kidney cells. The existence of so much data from diverse sources raises the following question: “How can researchers find information efficiently for a given gene product over all of these data sets without searching each data set individually?” This is the type of problem that has motivated the “Big-Data” revolution in Data Science, which has driven progress in fields such as marketing. Here we present an online Big-Data tool called BIG (Biological Information Gatherer) that allows users to submit a single online query to obtain all relevant information from all indexed databases. BIG is accessible at http://big.nhlbi.nih.gov/. PMID:27279488
Evolution of a tissue-specific splicing network

PubMed Central

Taliaferro, J. Matthew; Alvarez, Nehemiah; Green, Richard E.; Blanchette, Marco; Rio, Donald C.

2011-01-01

Alternative splicing of precursor mRNA (pre-mRNA) is a strategy employed by most eukaryotes to increase transcript and proteomic diversity. Many metazoan splicing factors are members of multigene families, with each member having different functions. How these highly related proteins evolve unique properties has been unclear. Here we characterize the evolution and function of a new Drosophila splicing factor, termed LS2 (Large Subunit 2), that arose from a gene duplication event of dU2AF50, the large subunit of the highly conserved heterodimeric general splicing factor U2AF (U2-associated factor). The quickly evolving LS2 gene has diverged from the splicing-promoting, ubiquitously expressed dU2AF50 such that it binds a markedly different RNA sequence, acts as a splicing repressor, and is preferentially expressed in testes. Target transcripts of LS2 are also enriched for performing testes-related functions. We therefore propose a path for the evolution of a new splicing factor in Drosophila that regulates specific pre-mRNAs and contributes to transcript diversity in a tissue-specific manner. PMID:21406555
Do Viruses Exchange Genes across Superkingdoms of Life?

PubMed

Malik, Shahana S; Azem-E-Zahra, Syeda; Kim, Kyung Mo; Caetano-Anollés, Gustavo; Nasir, Arshan

2017-01-01

Viruses can be classified into archaeoviruses, bacterioviruses, and eukaryoviruses according to the taxonomy of the infected host. The host-constrained perception of viruses implies preference of genetic exchange between viruses and cellular organisms of their host superkingdoms and viral origins from host cells either via escape or reduction. However, viruses frequently establish non-lytic interactions with organisms and endogenize into the genomes of bacterial endosymbionts that reside in eukaryotic cells. Such interactions create opportunities for genetic exchange between viruses and organisms of non-host superkingdoms. Here, we take an atypical approach to revisit virus-cell interactions by first identifying protein fold structures in the proteomes of archaeoviruses, bacterioviruses, and eukaryoviruses and second by tracing their spread in the proteomes of superkingdoms Archaea, Bacteria, and Eukarya. The exercise quantified protein structural homologies between viruses and organisms of their host and non-host superkingdoms and revealed likely candidates for virus-to-cell and cell-to-virus gene transfers. Unexpected lifestyle-driven genetic affiliations between bacterioviruses and Eukarya and eukaryoviruses and Bacteria were also predicted in addition to a large cohort of protein folds that were universally shared by viral and cellular proteomes and virus-specific protein folds not detected in cellular proteomes. These protein folds provide unique insights into viral origins and evolution that are generally difficult to recover with traditional sequence alignment-dependent evolutionary analyses owing to the fast mutation rates of viral gene sequences.
Venomics of New World pit vipers: genus-wide comparisons of venom proteomes across Agkistrodon.

PubMed

Lomonte, Bruno; Tsai, Wan-Chih; Ureña-Diaz, Juan Manuel; Sanz, Libia; Mora-Obando, Diana; Sánchez, Elda E; Fry, Bryan G; Gutiérrez, José María; Gibbs, H Lisle; Sovic, Michael G; Calvete, Juan J

2014-01-16

We report a genus-wide comparison of venom proteome variation across New World pit vipers in the genus Agkistrodon. Despite the wide variety of habitats occupied by this genus and that all its taxa feed on diverse species of vertebrates and invertebrate prey, the venom proteomes of copperheads, cottonmouths, and cantils are remarkably similar, both in the type and relative abundance of their different toxin families. The venoms from all the eleven species and subspecies sampled showed relatively similar proteolytic and PLA2 activities. In contrast, quantitative differences were observed in hemorrhagic and myotoxic activities in mice. The highest myotoxic activity was observed with the venoms of A. b. bilineatus, followed by A. p. piscivorus, whereas the venoms of A. c. contortrix and A. p. leucostoma induced the lowest myotoxic activity. The venoms of Agkistrodon bilineatus subspecies showed the highest hemorrhagic activity and A. c. contortrix the lowest. Compositional and toxicological analyses agree with clinical observations of envenomations by Agkistrodon in the USA and Central America. A comparative analysis of Agkistrodon shows that venom divergence tracks phylogeny of this genus to a greater extent than in Sistrurus rattlesnakes, suggesting that the distinct natural histories of Agkistrodon and Sistrurus clades may have played a key role in molding the patterns of evolution of their venom protein genes. A deep understanding of the structural and functional profiles of venoms and of the principles governing the evolution of venomous systems is a goal of venomics. Isolated proteomics analyses have been conducted on venoms from many species of vipers and pit vipers. However, making sense of these large inventories of data requires the integration of this information across multiple species to identify evolutionary and ecological trends. 
Our genus-wide venomics study provides a comprehensive overview of the toxic arsenal across Agkistrodon and a ground for understanding the natural histories of, and clinical observations of envenomations by, species of this genus. Copyright © 2013 Elsevier B.V. All rights reserved.
Rotation and magnetism in intermediate-mass stars

NASA Astrophysics Data System (ADS)

Quentin, Léo G.; Tout, Christopher A.

2018-06-01

Rotation and magnetism are increasingly recognized as important phenomena in stellar evolution. Surface magnetic fields from a few to 20 000 G have been observed and models have suggested that magnetohydrodynamic transport of angular momentum and chemical composition could explain the peculiar composition of some stars. Stellar remnants such as white dwarfs have been observed with fields from a few to more than 10^{9} G. We investigate the origin of and the evolution, on thermal and nuclear rather than dynamical time-scales, of an averaged large-scale magnetic field throughout a star's life and its coupling to stellar rotation. Large-scale magnetic fields sustained until late stages of stellar evolution with conservation of magnetic flux could explain the very high fields observed in white dwarfs. We include these effects in the Cambridge stellar evolution code using three time-dependent advection-diffusion equations coupled to the structural and composition equations of stars to model the evolution of angular momentum and the two components of the magnetic field. We present the evolution in various cases for a 3 M_{⊙} star from the beginning to the late stages of its life. Our particular model assumes that turbulent motions, including convection, favour small-scale field at the expense of large-scale field. As a result, the large-scale field concentrates in radiative zones of the star and so is exchanged between the core and the envelope of the star as it evolves. The field is sustained until the end of the asymptotic giant branch, when it concentrates in the degenerate core.
Exploring metazoan evolution through dynamic and holistic changes in protein families and domains

USDA-ARS's Scientific Manuscript database

Understanding proteome evolution is important for deciphering processes that drive species diversity and adaptation. Herein, the dynamics of change in protein families and protein domains over the course of metazoan evolution was explored. Change, as defined by birth/death and duplication/deletion ...
Application of Large-Scale Aptamer-Based Proteomic Profiling to Planned Myocardial Infarctions.

PubMed

Jacob, Jaison; Ngo, Debby; Finkel, Nancy; Pitts, Rebecca; Gleim, Scott; Benson, Mark D; Keyes, Michelle J; Farrell, Laurie A; Morgan, Thomas; Jennings, Lori L; Gerszten, Robert E

2018-03-20

Emerging proteomic technologies using novel affinity-based reagents allow for efficient multiplexing with high-sample throughput. To identify early biomarkers of myocardial injury, we recently applied an aptamer-based proteomic profiling platform that measures 1129 proteins to samples from patients undergoing septal alcohol ablation for hypertrophic cardiomyopathy, a human model of planned myocardial injury. Here, we examined the scalability of this approach using a markedly expanded platform to study a far broader range of human proteins in the context of myocardial injury. We applied a highly multiplexed, expanded proteomic technique that uses single-stranded DNA aptamers to assay 4783 human proteins (4137 distinct human gene targets) to derivation and validation cohorts of planned myocardial injury, individuals with spontaneous myocardial infarction, and at-risk controls. We found 376 target proteins that significantly changed in the blood after planned myocardial injury in a derivation cohort (n=20; P <1.05E-05, 1-way repeated measures analysis of variance, Bonferroni threshold). Two hundred forty-seven of these proteins were validated in an independent planned myocardial injury cohort (n=15; P <1.33E-04, 1-way repeated measures analysis of variance); >90% were directionally consistent and reached nominal significance in the validation cohort. Among the validated proteins that were increased within 1 hour after planned myocardial injury, 29 were also elevated in patients with spontaneous myocardial infarction (n=63; P <6.17E-04). Many of the novel markers identified in our study are intracellular proteins not previously identified in the peripheral circulation or have functional roles relevant to myocardial injury. 
For example, the cardiac LIM protein, cysteine- and glycine-rich protein 3, is thought to mediate cardiac mechanotransduction and stress responses, whereas the mitochondrial ATP synthase F0 subunit component is a vasoactive peptide on its release from cells. Last, we performed aptamer-affinity enrichment coupled with mass spectrometry to technically verify aptamer specificity for a subset of the new biomarkers. Our results demonstrate the feasibility of large-scale aptamer multiplexing at a level that has not previously been reported and with sample throughput that greatly exceeds other existing proteomic methods. The expanded aptamer-based proteomic platform provides a unique opportunity for biomarker and pathway discovery after myocardial injury. © 2017 American Heart Association, Inc.
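The two significance cutoffs quoted for the derivation and validation cohorts are consistent with a Bonferroni correction, in which the family-wise error rate is divided by the number of tests. The following is a minimal sketch of that arithmetic. The family-wise rate of 0.05 is an assumption (the abstract does not state it), and the function name is illustrative only.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance cutoff that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


ALPHA = 0.05  # assumed family-wise error rate; not stated in the abstract

# 4783 protein measurements were tested in the derivation cohort.
derivation = bonferroni_threshold(ALPHA, 4783)
# 376 proteins were carried forward into the validation cohort.
validation = bonferroni_threshold(ALPHA, 376)

print(f"derivation cutoff: {derivation:.2e}")  # 1.05e-05
print(f"validation cutoff: {validation:.2e}")  # 1.33e-04
```

Under this assumption the computed cutoffs reproduce the reported values of 1.05E-05 and 1.33E-04.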
Data Use Agreement | Office of Cancer Clinical Proteomics Research

Cancer.gov

CPTAC requests that data users abide by the same principles that were previously established in the Fort Lauderdale and Amsterdam meetings. The recommendations from the Fort Lauderdale meeting (2003) on best practices and principles for sharing large-scale genomic data address the roles and responsibilities of data producers, data users and funders of community resource projects.
Correlation of proteome-wide changes with social immunity behaviors provides insight into resistance to the parasitic mite, Varroa destructor, in the honey bee (Apis mellifera)

PubMed Central

2012-01-01

Background: Disease is a major factor driving the evolution of many organisms. In honey bees, selection for social behavioral responses is the primary adaptive process facilitating disease resistance. One such process, hygienic behavior, enables bees to resist multiple diseases, including the damaging parasitic mite Varroa destructor. The genetic elements and biochemical factors that drive the expression of these adaptations are currently unknown. Proteomics provides a tool to identify proteins that control behavioral processes, and these proteins can be used as biomarkers to aid identification of disease tolerant colonies. Results: We sampled a large cohort of commercial queen lineages, recording overall mite infestation, hygiene, and the specific hygienic response to V. destructor. We performed proteome-wide correlation analyses in larval integument and adult antennae, identifying several proteins highly predictive of behavior and reduced hive infestation. In the larva, response to wounding was identified as a key adaptive process leading to reduced infestation, and chitin biosynthesis and immune responses appear to represent important disease resistant adaptations. The speed of hygienic behavior may be underpinned by changes in the antenna proteome, and chemosensory and neurological processes could also provide specificity for detection of V. destructor in antennae. Conclusions: Our results provide, for the first time, some insight into how complex behavioural adaptations manifest in the proteome of honey bees. The most important biochemical correlations provide clues as to the underlying molecular mechanisms of social and innate immunity of honey bees. Such changes are indicative of potential divergence in processes controlling the hive-worker maturation. PMID:23021491

Correlation of proteome-wide changes with social immunity behaviors provides insight into resistance to the parasitic mite, Varroa destructor, in the honey bee (Apis mellifera).

PubMed

Parker, Robert; Guarna, M Marta; Melathopoulos, Andony P; Moon, Kyung-Mee; White, Rick; Huxter, Elizabeth; Pernal, Stephen F; Foster, Leonard J

2012-06-29

Disease is a major factor driving the evolution of many organisms. In honey bees, selection for social behavioral responses is the primary adaptive process facilitating disease resistance. One such process, hygienic behavior, enables bees to resist multiple diseases, including the damaging parasitic mite Varroa destructor. The genetic elements and biochemical factors that drive the expression of these adaptations are currently unknown. Proteomics provides a tool to identify proteins that control behavioral processes, and these proteins can be used as biomarkers to aid identification of disease tolerant colonies. We sampled a large cohort of commercial queen lineages, recording overall mite infestation, hygiene, and the specific hygienic response to V. destructor. We performed proteome-wide correlation analyses in larval integument and adult antennae, identifying several proteins highly predictive of behavior and reduced hive infestation. In the larva, response to wounding was identified as a key adaptive process leading to reduced infestation, and chitin biosynthesis and immune responses appear to represent important disease resistant adaptations. The speed of hygienic behavior may be underpinned by changes in the antenna proteome, and chemosensory and neurological processes could also provide specificity for detection of V. destructor in antennae. Our results provide, for the first time, some insight into how complex behavioural adaptations manifest in the proteome of honey bees. The most important biochemical correlations provide clues as to the underlying molecular mechanisms of social and innate immunity of honey bees. Such changes are indicative of potential divergence in processes controlling the hive-worker maturation.
Hydra: a scalable proteomic search engine which utilizes the Hadoop distributed computing framework

PubMed Central

2012-01-01

Background: For shotgun mass spectrometry based proteomics the most computationally expensive step is in matching the spectra against an increasingly large database of sequences and their post-translational modifications with known masses. Each mass spectrometer can generate data at an astonishingly high rate, and the scope of what is searched for is continually increasing. Therefore solutions for improving our ability to perform these searches are needed. Results: We present a sequence database search engine that is specifically designed to run efficiently on the Hadoop MapReduce distributed computing framework. The search engine implements the K-score algorithm, generating comparable output for the same input files as the original implementation. The scalability of the system is shown, and the architecture required for the development of such distributed processing is discussed. Conclusion: The software is scalable in its ability to handle a large peptide database, numerous modifications and large numbers of spectra. Performance scales with the number of processors in the cluster, allowing throughput to expand with the available resources. PMID:23216909
Hydra: a scalable proteomic search engine which utilizes the Hadoop distributed computing framework.

PubMed

Lewis, Steven; Csordas, Attila; Killcoyne, Sarah; Hermjakob, Henning; Hoopmann, Michael R; Moritz, Robert L; Deutsch, Eric W; Boyle, John

2012-12-05

For shotgun mass spectrometry based proteomics the most computationally expensive step is in matching the spectra against an increasingly large database of sequences and their post-translational modifications with known masses. Each mass spectrometer can generate data at an astonishingly high rate, and the scope of what is searched for is continually increasing. Therefore solutions for improving our ability to perform these searches are needed. We present a sequence database search engine that is specifically designed to run efficiently on the Hadoop MapReduce distributed computing framework. The search engine implements the K-score algorithm, generating comparable output for the same input files as the original implementation. The scalability of the system is shown, and the architecture required for the development of such distributed processing is discussed. The software is scalable in its ability to handle a large peptide database, numerous modifications and large numbers of spectra. Performance scales with the number of processors in the cluster, allowing throughput to expand with the available resources.
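To illustrate why spectrum-to-database matching fits the MapReduce model, the sketch below partitions peptides and spectra by precursor mass so that each reducer only compares candidates that fall in the same mass window. It is not Hydra's code and does not implement the K-score algorithm: plain Python dictionaries stand in for Hadoop, the "score" is simply the list of candidate peptides within tolerance, and the bin width, tolerance, peptide names and masses are all hypothetical placeholders.

```python
from collections import defaultdict

BIN_WIDTH = 1.0   # Da; hypothetical partitioning choice
TOLERANCE = 0.5   # Da; hypothetical precursor tolerance (must not exceed BIN_WIDTH)


def map_peptide(peptide, mass):
    """Emit the peptide under its own mass bin and both neighbouring bins."""
    k = int(mass // BIN_WIDTH)
    for key in (k - 1, k, k + 1):
        yield key, ("pep", peptide, mass)


def map_spectrum(spectrum_id, precursor_mass):
    """Emit the spectrum under the single bin that holds its precursor mass."""
    yield int(precursor_mass // BIN_WIDTH), ("spec", spectrum_id, precursor_mass)


def shuffle(pairs):
    """Group values by key, as the framework's shuffle phase would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_bin(values):
    """Pair each spectrum in a bin with the peptides inside the mass tolerance."""
    peptides = [(name, mass) for tag, name, mass in values if tag == "pep"]
    for tag, spectrum_id, spectrum_mass in values:
        if tag == "spec":
            hits = sorted(p for p, m in peptides if abs(m - spectrum_mass) <= TOLERANCE)
            yield spectrum_id, hits


def search(peptides, spectra):
    """Run the map, shuffle and reduce phases in-process."""
    pairs = [kv for p, m in peptides for kv in map_peptide(p, m)]
    pairs += [kv for s, m in spectra for kv in map_spectrum(s, m)]
    results = {}
    for values in shuffle(pairs).values():
        results.update(reduce_bin(values))
    return results


# Placeholder inputs; the masses are illustrative, not real peptide masses.
peptides = [("pepA", 800.2), ("pepB", 800.9), ("pepC", 950.4)]
spectra = [("scan1", 800.6), ("scan2", 951.0), ("scan3", 500.0)]
print(search(peptides, spectra))
```

Emitting each peptide into its neighbouring bins is the design choice that lets every reducer work independently: a spectrum never needs candidates from another partition, so adding reducers adds throughput.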
Using extremely halophilic bacteria to understand the role of surface charge and surface hydration in protein evolution, folding, and function

NASA Astrophysics Data System (ADS)

Hoff, Wouter; Deole, Ratnakar; Osu Collaboration

2013-03-01

Halophilic Archaea accumulate molar concentrations of KCl in their cytoplasm as an osmoprotectant, and have evolved highly acidic proteomes that only function at high salinity. We examine osmoprotection in the photosynthetic Proteobacteria Halorhodospira halophila. We find that H. halophila has an acidic proteome and accumulates molar concentrations of KCl when grown in high salt media. Upon growth of H. halophila in low salt media, its cytoplasmic K+ content matches that of Escherichia coli, revealing an acidic proteome that can function in the absence of high cytoplasmic salt concentrations. These findings necessitate a reassessment of two central aspects of theories for understanding extreme halophiles. We conclude that proteome acidity is not driven by stabilizing interactions between K+ ions and acidic side chains, but by the need for maintaining sufficient solvation and hydration of the protein surface at high salinity through strongly hydrated carboxylates. We propose that obligate protein halophilicity is a non-adaptive property resulting from genetic drift in which constructive neutral evolution progressively incorporates weakly stabilizing K+ binding sites on an increasingly acidic protein surface.
Expediting SRM assay development for large-scale targeted proteomics experiments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wu, Chaochao; Shi, Tujin; Brown, Joseph N.

2014-08-22

Due to their high sensitivity and specificity, targeted proteomics measurements, e.g. selected reaction monitoring (SRM), are becoming increasingly popular for biological and translational applications. Selection of optimal transitions and optimization of collision energy (CE) are important assay development steps for achieving sensitive detection and accurate quantification; however, these steps can be labor-intensive, especially for large-scale applications. Herein, we explored several options for accelerating SRM assay development evaluated in the context of a relatively large set of 215 synthetic peptide targets. We first showed that HCD fragmentation is very similar to CID in triple quadrupole (QQQ) instrumentation, and by selection of the top six y fragment ions from HCD spectra, >86% of top transitions optimized from direct infusion on QQQ instrument are covered. We also demonstrated that the CE calculated by existing prediction tools was less accurate for +3 precursors, and a significant increase in intensity for transitions could be obtained using a new CE prediction equation constructed from the present experimental data. Overall, our study illustrates the feasibility of expediting the development of larger numbers of high-sensitivity SRM assays through automation of transitions selection and accurate prediction of optimal CE to improve both SRM throughput and measurement quality.
The Response of the Root Proteome to the Synthetic Strigolactone GR24 in Arabidopsis

PubMed Central

Walton, Alan; Stes, Elisabeth; Goeminne, Geert; Braem, Lukas; Vuylsteke, Marnik; Matthys, Cedrick; De Cuyper, Carolien; Staes, An; Vandenbussche, Jonathan; Boyer, François-Didier; Vanholme, Ruben; Fromentin, Justine; Boerjan, Wout; Gevaert, Kris; Goormachtig, Sofie

2016-01-01

Strigolactones are plant metabolites that act as phytohormones and rhizosphere signals. Whereas most research on unraveling the action mechanisms of strigolactones is focused on plant shoots, we investigated proteome adaptation during strigolactone signaling in the roots of Arabidopsis thaliana. Through large-scale, time-resolved, and quantitative proteomics, the impact of the strigolactone analog rac-GR24 was elucidated on the root proteome of the wild type and the signaling mutant more axillary growth 2 (max2). Our study revealed a clear MAX2-dependent rac-GR24 response: an increase in abundance of enzymes involved in flavonol biosynthesis, which was reduced in the max2–1 mutant. Mass spectrometry-driven metabolite profiling and thin-layer chromatography experiments demonstrated that these changes in protein expression lead to the accumulation of specific flavonols. Moreover, quantitative RT-PCR revealed that the flavonol-related protein expression profile was caused by rac-GR24-induced changes in transcript levels of the corresponding genes. This induction of flavonol production was shown to be activated by the two pure enantiomers that together make up rac-GR24. Finally, our data provide much needed clues concerning the multiple roles played by MAX2 in the roots and a comprehensive view of the rac-GR24-induced response in the root proteome. PMID:27317401
Classification of Complete Proteomes of Different Organisms and Protein Sets Based on Their Protein Distributions in Terms of Some Key Attributes of Proteins

PubMed Central

Ma, Yue; Tuskan, Gerald A.

2018-01-01

The existence of complete genome sequences makes it important to develop different approaches for classification of large-scale data sets and to make extraction of biological insights easier. Here, we propose an approach for classification of complete proteomes/protein sets based on protein distributions on some basic attributes. We demonstrate the usefulness of this approach by determining protein distributions in terms of two attributes: protein lengths (L) and protein intrinsic disorder contents (ID). The protein distributions based on L and ID are surveyed for representative proteome organisms and protein sets from the three domains of life. The two-dimensional maps (designated as fingerprints here) from the protein distribution densities in the LD space defined by ln(L) and ID are then constructed. The fingerprints for different organisms and protein sets are found to be distinct from each other, and they can therefore be used for comparative studies. As a test case, phylogenetic trees have been constructed based on the protein distribution densities in the fingerprints of proteomes of organisms without performing any protein sequence comparison and alignments. The phylogenetic trees generated are biologically meaningful, demonstrating that the protein distributions in the LD space may serve as unique phylogenetic signals of the organisms at the proteome level. PMID:29686995
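A fingerprint of the kind described above can be approximated as a normalised two-dimensional histogram over ln(L) and ID. The sketch below is one possible reading, not the authors' implementation: the bin counts, the ln(L) range, the Euclidean distance between fingerprints and the toy proteomes are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def fingerprint(proteins, n_len_bins=10, n_id_bins=10, ln_len_range=(3.0, 9.0)):
    """Normalised 2D histogram of proteins over ln(length) and disorder fraction.

    proteins: iterable of (length, disorder_fraction), with disorder in [0, 1].
    Returns {(length_bin, disorder_bin): density}; densities sum to 1.
    """
    lo, hi = ln_len_range
    counts = Counter()
    for length, disorder in proteins:
        x = (math.log(length) - lo) / (hi - lo)
        # Clamp to the outermost bins so extreme values are still counted.
        i = min(max(int(x * n_len_bins), 0), n_len_bins - 1)
        j = min(max(int(disorder * n_id_bins), 0), n_id_bins - 1)
        counts[(i, j)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cell: c / total for cell, c in counts.items()}


def fingerprint_distance(fp_a, fp_b):
    """Euclidean distance between two fingerprints over the union of occupied cells."""
    cells = set(fp_a) | set(fp_b)
    return math.sqrt(sum((fp_a.get(c, 0.0) - fp_b.get(c, 0.0)) ** 2 for c in cells))


# Toy proteomes as (length, disorder fraction); values are made up.
proteome_a = [(120, 0.05), (350, 0.10), (900, 0.40), (300, 0.12)]
proteome_b = [(80, 0.60), (150, 0.55), (2000, 0.30), (310, 0.11)]

fp_a = fingerprint(proteome_a)
fp_b = fingerprint(proteome_b)
print(round(fingerprint_distance(fp_a, fp_b), 3))
```

A matrix of such pairwise distances could then be passed to any distance-based tree-building method, which is one way to obtain a tree without sequence alignment.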
Comparative Analysis of Proteomes and Functionomes Provides Insights into Origins of Cellular Diversification

PubMed Central

Caetano-Anollés, Gustavo

2013-01-01

Reconstructing the evolutionary history of modern species is a difficult problem complicated by the conceptual and technical limitations of phylogenetic tree building methods. Here, we propose a comparative proteomic and functionomic inferential framework for genome evolution that allows resolving the tripartite division of cells and sketching their history. Evolutionary inferences were derived from the spread of conserved molecular features, such as molecular structures and functions, in the proteomes and functionomes of contemporary organisms. Patterns of use and reuse of these traits yielded significant insights into the origins of cellular diversification. Results uncovered an unprecedented strong evolutionary association between Bacteria and Eukarya while revealing marked evolutionary reductive tendencies in the archaeal genomic repertoires. The effects of nonvertical evolutionary processes (e.g., HGT, convergent evolution) were found to be limited while reductive evolution and molecular innovation appeared to be prevalent during the evolution of cells. Our study revealed a strong vertical trace in the history of proteins and associated molecular functions, which was reliably recovered using the comparative genomics approach. The trace supported the existence of a stem line of descent and the very early appearance of Archaea as a diversified superkingdom, but failed to uncover a hidden canonical pattern in which Bacteria was the first superkingdom to deploy superkingdom-specific structures and functions. PMID:24492748
PNAC: a protein nucleolar association classifier

PubMed Central

2011-01-01

Background: Although primarily known as the site of ribosome subunit production, the nucleolus is involved in numerous and diverse cellular processes. Recent large-scale proteomics projects have identified thousands of human proteins that associate with the nucleolus. However, in most cases, we know neither the fraction of each protein pool that is nucleolus-associated nor whether their association is permanent or conditional. Results: To describe the dynamic localisation of proteins in the nucleolus, we investigated the extent of nucleolar association of proteins by first collating an extensively curated literature-derived dataset. This dataset then served to train a probabilistic predictor which integrates gene and protein characteristics. Unlike most previous experimental and computational studies of the nucleolar proteome that produce large static lists of nucleolar proteins regardless of their extent of nucleolar association, our predictor models the fluidity of the nucleolus by considering different classes of nucleolar-associated proteins. The new method predicts all human proteins as either nucleolar-enriched, nucleolar-nucleoplasmic, nucleolar-cytoplasmic or non-nucleolar. Leave-one-out cross validation tests reveal sensitivity values for these four classes ranging from 0.72 to 0.90 and positive predictive values ranging from 0.63 to 0.94. The overall accuracy of the classifier was measured to be 0.85 on an independent literature-based test set and 0.74 using a large independent quantitative proteomics dataset. While the three nucleolar-association groups display vastly different Gene Ontology biological process signatures and evolutionary characteristics, they collectively represent the most well characterised nucleolar functions. Conclusions: Our proteome-wide classification of nucleolar association provides a novel representation of the dynamic content of the nucleolus. 
This model of nucleolar localisation thus increases the coverage while providing accurate and specific annotations of the nucleolar proteome. It will be instrumental in better understanding the central role of the nucleolus in the cell and its interaction with other subcellular compartments. PMID:21272300
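The per-class sensitivity and positive predictive value (PPV) figures quoted for the four classes can be computed from paired true and predicted labels. The sketch below shows the standard definitions on a toy label set; the labels and counts are invented for illustration and are not taken from the PNAC evaluation.

```python
from collections import Counter


def per_class_metrics(true_labels, predicted_labels):
    """Return ({class: {"sensitivity", "ppv"}}, overall_accuracy)."""
    pairs = Counter(zip(true_labels, predicted_labels))
    classes = sorted(set(true_labels) | set(predicted_labels))
    metrics = {}
    for c in classes:
        tp = pairs[(c, c)]
        actual = sum(n for (t, _), n in pairs.items() if t == c)
        predicted = sum(n for (_, p), n in pairs.items() if p == c)
        metrics[c] = {
            # Sensitivity: fraction of true members of the class that were recovered.
            "sensitivity": tp / actual if actual else float("nan"),
            # PPV: fraction of predictions of the class that were correct.
            "ppv": tp / predicted if predicted else float("nan"),
        }
    correct = sum(n for (t, p), n in pairs.items() if t == p)
    return metrics, correct / len(true_labels)


# Toy labels only; not data from the study.
truth = ["enriched", "enriched", "nucleoplasmic", "non", "non", "non"]
preds = ["enriched", "nucleoplasmic", "nucleoplasmic", "non", "non", "enriched"]

metrics, accuracy = per_class_metrics(truth, preds)
for name, values in metrics.items():
    print(name, values)
print("accuracy", accuracy)
```

Reporting both measures per class matters for a multi-class predictor like this one, because a class can be recovered well (high sensitivity) while still attracting many false assignments (low PPV).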
A Proteomics Approach to the Protein Normalization Problem: Selection of Unvarying Proteins for MS-Based Proteomics and Western Blotting.

PubMed

Wiśniewski, Jacek R; Mann, Matthias

2016-07-01

Proteomics and other protein-based analysis methods such as Western blotting all face the challenge of discriminating changes in the levels of proteins of interest from inadvertent changes in the amount loaded for analysis. Mass-spectrometry-based proteomics can now estimate the relative and absolute amounts of thousands of proteins across diverse biological systems. We reasoned that this new technology could prove useful for selection of very stably expressed proteins that could serve as better loading controls than those traditionally employed. Large-scale proteomic analyses of SDS lysates of cultured cells and tissues revealed deglycase DJ-1 as the protein with the lowest variability in abundance among different cell types in human, mouse, and amphibian cells. The protein constitutes 0.069 ± 0.017% of total cellular protein and occurs at a specific concentration of 34.6 ± 8.7 pmol/mg of total protein. Since DJ-1 is ubiquitous and therefore easily detectable with several peptides, it can be helpful in normalization of proteomic data sets. In addition, DJ-1 appears to be an advantageous loading control for Western blot that is superior to those commonly used, allowing comparisons between tissues and cells originating from evolutionarily distant vertebrate species. Notably, this is not possible by the detection and quantitation of housekeeping proteins, which are often used in the Western blot technique. The approach introduced here can be applied to select the most appropriate loading controls for MS-based proteomics or Western blotting in any biological system.
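The two abundance figures given for DJ-1 (a mass fraction in percent and a specific concentration in pmol/mg) are related by the protein's molar mass. The sketch below shows the unit conversion. The molar mass of roughly 19.9 kDa is an assumption brought in for illustration and does not appear in the abstract.

```python
def pmol_per_mg(mass_fraction_percent: float, molar_mass_g_per_mol: float) -> float:
    """Convert a mass fraction of total protein into pmol per mg of total protein."""
    # Grams of the protein contained in 1 mg (1e-3 g) of total protein.
    grams = (mass_fraction_percent / 100.0) * 1e-3
    moles = grams / molar_mass_g_per_mol
    return moles * 1e12  # mol -> pmol


DJ1_MOLAR_MASS = 19_900.0  # g/mol; assumed approximate value, not from the abstract

concentration = pmol_per_mg(0.069, DJ1_MOLAR_MASS)
print(f"{concentration:.1f} pmol/mg")
```

With this assumed molar mass the conversion gives a value close to the reported 34.6 pmol/mg, which suggests the two figures describe the same measurement in different units.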
Aptamer-Based Multiplexed Proteomic Technology for Biomarker Discovery

PubMed Central

Gold, Larry; Ayers, Deborah; Bertino, Jennifer; Bock, Christopher; Bock, Ashley; Brody, Edward N.; Carter, Jeff; Dalby, Andrew B.; Eaton, Bruce E.; Fitzwater, Tim; Flather, Dylan; Forbes, Ashley; Foreman, Trudi; Fowler, Cate; Gawande, Bharat; Goss, Meredith; Gunn, Magda; Gupta, Shashi; Halladay, Dennis; Heil, Jim; Heilig, Joe; Hicke, Brian; Husar, Gregory; Janjic, Nebojsa; Jarvis, Thale; Jennings, Susan; Katilius, Evaldas; Keeney, Tracy R.; Kim, Nancy; Koch, Tad H.; Kraemer, Stephan; Kroiss, Luke; Le, Ngan; Levine, Daniel; Lindsey, Wes; Lollo, Bridget; Mayfield, Wes; Mehan, Mike; Mehler, Robert; Nelson, Sally K.; Nelson, Michele; Nieuwlandt, Dan; Nikrad, Malti; Ochsner, Urs; Ostroff, Rachel M.; Otis, Matt; Parker, Thomas; Pietrasiewicz, Steve; Resnicow, Daniel I.; Rohloff, John; Sanders, Glenn; Sattin, Sarah; Schneider, Daniel; Singer, Britta; Stanton, Martin; Sterkel, Alana; Stewart, Alex; Stratford, Suzanne; Vaught, Jonathan D.; Vrkljan, Mike; Walker, Jeffrey J.; Watrobka, Mike; Waugh, Sheela; Weiss, Allison; Wilcox, Sheri K.; Wolfson, Alexey; Wolk, Steven K.; Zhang, Chi; Zichi, Dom

2010-01-01

Background: The interrogation of proteomes (“proteomics”) in a highly multiplexed and efficient manner remains a coveted and challenging goal in biology and medicine. Methodology/Principal Findings: We present a new aptamer-based proteomic technology for biomarker discovery capable of simultaneously measuring thousands of proteins from small sample volumes (15 µL of serum or plasma). Our current assay measures 813 proteins with low limits of detection (1 pM median), 7 logs of overall dynamic range (∼100 fM–1 µM), and 5% median coefficient of variation. This technology is enabled by a new generation of aptamers that contain chemically modified nucleotides, which greatly expand the physicochemical diversity of the large randomized nucleic acid libraries from which the aptamers are selected. Proteins in complex matrices such as plasma are measured with a process that transforms a signature of protein concentrations into a corresponding signature of DNA aptamer concentrations, which is quantified on a DNA microarray. Our assay takes advantage of the dual nature of aptamers as both folded protein-binding entities with defined shapes and unique nucleotide sequences recognizable by specific hybridization probes. To demonstrate the utility of our proteomics biomarker discovery technology, we applied it to a clinical study of chronic kidney disease (CKD). We identified two well known CKD biomarkers as well as an additional 58 potential CKD biomarkers. These results demonstrate the potential utility of our technology to rapidly discover unique protein signatures characteristic of various disease states. Conclusions/Significance: We describe a versatile and powerful tool that allows large-scale comparison of proteome profiles among discrete populations. 
This unbiased and highly multiplexed search engine will enable the discovery of novel biomarkers in a manner that is unencumbered by our incomplete knowledge of biology, thereby helping to advance the next generation of evidence-based medicine. PMID:21165148
The peripheral blood proteome signature of idiopathic pulmonary fibrosis is distinct from normal and is associated with novel immunological processes.

PubMed

O'Dwyer, David N; Norman, Katy C; Xia, Meng; Huang, Yong; Gurczynski, Stephen J; Ashley, Shanna L; White, Eric S; Flaherty, Kevin R; Martinez, Fernando J; Murray, Susan; Noth, Imre; Arnold, Kelly B; Moore, Bethany B

2017-04-25

Idiopathic pulmonary fibrosis (IPF) is a progressive and fatal interstitial pneumonia. The disease pathophysiology is poorly understood and the etiology remains unclear. Recent advances have generated new therapies and improved knowledge of the natural history of IPF. These gains have been brokered by advances in technology and improved insight into the role of various genes in mediating disease, but gene expression and protein levels do not always correlate. Thus, in this paper we apply a novel large scale high throughput aptamer approach to identify more than 1100 proteins in the peripheral blood of well-characterized IPF patients and normal volunteers. We use systems biology approaches to identify a unique IPF proteome signature and give insight into biological processes driving IPF. We found IPF plasma to be altered and enriched for proteins involved in defense response, wound healing and protein phosphorylation when compared to normal human plasma. Analysis also revealed a minimal protein signature that differentiated IPF patients from normal controls, which may allow for accurate diagnosis of IPF based on easily-accessible peripheral blood. This report introduces large scale unbiased protein discovery analysis to IPF and describes distinct biological processes that further inform disease biology.
Large-Scale Proteome Comparative Analysis of Developing Rhizomes of the Ancient Vascular Plant Equisetum Hyemale

PubMed Central

Balbuena, Tiago Santana; He, Ruifeng; Salvato, Fernanda; Gang, David R.; Thelen, Jay J.

2012-01-01

Horsetail (Equisetum hyemale) is a widespread vascular plant species, whose reproduction is mainly dependent on the growth and development of the rhizomes. Due to its key evolutionary position, the identification of factors that could be involved in the existence of the rhizomatous trait may contribute to a better understanding of the role of this underground organ for the successful propagation of this and other plant species. In the present work, we characterized the proteome of E. hyemale rhizomes using a GeLC-MS spectral-counting proteomics strategy. A total of 1,911 and 1,860 non-redundant proteins were identified in the rhizome apical tip and elongation zone, respectively. Rhizome-characteristic proteins were determined by comparisons of the developing rhizome tissues to developing roots. A total of 87 proteins were found to be up-regulated in both horsetail rhizome tissues in relation to developing roots. Hierarchical clustering indicated a vast dynamic range in the regulation of the 87 characteristic proteins and revealed, based on the regulation profile, the existence of nine major protein groups. Gene ontology analyses suggested an over-representation of the terms involved in macromolecular and protein biosynthetic processes, gene expression, and nucleotide and protein binding functions. Spatial difference analysis between the rhizome apical tip and the elongation zone revealed that only eight proteins were up-regulated in the apical tip including RNA-binding proteins and an acyl carrier protein, as well as a KH domain protein and a T-complex subunit; while only seven proteins were up-regulated in the elongation zone including phosphomannomutase, galactomannan galactosyltransferase, endoglucanase 10 and 25, and mannose-1-phosphate guanyltransferase subunits alpha and beta. This is the first large-scale characterization of the proteome of a plant rhizome. 
Implications of the findings were discussed in relation to other underground organs and related species. PMID:22740841
Quantitative Missense Variant Effect Prediction Using Large-Scale Mutagenesis Data.

PubMed

Gray, Vanessa E; Hause, Ronald J; Luebeck, Jens; Shendure, Jay; Fowler, Douglas M

2018-01-24

Large datasets describing the quantitative effects of mutations on protein function are becoming increasingly available. Here, we leverage these datasets to develop Envision, which predicts the magnitude of a missense variant's molecular effect. Envision combines 21,026 variant effect measurements from nine large-scale experimental mutagenesis datasets, a hitherto untapped training resource, with a supervised, stochastic gradient boosting learning algorithm. Envision outperforms other missense variant effect predictors both on large-scale mutagenesis data and on an independent test dataset comprising 2,312 TP53 variants whose effects were measured using a low-throughput approach. This dataset was never used for hyperparameter tuning or model training and thus serves as an independent validation set. Envision prediction accuracy is also more consistent across amino acids than other predictors. Finally, we demonstrate that Envision's performance improves as more large-scale mutagenesis data are incorporated. We precompute Envision predictions for every possible single amino acid variant in human, mouse, frog, zebrafish, fruit fly, worm, and yeast proteomes (https://envision.gs.washington.edu/). Copyright © 2017 Elsevier Inc. All rights reserved.
Proteomics and Systems Biology: Current and Future Applications in the Nutritional Sciences

PubMed Central

Moore, J. Bernadette; Weeks, Mark E.

2011-01-01

In the last decade, advances in genomics, proteomics, and metabolomics have yielded large-scale datasets that have driven an interest in global analyses, with the objective of understanding biological systems as a whole. Systems biology integrates computational modeling and experimental biology to predict and characterize the dynamic properties of biological systems, which are viewed as complex signaling networks. Whereas the systems analysis of disease-perturbed networks holds promise for identification of drug targets for therapy, equally the identified critical network nodes may be targeted through nutritional intervention in either a preventative or therapeutic fashion. As such, in the context of the nutritional sciences, it is envisioned that systems analysis of normal and nutrient-perturbed signaling networks in combination with knowledge of underlying genetic polymorphisms will lead to a future in which the health of individuals will be improved through predictive and preventative nutrition. Although high-throughput transcriptomic microarray data were initially most readily available and amenable to systems analysis, recent technological and methodological advances in MS have contributed to a linear increase in proteomic investigations. It is now commonplace for combined proteomic technologies to generate complex, multi-faceted datasets, and these will be the keystone of future systems biology research. This review will define systems biology, outline current proteomic methodologies, highlight successful applications of proteomics in nutrition research, and discuss the challenges for future applications of systems biology approaches in the nutritional sciences. PMID:22332076
QC-ART: A tool for real-time quality control assessment of mass spectrometry-based proteomics data.

PubMed

Stanfill, Bryan A; Nakayasu, Ernesto S; Bramer, Lisa M; Thompson, Allison M; Ansong, Charles K; Clauss, Therese; Gritsenko, Marina A; Monroe, Matthew E; Moore, Ronald J; Orton, Daniel J; Piehowski, Paul D; Schepmoes, Athena A; Smith, Richard D; Webb-Robertson, Bobbie-Jo; Metz, Thomas O

2018-04-17

Liquid chromatography-mass spectrometry (LC-MS)-based proteomics studies of large sample cohorts can easily require from months to years to complete. Acquiring consistent, high-quality data in such large-scale studies is challenging because of normal variations in instrumentation performance over time, as well as artifacts introduced by the samples themselves, such as those due to collection, storage and processing. Existing quality control methods for proteomics data primarily focus on post-hoc analysis to remove low-quality data that would degrade downstream statistics; they are not designed to evaluate the data in near real-time, which would allow for interventions as soon as deviations in data quality are detected. In addition to flagging analyses that demonstrate outlier behavior, evaluating how the data structure changes over time can aid in understanding typical instrument performance or identify issues such as a degradation in data quality due to the need for instrument cleaning and/or re-calibration. To address this gap for proteomics, we developed Quality Control Analysis in Real-Time (QC-ART), a tool for evaluating data as they are acquired in order to dynamically flag potential issues with instrument performance or sample quality. QC-ART has accuracy similar to that of standard post-hoc analysis methods, with the additional benefit of real-time analysis. We demonstrate the utility and performance of QC-ART in identifying deviations in data quality due to both instrument and sample issues in near real-time for LC-MS-based plasma proteomics analyses of a sample subset of The Environmental Determinants of Diabetes in the Young cohort. We also present a case where QC-ART facilitated the identification of oxidative modifications, which are often underappreciated in proteomic experiments. Published under license by The American Society for Biochemistry and Molecular Biology, Inc.
The Use of Weighted Graphs for Large-Scale Genome Analysis

PubMed Central

Zhou, Fang; Toivonen, Hannu; King, Ross D.

2014-01-01

There is an acute need for better tools to extract knowledge from the growing flood of sequence data. For example, thousands of complete genomes have been sequenced, and their metabolic networks inferred. Such data should enable a better understanding of evolution. However, most existing network analysis methods are based on pair-wise comparisons, and these do not scale to thousands of genomes. Here we propose the use of weighted graphs as a data structure to enable large-scale phylogenetic analysis of networks. We have developed three types of weighted graph for enzymes: taxonomic (these summarize phylogenetic importance), isoenzymatic (these summarize enzymatic variety/redundancy), and sequence-similarity (these summarize sequence conservation); and we applied these types of weighted graph to survey prokaryotic metabolism. To demonstrate the utility of this approach we have compared and contrasted the large-scale evolution of metabolism in Archaea and Eubacteria. Our results provide evidence for limits to the contingency of evolution. PMID:24619061
Proteomics meets blue biotechnology: a wealth of novelties and opportunities.

PubMed

Hartmann, Erica M; Durighello, Emie; Pible, Olivier; Nogales, Balbina; Beltrametti, Fabrizio; Bosch, Rafael; Christie-Oleza, Joseph A; Armengaud, Jean

2014-10-01

Blue biotechnology, in which aquatic environments provide the inspiration for various products such as food additives, aquaculture, biosensors, green chemistry, bioenergy, and pharmaceuticals, holds enormous promise. Large-scale efforts to sequence aquatic genomes and metagenomes, as well as campaigns to isolate new organisms and culture-based screenings, are helping to push the boundaries of known organisms. Mass spectrometry-based proteomics can complement 16S gene sequencing in the effort to discover new organisms of potential relevance to blue biotechnology by facilitating the rapid screening of microbial isolates and by providing in depth profiles of the proteomes and metaproteomes of marine organisms, both model cultivable isolates and, more recently, exotic non-cultivable species and communities. Proteomics has already contributed to blue biotechnology by identifying aquatic proteins with potential applications to food fermentation, the textile industry, and biomedical drug development. In this review, we discuss historical developments in blue biotechnology, the current limitations to the known marine biosphere, and the ways in which mass spectrometry can expand that knowledge. We further speculate about directions that research in blue biotechnology will take given current and near-future technological advancements in mass spectrometry. Copyright © 2014 Elsevier B.V. All rights reserved.
Proteomics in the genome engineering era.

PubMed

Vandemoortele, Giel; Gevaert, Kris; Eyckerman, Sven

2016-01-01

Genome engineering experiments used to be lengthy, inefficient, and often expensive, preventing a widespread adoption of such experiments for the full assessment of endogenous protein functions. With the revolutionary clustered regularly interspaced short palindromic repeats/CRISPR-associated protein 9 technology, genome engineering became accessible to the broad life sciences community and is now implemented in several research areas. One particular field that can benefit significantly from this evolution is proteomics where a substantial impact on experimental design and general proteome biology can be expected. In this review, we describe the main applications of genome engineering in proteomics, including the use of engineered disease models and endogenous epitope tagging. In addition, we provide an overview on current literature and highlight important considerations when launching genome engineering technologies in proteomics workflows. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Quantitative proteomic analysis reveals posttranslational responses to aneuploidy in yeast

PubMed Central

Dephoure, Noah; Hwang, Sunyoung; O'Sullivan, Ciara; Dodgson, Stacie E; Gygi, Steven P; Amon, Angelika; Torres, Eduardo M

2014-01-01

Aneuploidy causes severe developmental defects and is a near universal feature of tumor cells. Despite its profound effects, the cellular processes affected by aneuploidy are not well characterized. Here, we examined the consequences of aneuploidy on the proteome of aneuploid budding yeast strains. We show that although protein levels largely scale with gene copy number, subunits of multi-protein complexes are notable exceptions. Posttranslational mechanisms attenuate their expression when their encoding genes are in excess. Our proteomic analyses further revealed a novel aneuploidy-associated protein expression signature characteristic of altered metabolism and redox homeostasis. Indeed aneuploid cells harbor increased levels of reactive oxygen species (ROS). Interestingly, increased protein turnover attenuates ROS levels and this novel aneuploidy-associated signature and improves the fitness of most aneuploid strains. Our results show that aneuploidy causes alterations in metabolism and redox homeostasis. Cells respond to these alterations through both transcriptional and posttranscriptional mechanisms. DOI: http://dx.doi.org/10.7554/eLife.03023.001 PMID:25073701

Principles of proteome allocation are revealed using proteomic data and genome-scale models

PubMed Central

Yang, Laurence; Yurkovich, James T.; Lloyd, Colton J.; Ebrahim, Ali; Saunders, Michael A.; Palsson, Bernhard O.

2016-01-01

Integrating omics data to refine or make context-specific models is an active field of constraint-based modeling. Proteomics now cover over 95% of the Escherichia coli proteome by mass. Genome-scale models of Metabolism and macromolecular Expression (ME) compute proteome allocation linked to metabolism and fitness. Using proteomics data, we formulated allocation constraints for key proteome sectors in the ME model. The resulting calibrated model effectively computed the “generalist” (wild-type) E. coli proteome and phenotype across diverse growth environments. Across 15 growth conditions, prediction errors for growth rate and metabolic fluxes were 69% and 14% lower, respectively. The sector-constrained ME model thus represents a generalist ME model reflecting both growth rate maximization and “hedging” against uncertain environments and stresses, as indicated by significant enrichment of these sectors for the general stress response sigma factor σS. Finally, the sector constraints represent a general formalism for integrating omics data from any experimental condition into constraint-based ME models. The constraints can be fine-grained (individual proteins) or coarse-grained (functionally-related protein groups) as demonstrated here. This flexible formalism provides an accessible approach for narrowing the gap between the complexity captured by omics data and governing principles of proteome allocation described by systems-level models. PMID:27857205
Principles of proteome allocation are revealed using proteomic data and genome-scale models

DOE PAGES

Yang, Laurence; Yurkovich, James T.; Lloyd, Colton J.; ...

2016-11-18

Integrating omics data to refine or make context-specific models is an active field of constraint-based modeling. Proteomics now cover over 95% of the Escherichia coli proteome by mass. Genome-scale models of Metabolism and macromolecular Expression (ME) compute proteome allocation linked to metabolism and fitness. Using proteomics data, we formulated allocation constraints for key proteome sectors in the ME model. The resulting calibrated model effectively computed the “generalist” (wild-type) E. coli proteome and phenotype across diverse growth environments. Across 15 growth conditions, prediction errors for growth rate and metabolic fluxes were 69% and 14% lower, respectively. The sector-constrained ME model thus represents a generalist ME model reflecting both growth rate maximization and “hedging” against uncertain environments and stresses, as indicated by significant enrichment of these sectors for the general stress response sigma factor σS. Finally, the sector constraints represent a general formalism for integrating omics data from any experimental condition into constraint-based ME models. The constraints can be fine-grained (individual proteins) or coarse-grained (functionally-related protein groups) as demonstrated here. Furthermore, this flexible formalism provides an accessible approach for narrowing the gap between the complexity captured by omics data and governing principles of proteome allocation described by systems-level models.
Quantitative Proteomics Reveals Temporal Proteomic Changes in Signaling Pathways during BV2 Mouse Microglial Cell Activation.

PubMed

Woo, Jongmin; Han, Dohyun; Wang, Joseph Injae; Park, Joonho; Kim, Hyunsoo; Kim, Youngsoo

2017-09-01

The development of systematic proteomic quantification techniques in systems biology research has enabled one to perform an in-depth analysis of cellular systems. We have developed a systematic proteomic approach that encompasses the spectrum from global to targeted analysis on a single platform. We have applied this technique to an activated microglia cell system to examine changes in the intracellular and extracellular proteomes. Microglia become activated when their homeostatic microenvironment is disrupted. There are varying degrees of microglial activation, and we chose to focus on the proinflammatory reactive state that is induced by exposure to such stimuli as lipopolysaccharide (LPS) and interferon-gamma (IFN-γ). Using an improved shotgun proteomics approach, we identified 5497 proteins in the whole-cell proteome and 4938 proteins in the secretome that were associated with the activation of BV2 mouse microglia by LPS or IFN-γ. Of the differentially expressed proteins in stimulated microglia, we classified pathways that were related to immune-inflammatory responses and metabolism. Our label-free parallel reaction monitoring (PRM) approach made it possible to comprehensively measure the hyper-multiplex quantitative value of each protein by high-resolution mass spectrometry. Over 450 peptides that corresponded to pathway proteins and direct or indirect interactors via the STRING database were quantified by label-free PRM in a single run. Moreover, we performed a longitudinal quantification of secreted proteins during microglial activation, in which neurotoxic molecules that mediate neuronal cell loss in the brain are released. These data suggest that latent pathways that are associated with neurodegenerative diseases can be discovered by constructing and analyzing a pathway network model of proteins. Furthermore, this systematic quantification platform has tremendous potential for applications in large-scale targeted analyses. 
The proteomics data for discovery and label-free PRM analysis have been deposited to the ProteomeXchange Consortium with identifiers and , respectively.
Identification of cypermethrin induced protein changes in green algae by iTRAQ quantitative proteomics.

PubMed

Gao, Yan; Lim, Teck Kwang; Lin, Qingsong; Li, Sam Fong Yau

2016-04-29

Cypermethrin (CYP) is one of the most widely used pesticides for large-scale agricultural and domestic purposes, and its residue often seriously affects aquatic systems. Environmental pollutant-induced protein changes in organisms can be detected by proteomics, leading to the discovery of potential biomarkers and an understanding of the mode of action. While CYP stress has been well studied by proteomics in some animal models, few reports on the effects of CYP exposure on the algal proteome have been published. To determine the effect of CYP on algae, the impact of various dosages (0.001 μg/L, 0.01 μg/L and 1 μg/L) of CYP on the green alga Chlorella vulgaris over 24 h and 96 h was investigated using the iTRAQ quantitative proteomics technique. A total of 162 and 198 proteins were significantly altered after CYP exposure for 24 h and 96 h, respectively. An overview of the iTRAQ results indicated that the influence of CYP on algal proteins might be dosage-dependent. Functional analysis of differentially expressed proteins showed that CYP could induce protein alterations related to photosynthesis, stress responses and carbohydrate metabolism. This study provides a comprehensive view of the complex mode of action in algae under CYP stress and highlights several potential biomarkers for further investigation of pesticide-exposed plants and algae. Copyright © 2016 Elsevier B.V. All rights reserved.
Classification of Complete Proteomes of Different Organisms and Protein Sets Based on Their Protein Distributions in Terms of Some Key Attributes of Proteins

DOE PAGES

Guo, Hao-Bo; Ma, Yue; Tuskan, Gerald A.; ...

2018-01-01

The existence of complete genome sequences makes it important to develop different approaches for classification of large-scale data sets and to make extraction of biological insights easier. Here, we propose an approach for classification of complete proteomes/protein sets based on protein distributions on some basic attributes. We demonstrate the usefulness of this approach by determining protein distributions in terms of two attributes: protein lengths (L) and protein intrinsic disorder contents (ID). The protein distributions based on L and ID are surveyed for representative proteome organisms and protein sets from the three domains of life. The two-dimensional maps (designated as fingerprints here) from the protein distribution densities in the LD space defined by ln(L) and ID are then constructed. The fingerprints for different organisms and protein sets are found to be distinct from each other, and they can therefore be used for comparative studies. As a test case, phylogenetic trees have been constructed based on the protein distribution densities in the fingerprints of proteomes of organisms without performing any protein sequence comparison and alignments. The phylogenetic trees generated are biologically meaningful, demonstrating that the protein distributions in the LD space may serve as unique phylogenetic signals of the organisms at the proteome level.
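The fingerprint construction described in this abstract can be illustrated with a minimal sketch. It assumes each protein is given as a (length, disorder fraction) pair with the disorder fraction in [0, 1]; the bin counts, the ln(L) range, and the Euclidean distance used to compare two fingerprints are illustrative assumptions, not the authors' published parameters or metric.

```python
import math

def fingerprint(proteins, ln_len_range=(3.0, 9.0), n_len_bins=6, n_id_bins=5):
    """Build a normalized 2D density over (ln(length), disorder fraction).

    proteins: iterable of (length, disorder_fraction) pairs.
    The range and bin counts are assumed values chosen for illustration.
    """
    lo, hi = ln_len_range
    grid = [[0.0] * n_id_bins for _ in range(n_len_bins)]
    total = 0
    for length, disorder in proteins:
        # Position along the ln(L) axis, scaled to [0, 1] and clamped to the grid.
        x = (math.log(length) - lo) / (hi - lo)
        i = min(max(int(x * n_len_bins), 0), n_len_bins - 1)
        j = min(max(int(disorder * n_id_bins), 0), n_id_bins - 1)
        grid[i][j] += 1
        total += 1
    if total:
        # Convert counts to densities so proteomes of different size are comparable.
        grid = [[count / total for count in row] for row in grid]
    return grid

def fingerprint_distance(a, b):
    """Euclidean distance between two fingerprints (an assumed comparison metric)."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))

# Hypothetical toy proteomes, for illustration only.
proteome_a = [(100, 0.1), (300, 0.2), (1000, 0.6)]
proteome_b = [(50, 0.9), (60, 0.8)]
print(fingerprint_distance(fingerprint(proteome_a), fingerprint(proteome_b)))
```

A matrix of such pairwise distances could then be passed to any standard tree-building method; which method the authors used is not stated in the abstract.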
Proteomic Approaches to Quantify Cysteine Reversible Modifications in Aging and Neurodegenerative Diseases

PubMed Central

Gu, Liqing; Robinson, Renã A. S.

2016-01-01

Cysteine is a highly reactive amino acid and is subject to a variety of reversible post-translational modifications (PTMs), including nitrosylation, glutathionylation, palmitoylation, as well as formation of sulfenic acid and disulfides. These modifications are not only involved in normal biological activities, such as enzymatic catalysis, redox signaling and cellular homeostasis, but can also be the result of oxidative damage. Especially in aging and neurodegenerative diseases, oxidative stress leads to aberrant cysteine oxidations that affect protein structure and function, leading to neurodegeneration as well as other detrimental effects. Methods that can identify cysteine modifications by type, including the site of modification, as well as the relative stoichiometry of the modification, can be very helpful for understanding the role of the thiol proteome and redox homeostasis in the context of disease. Reversible cysteine modifications, however, are challenging to investigate because they are of low abundance, diverse, and labile, especially under endogenous conditions. Thanks to the development of redox proteomic approaches, large-scale quantification of reversible cysteine modifications is possible. These approaches cover a range of strategies to enrich, identify, and quantify reversible cysteine modifications from biological samples. This review will focus on nongel-based redox proteomics workflows that give quantitative information about cysteine PTMs and highlight how these strategies have been useful for investigating the redox thiol proteome in aging and neurodegenerative diseases. PMID:27666938
Advances in targeted proteomics and applications to biomedical research

PubMed Central

Shi, Tujin; Song, Ehwang; Nie, Song; Rodland, Karin D.; Liu, Tao; Qian, Wei-Jun; Smith, Richard D.

2016-01-01

Targeted proteomics has emerged as a powerful protein quantification tool in systems biology and biomedical research, and increasingly in clinical applications. The most widely used targeted proteomics approach, selected reaction monitoring (SRM), also known as multiple reaction monitoring (MRM), can be used for quantification of cellular signaling networks and preclinical verification of candidate protein biomarkers. As an extension to our previous review on advances in SRM sensitivity, herein we review recent advances in methods and technology for further enhancing SRM sensitivity (from 2012 to present) and highlight its broad biomedical applications in human bodily fluids, tissue and cell lines. Furthermore, we also review two recently introduced targeted proteomics approaches, parallel reaction monitoring (PRM) and data-independent acquisition (DIA) with targeted data extraction on fast scanning high-resolution accurate-mass (HR/AM) instruments. Such HR/AM targeted quantification, which monitors all target product ions, effectively addresses SRM limitations in specificity and multiplexing; however, compared with SRM, PRM and DIA are still in their infancy, with a limited number of applications. Thus, for HR/AM targeted quantification we focus our discussion on method development, data processing and analysis, and its advantages and limitations in targeted proteomics. Finally, general perspectives on the potential of achieving both high sensitivity and high sample throughput for large-scale quantification of hundreds of target proteins are discussed. PMID:27302376
An object model and database for functional genomics.

PubMed

Jones, Andrew; Hunt, Ela; Wastling, Jonathan M; Pizarro, Angel; Stoeckert, Christian J

2004-07-10

Large-scale functional genomics analysis is now feasible and presents significant challenges in data analysis, storage and querying. Data standards are required to enable the development of public data repositories and to improve data sharing. There is an established data format for microarrays (microarray gene expression markup language, MAGE-ML) and a draft standard for proteomics (PEDRo). We believe that all types of functional genomics experiments should be annotated in a consistent manner, and we hope to open up new ways of comparing multiple datasets used in functional genomics. We have created a functional genomics experiment object model (FGE-OM), developed from the microarray model, MAGE-OM and two models for proteomics, PEDRo and our own model (Gla-PSI-Glasgow Proposal for the Proteomics Standards Initiative). FGE-OM comprises three namespaces representing (i) the parts of the model common to all functional genomics experiments; (ii) microarray-specific components; and (iii) proteomics-specific components. We believe that FGE-OM should initiate discussion about the contents and structure of the next version of MAGE and the future of proteomics standards. A prototype database called RNA And Protein Abundance Database (RAPAD), based on FGE-OM, has been implemented and populated with data from microbial pathogenesis. FGE-OM and the RAPAD schema are available from http://www.gusdb.org/fge.html, along with a set of more detailed diagrams. RAPAD can be accessed by registration at the site.
Classification of Complete Proteomes of Different Organisms and Protein Sets Based on Their Protein Distributions in Terms of Some Key Attributes of Proteins

DOE Office of Scientific and Technical Information (OSTI.GOV)

Guo, Hao-Bo; Ma, Yue; Tuskan, Gerald A.

The existence of complete genome sequences makes it important to develop different approaches for classification of large-scale data sets and to make extraction of biological insights easier. Here, we propose an approach for classification of complete proteomes/protein sets based on protein distributions on some basic attributes. We demonstrate the usefulness of this approach by determining protein distributions in terms of two attributes: protein lengths (L) and protein intrinsic disorder contents (ID). The protein distributions based on L and ID are surveyed for representative proteome organisms and protein sets from the three domains of life. The two-dimensional maps (designated as fingerprints here) from the protein distribution densities in the LD space defined by ln(L) and ID are then constructed. The fingerprints for different organisms and protein sets are found to be distinct from each other, and they can therefore be used for comparative studies. As a test case, phylogenetic trees have been constructed based on the protein distribution densities in the fingerprints of proteomes of organisms without performing any protein sequence comparison and alignments. The phylogenetic trees generated are biologically meaningful, demonstrating that the protein distributions in the LD space may serve as unique phylogenetic signals of the organisms at the proteome level.
Genomics, transcriptomics and proteomics: enabling insights into social evolution and disease challenges for managed and wild bees.

PubMed

Trapp, Judith; McAfee, Alison; Foster, Leonard J

2017-02-01

Globally, there are over 20 000 bee species (Hymenoptera: Apoidea: Anthophila) with a host of biologically fascinating characteristics. Although they have long been studied as models for social evolution, recent challenges to bee health (mainly diseases and pesticides) have attracted the attention of both public and research communities. Genome sequences of twelve bee species are now complete or in progress, facilitating the application of additional 'omic technologies. Here, we review recent developments in honey bee and native bee research in the genomic era. We discuss the progress in genome sequencing and functional annotation, followed by the enabled comparative genomics, proteomics and transcriptomics applications regarding social evolution and health. Finally, we end with comments on future challenges in the postgenomic era. © 2016 John Wiley & Sons Ltd.
hEIDI: An Intuitive Application Tool To Organize and Treat Large-Scale Proteomics Data.

PubMed

Hesse, Anne-Marie; Dupierris, Véronique; Adam, Claire; Court, Magali; Barthe, Damien; Emadali, Anouk; Masselon, Christophe; Ferro, Myriam; Bruley, Christophe

2016-10-07

Advances in high-throughput proteomics have led to a rapid increase in the number, size, and complexity of the associated data sets. Managing and extracting reliable information from such large series of data sets require the use of dedicated software organized in a consistent pipeline to reduce, validate, exploit, and ultimately export data. The compilation of multiple mass-spectrometry-based identification and quantification results obtained in the context of a large-scale project represents a real challenge for developers of bioinformatics solutions. In response to this challenge, we developed a dedicated software suite called hEIDI to manage and combine both identifications and semiquantitative data related to multiple LC-MS/MS analyses. This paper describes how, through a user-friendly interface, hEIDI can be used to compile analyses and retrieve lists of nonredundant protein groups. Moreover, hEIDI allows direct comparison of series of analyses, on the basis of protein groups, while ensuring consistent protein inference and also computing spectral counts. hEIDI ensures that validated results are compliant with MIAPE guidelines as all information related to samples and results is stored in appropriate databases. Thanks to the database structure, validated results generated within hEIDI can be easily exported in the PRIDE XML format for subsequent publication. hEIDI can be downloaded from http://biodev.extra.cea.fr/docs/heidi .
Guidelines for reporting quantitative mass spectrometry based experiments in proteomics.

PubMed

Martínez-Bartolomé, Salvador; Deutsch, Eric W; Binz, Pierre-Alain; Jones, Andrew R; Eisenacher, Martin; Mayer, Gerhard; Campos, Alex; Canals, Francesc; Bech-Serra, Joan-Josep; Carrascal, Montserrat; Gay, Marina; Paradela, Alberto; Navajas, Rosana; Marcilla, Miguel; Hernáez, María Luisa; Gutiérrez-Blázquez, María Dolores; Velarde, Luis Felipe Clemente; Aloria, Kerman; Beaskoetxea, Jabier; Medina-Aunon, J Alberto; Albar, Juan P

2013-12-16

Mass spectrometry is already a well-established protein identification tool, and recent methodological and technological developments have also made possible the extraction of quantitative data of protein abundance in large-scale studies. Several strategies for absolute and relative quantitative proteomics and the statistical assessment of quantifications are possible, each having specific measurements and, therefore, different data analysis workflows. The guidelines for Mass Spectrometry Quantification allow the description of a wide range of quantitative approaches, including labeled and label-free techniques and also targeted approaches such as Selected Reaction Monitoring (SRM). The HUPO Proteomics Standards Initiative (HUPO-PSI) has invested considerable efforts to improve the standardization of proteomics data handling, representation and sharing through the development of data standards, reporting guidelines, controlled vocabularies and tooling. In this manuscript, we describe a key output from the HUPO-PSI, namely the MIAPE Quant guidelines, which have been developed in parallel with the corresponding data exchange format mzQuantML [1]. The MIAPE Quant guidelines describe the HUPO-PSI proposal concerning the minimum information to be reported when a quantitative data set, derived from mass spectrometry (MS), is submitted to a database or as supplementary information to a journal. The guidelines have been developed with input from a broad spectrum of stakeholders in the proteomics field to represent a true consensus view of the most important data types and metadata required for a quantitative experiment to be analyzed critically or a data analysis pipeline to be reproduced. It is anticipated that they will influence or be directly adopted as part of journal guidelines for publication and by public proteomics databases and thus may have an impact on proteomics laboratories across the world.
This article is part of a Special Issue entitled: Standardization and Quality Control. Copyright © 2013 Elsevier B.V. All rights reserved.
Direct Detection of Alternative Open Reading Frames Translation Products in Human Significantly Expands the Proteome

PubMed Central

Vanderperre, Benoît; Lucier, Jean-François; Bissonnette, Cyntia; Motard, Julie; Tremblay, Guillaume; Vanderperre, Solène; Wisztorski, Maxence; Salzet, Michel; Boisvert, François-Michel; Roucou, Xavier

2013-01-01

A fully mature mRNA is usually associated with a reference open reading frame encoding a single protein. Yet, mature mRNAs contain unconventional alternative open reading frames (AltORFs) located in untranslated regions (UTRs) or overlapping the reference ORFs (RefORFs) in non-canonical +2 and +3 reading frames. Although recent ribosome profiling and footprinting approaches have suggested the significant use of unconventional translation initiation sites in mammals, direct evidence of large-scale alternative protein expression at the proteome level is still lacking. To determine the contribution of alternative proteins to the human proteome, we generated a database of predicted human AltORFs revealing a new proteome mainly composed of small proteins with a median length of 57 amino acids, compared to 344 amino acids for the reference proteome. We experimentally detected a total of 1,259 alternative proteins by mass spectrometry analyses of human cell lines, tissues and fluids. In plasma and serum, alternative proteins represent up to 55% of the proteome and may be a potential unsuspected new source for biomarkers. We observed constitutive co-expression of RefORFs and AltORFs from endogenous genes and from transfected cDNAs, including tumor suppressor p53, and provide evidence that out-of-frame clones representing AltORFs are mistakenly rejected as false positives in cDNA screening assays. Functional importance of alternative proteins is strongly supported by significant evolutionary conservation in vertebrates, invertebrates, and yeast. Our results imply that coding of multiple proteins in a single gene by the use of AltORFs may be a common feature in eukaryotes, and confirm that translation of unconventional ORFs generates an as yet unexplored proteome. PMID:23950983
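The notion of AltORFs in the +2 and +3 frames can be made concrete with a small sketch that scans an mRNA in its three forward frames. It assumes ATG-only initiation, the standard three stop codons, and an arbitrary minimum length; the function names and thresholds are hypothetical and are not the prediction criteria the authors used to build their database.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(mrna, min_codons=2):
    """Return (frame, start, end) for ATG-initiated ORFs in the three forward frames.

    frame is 1, 2 or 3 relative to the first nucleotide of the sequence;
    start and end are 0-based, end is exclusive and includes the stop codon.
    min_codons counts codons before the stop and is an assumed cutoff.
    """
    seq = mrna.upper().replace("U", "T")
    orfs = []
    for offset in range(3):
        start = None
        for i in range(offset, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((offset + 1, start, i + 3))
                start = None  # resume scanning after the stop codon
    return orfs

def relative_frame(ref_start, alt_start):
    """Frame of an ORF relative to a reference ORF start (1 = same frame, 2 = +2, 3 = +3)."""
    return (alt_start - ref_start) % 3 + 1

# Hypothetical toy sequence: a reference ORF at position 0 and an
# overlapping ORF starting at position 5.
for frame, start, end in find_orfs("ATGGCATGACCTGATAA"):
    print(frame, start, end, "relative frame:", relative_frame(0, start))
```

ORFs whose relative frame is 2 or 3 and whose coordinates overlap the reference ORF correspond to the overlapping AltORFs described above; ORFs lying entirely before or after the reference ORF would fall in the UTRs.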
Spectrum-to-Spectrum Searching Using a Proteome-wide Spectral Library

PubMed Central

Yen, Chia-Yu; Houel, Stephane; Ahn, Natalie G.; Old, William M.

2011-01-01

The unambiguous assignment of tandem mass spectra (MS/MS) to peptide sequences remains a key unsolved problem in proteomics. Spectral library search strategies have emerged as a promising alternative for peptide identification, in which MS/MS spectra are directly compared against a reference library of confidently assigned spectra. Two problems relate to library size. First, reference spectral libraries are limited to rediscovery of previously identified peptides and are not applicable to new peptides, because of their incomplete coverage of the human proteome. Second, problems arise when searching a spectral library the size of the entire human proteome. We observed that traditional dot product scoring methods do not scale well with spectral library size, showing reduction in sensitivity when library size is increased. We show that this problem can be addressed by optimizing scoring metrics for spectrum-to-spectrum searches with large spectral libraries. MS/MS spectra for the 1.3 million predicted tryptic peptides in the human proteome are simulated using a kinetic fragmentation model (MassAnalyzer version 2.1) to create a proteome-wide simulated spectral library. Searches of the simulated library increase MS/MS assignments by 24% compared with Mascot, when using probabilistic and rank based scoring methods. The proteome-wide coverage of the simulated library leads to 11% increase in unique peptide assignments, compared with parallel searches of a reference spectral library. Further improvement is attained when reference spectra and simulated spectra are combined into a hybrid spectral library, yielding 52% increased MS/MS assignments compared with Mascot searches. Our study demonstrates the advantages of using probabilistic and rank based scores to improve performance of spectrum-to-spectrum search strategies. PMID:21532008
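For readers unfamiliar with the traditional dot product scoring mentioned in this abstract, the following minimal sketch shows one common form: a normalized dot product (cosine similarity) between binned spectra. The 1.0 m/z bin width, the function names, and the toy library are assumptions for illustration; this is the baseline score the authors compare against, not their probabilistic or rank based scores.

```python
import math

def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (bin_width is an assumed value)."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz // bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned

def normalized_dot_product(query, reference, bin_width=1.0):
    """Cosine similarity between two spectra given as lists of (m/z, intensity)."""
    q = bin_spectrum(query, bin_width)
    r = bin_spectrum(reference, bin_width)
    dot = sum(value * r.get(key, 0.0) for key, value in q.items())
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in r.values())))
    return dot / norm if norm else 0.0

def rank_library(query, library, bin_width=1.0):
    """Score the query against every library entry and sort best-first."""
    scored = [(normalized_dot_product(query, peaks, bin_width), name)
              for name, peaks in library.items()]
    return sorted(scored, reverse=True)

# Hypothetical two-entry library, for illustration only.
library = {
    "peptide_A": [(100.3, 8.0), (200.4, 4.0)],
    "peptide_B": [(150.0, 10.0)],
}
query = [(100.2, 10.0), (200.5, 5.0)]
print(rank_library(query, library))
```

Because every library entry receives a score, a larger library offers more chances for an unrelated spectrum to score highly by coincidence, which is one plausible reading of the sensitivity loss the abstract reports as library size grows.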
How many proteins can be identified in a 2DE gel spot within an analysis of a complex human cancer tissue proteome?

PubMed

Zhan, Xianquan; Yang, Haiyan; Peng, Fang; Li, Jianglin; Mu, Yun; Long, Ying; Cheng, Tingting; Huang, Yuda; Li, Zhao; Lu, Miaolong; Li, Na; Li, Maoyu; Liu, Jianping; Jungblut, Peter R

2018-04-01

In proteomics, each two-dimensional gel electrophoresis (2DE) spot is traditionally assumed to contain only one or two proteins. However, 2DE resolution is being complemented by the rapid development of high sensitivity mass spectrometers. Here we compared MALDI-MS, LC-Q-TOF MS and LC-Orbitrap Velos MS for the identification of proteins within one spot. With LC-Orbitrap Velos MS each Coomassie Blue-stained 2DE spot contained an average of at least 42 and 63 proteins/spot in an analysis of a human glioblastoma proteome and a human pituitary adenoma proteome, respectively, if a single gel spot was analyzed. If a pool of three matched gel spots was analyzed this number further increased up to an average of 230 and 118 proteins/spot for glioblastoma and pituitary adenoma proteome, respectively. Multiple proteins per spot confirm the necessity of isotopic labeling in large-scale quantification of different protein species in a proteome. Furthermore, a protein abundance analysis revealed that most of the identified proteins in each analyzed 2DE spot were low-abundance proteins. Many proteins were present in several of the analyzed spots, showing the ability of 2DE-MS to separate at the protein species level. Therefore, 2DE coupled with high-sensitivity LC-MS has a clearly higher sensitivity than previously expected for detecting, identifying and quantifying low-abundance proteins in a complex human proteome, with an estimated resolution of about 500 000 protein species. This clearly exceeds the resolution power of bottom-up LC-MS investigations. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Comparison of Collisional and Electron-Based Dissociation Modes for Middle-Down Analysis of Multiply Glycosylated Peptides

NASA Astrophysics Data System (ADS)

Khatri, Kshitij; Pu, Yi; Klein, Joshua A.; Wei, Juan; Costello, Catherine E.; Lin, Cheng; Zaia, Joseph

2018-04-01

Analysis of singly glycosylated peptides has evolved to a point where large-scale LC-MS analyses can be performed at almost the same scale as proteomics experiments. While collisionally activated dissociation (CAD) remains the mainstay of bottom-up analyses, it performs poorly for the middle-down analysis of multiply glycosylated peptides. With improvements in instrumentation, electron-activated dissociation (ExD) modes are becoming increasingly prevalent for proteomics experiments and for the analysis of fragile modifications such as glycosylation. While these methods have been applied for glycopeptide analysis in isolated studies, an organized effort to compare their efficiencies, particularly for analysis of multiply glycosylated peptides (termed here middle-down glycoproteomics), has not been made. We therefore compared the performance of different ExD modes for middle-down glycopeptide analyses. We identified key features among the different dissociation modes and show that increased electron energy and supplemental activation provide the most useful data for middle-down glycopeptide analysis.
A Life-Cycle Model of Human Social Groups Produces a U-Shaped Distribution in Group Size

PubMed Central

Salali, Gul Deniz; Whitehouse, Harvey; Hochberg, Michael E.

2015-01-01

One of the central puzzles in the study of sociocultural evolution is how and why transitions from small-scale human groups to large-scale, hierarchically more complex ones occurred. Here we develop a spatially explicit agent-based model as a first step towards understanding the ecological dynamics of small and large-scale human groups. By analogy with the interactions between single-celled and multicellular organisms, we build a theory of group lifecycles as an emergent property of single cell demographic and expansion behaviours. We find that once the transition from small-scale to large-scale groups occurs, a few large-scale groups continue expanding while small-scale groups gradually become scarcer, and large-scale groups become larger in size and fewer in number over time. Demographic and expansion behaviours of groups are largely influenced by the distribution and availability of resources. Our results conform to a pattern of human political change in which religions and nation states come to be represented by a few large units and many smaller ones. Future enhancements of the model should include decision-making rules and probabilities of fragmentation for large-scale societies. We suggest that the synthesis of population ecology and social evolution will generate increasingly plausible models of human group dynamics. PMID:26381745
Identification of Phosphorylated Proteins on a Global Scale.

PubMed

Iliuk, Anton

2018-05-31

Liquid chromatography (LC) coupled with tandem mass spectrometry (MS/MS) has enabled researchers to analyze complex biological samples with unprecedented depth. It facilitates the identification and quantification of modifications within thousands of proteins in a single large-scale proteomic experiment. Analysis of phosphorylation, one of the most common and important post-translational modifications, has particularly benefited from such progress in the field. Here, detailed protocols are provided for a few well-regarded, common sample preparation methods for an effective phosphoproteomic experiment. © 2018 by John Wiley & Sons, Inc.
Interplay between Chaperones and Protein Disorder Promotes the Evolution of Protein Networks

PubMed Central

Pechmann, Sebastian; Frydman, Judith

2014-01-01

Evolution is driven by mutations, which lead to new protein functions but come at a cost to protein stability. Non-conservative substitutions are of interest in this regard because they may most profoundly affect both function and stability. Accordingly, organisms must balance the benefit of accepting advantageous substitutions with the possible cost of deleterious effects on protein folding and stability. We here examine factors that systematically promote non-conservative mutations at the proteome level. Intrinsically disordered regions in proteins play pivotal roles in protein interactions, but many questions regarding their evolution remain unanswered. Similarly, whether and how molecular chaperones, which have been shown to buffer destabilizing mutations in individual proteins, generally provide robustness during proteome evolution remains unclear. To this end, we introduce an evolutionary parameter λ that directly estimates the rate of non-conservative substitutions. Our analysis of λ in Escherichia coli, Saccharomyces cerevisiae, and Homo sapiens sequences reveals how co- and post-translationally acting chaperones differentially promote non-conservative substitutions in their substrates, likely through buffering of their destabilizing effects. We further find that λ serves well to quantify the evolution of intrinsically disordered proteins even though the unstructured, thus generally variable regions in proteins are often flanked by very conserved sequences. Crucially, we show that both intrinsically disordered proteins and highly re-wired proteins in protein interaction networks, which have evolved new interactions and functions, exhibit a higher λ at the expense of enhanced chaperone assistance. Our findings thus highlight an intricate interplay of molecular chaperones and protein disorder in the evolvability of protein networks. 
Our results illuminate the role of chaperones in enabling protein evolution, and underline the importance of the cellular context and integrated approaches for understanding proteome evolution. We feel that the development of λ may be a valuable addition to the toolbox applied to understand the molecular basis of evolution. PMID:24968255

Universal features in the genome-level evolution of protein domains.

PubMed

Cosentino Lagomarsino, Marco; Sellerio, Alessandro L; Heijning, Philip D; Bassetti, Bruno

2009-01-01

Protein domains can be used to study proteome evolution at a coarse scale. In particular, they are found on genomes with notable statistical distributions. It is known that the distribution of domains with a given topology follows a power law. We focus on a further aspect: these distributions, and the number of distinct topologies, follow collective trends, or scaling laws, depending on the total number of domains only, and not on genome-specific features. We present a stochastic duplication/innovation model, in the class of the so-called 'Chinese restaurant processes', that explains this observation with two universal parameters, representing a minimal number of domains and the relative weight of innovation to duplication. Furthermore, we study a model variant where new topologies are related to occurrence in genomic data, accounting for fold specificity. Both models have general quantitative agreement with data from hundreds of genomes, which indicates that the domains of a genome are built with a combination of specificity and robust self-organizing phenomena. The latter are related to the basic evolutionary 'moves' of duplication and innovation, and give rise to the observed scaling laws, a priori of the specific evolutionary history of a genome. We interpret this as the concurrent effect of neutral and selective drives, which increase duplication and decrease innovation in larger and more complex genomes. The validity of our model would imply that the empirical observation of a small number of folds in nature may be a consequence of their evolution.
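The duplication/innovation picture in the abstract above can be illustrated with the standard two-parameter Chinese restaurant process. This is a sketch only: the parameter names `alpha` and `theta`, their default values, and the mapping of "joining an existing class" to duplication and "opening a new class" to innovation are assumptions for illustration, and the paper's exact model and fitted parameters may differ.

```python
import random


def simulate_crp(n_domains, alpha=0.3, theta=1.0, seed=0):
    """Simulate class sizes under a two-parameter Chinese restaurant process.

    Each new domain either founds a new topology class (innovation) with
    probability (theta + alpha * K) / (n + theta), where K is the current
    number of classes and n the number of domains placed so far, or joins
    existing class i (duplication) with weight size_i - alpha.
    Returns the list of class sizes.
    """
    if not (0 <= alpha < 1) or theta <= 0:
        raise ValueError("this sketch assumes 0 <= alpha < 1 and theta > 0")
    rng = random.Random(seed)
    sizes = []
    for n in range(n_domains):
        p_new = (theta + alpha * len(sizes)) / (n + theta)
        if rng.random() < p_new:
            sizes.append(1)  # innovation: a new topology class appears
        else:
            # duplication: larger classes are proportionally more likely to grow
            weights = [s - alpha for s in sizes]
            i = rng.choices(range(len(sizes)), weights=weights)[0]
            sizes[i] += 1
    return sizes
```

Running `simulate_crp` for increasing `n_domains` and recording `len(sizes)` gives a rough feel for how the number of distinct classes grows with the total number of domains, which is the kind of collective trend the abstract refers to.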
Remodeling Cildb, a popular database for cilia and links for ciliopathies

PubMed Central

2014-01-01

Background New generation technologies in cell and molecular biology generate large amounts of data that are hard to exploit for individual proteins. This is particularly true for ciliary and centrosomal research. Cildb is a multi-species knowledgebase gathering high throughput studies, which allows advanced searches to identify proteins involved in centrosome, basal body or cilia biogenesis, composition and function. Combined with the localization of genetic diseases on human chromosomes given by OMIM links, candidate ciliopathy proteins can be compiled through Cildb searches. Methods Orthology between recent versions of the whole proteomes was computed using Inparanoid, and ciliary high throughput studies were remapped on these recent versions. Results Due to the constant evolution of the ciliary and centrosomal field, Cildb has recently been upgraded twice, with new species whole proteomes and new ciliary studies, and the latest version displays a novel BioMart interface, much more intuitive than the previous ones. Conclusions This already popular database is now designed for easier use and is up to date with regard to high throughput ciliary studies. PMID:25422781
Evolution of scaling emergence in large-scale spatial epidemic spreading.

PubMed

Wang, Lin; Li, Xiang; Zhang, Yi-Qing; Zhang, Yan; Zhang, Kan

2011-01-01

Zipf's law and Heaps' law are two representative scaling concepts that play a significant role in the study of complexity science. The coexistence of Zipf's law and Heaps' law has motivated different understandings of the dependence between these two scalings, which has hardly been clarified so far. In this article, we observe an evolution process of the scalings: Zipf's law and Heaps' law are naturally shaped to coexist at the initial time, while a crossover comes with the emergence of their inconsistency at later times before a stable state is reached, where Heaps' law still holds while strict Zipf's law disappears. Such findings are illustrated with a scenario of large-scale spatial epidemic spreading, and the empirical results for pandemic disease support a universal analysis of the relation between the two laws regardless of the biological details of the disease. Employing United States domestic air transportation and demographic data to construct a metapopulation model for simulating pandemic spread at the U.S. country level, we find that the broad heterogeneity of the infrastructure plays a key role in the evolution of scaling emergence. The analyses of large-scale spatial epidemic spreading help in understanding the temporal evolution of scalings, indicating that the coexistence of Zipf's law and Heaps' law depends on the collective dynamics of epidemic processes, and the heterogeneity of epidemic spread indicates the significance of performing targeted containment strategies in the early stage of a pandemic.
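As a rough illustration of the two scalings named in the abstract above, the sketch below computes a Zipf-style rank-frequency list and a Heaps-style growth curve from a sequence of labelled events, plus a least-squares slope on log-log axes. Treating each event as, for example, the region in which a case is reported is an assumption made here for illustration; the function names are hypothetical and the paper's own data processing is not reproduced.

```python
import math
from collections import Counter


def zipf_rank_frequency(events):
    """Frequencies of distinct labels, sorted from most to least common."""
    return sorted(Counter(events).values(), reverse=True)


def heaps_growth(events):
    """Number of distinct labels seen after each successive event."""
    seen = set()
    growth = []
    for label in events:
        seen.add(label)
        growth.append(len(seen))
    return growth


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x); inputs must be positive."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    return sxy / sxx
```

A roughly constant negative slope of frequency against rank would be consistent with Zipf's law, and a roughly constant slope below one for distinct labels against total events would be consistent with Heaps' law; a slope that drifts over time is the kind of crossover the abstract describes.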
Automated Interpretation of Subcellular Patterns in Fluorescence Microscope Images for Location Proteomics

PubMed Central

Chen, Xiang; Velliste, Meel; Murphy, Robert F.

2010-01-01

Proteomics, the large scale identification and characterization of many or all proteins expressed in a given cell type, has become a major area of biological research. In addition to information on protein sequence, structure and expression levels, knowledge of a protein’s subcellular location is essential to a complete understanding of its functions. Currently subcellular location patterns are routinely determined by visual inspection of fluorescence microscope images. We review here research aimed at creating systems for automated, systematic determination of location. These employ numerical feature extraction from images, feature reduction to identify the most useful features, and various supervised learning (classification) and unsupervised learning (clustering) methods. These methods have been shown to perform significantly better than human interpretation of the same images. When coupled with technologies for tagging large numbers of proteins and high-throughput microscope systems, the computational methods reviewed here enable the new subfield of location proteomics. This subfield will make critical contributions in two related areas. First, it will provide structured, high-resolution information on location to enable Systems Biology efforts to simulate cell behavior from the gene level on up. Second, it will provide tools for Cytomics projects aimed at characterizing the behaviors of all cell types before, during and after the onset of various diseases. PMID:16752421
Proteomics in Argentina - limitations and future perspectives: A special emphasis on meat proteomics.

PubMed

Fadda, Silvina; Almeida, André M

2015-11-01

Argentina is one of the most relevant countries in Latin America, playing a major role in regional economics, culture and science. Over the last 80 years, Argentinean history has been characterized by several upward and downward phases that had major consequences on the development of science in the country and most recently on proteomics. In this article, we characterize the evolution of Proteomics sciences in Argentina over the last decade and a half. We describe the proteomics publication output of the country in the framework of the regional and international contexts, demonstrating that Argentina is solidly anchored in a regional context, showing results similar to other emergent and Latin American countries, albeit still far from the European, American or Australian realities. We also provide a case-study on the importance of Proteomics to a specific sector in the area of food science: the use of bacteria of technological interest, highlighting major achievements obtained by Argentinean proteomics scientists. Finally, we provide a general picture of the endeavors being undertaken by Argentinean Proteomics scientists and their international collaborators to promote the Proteomics-based research with the new generation of scientists and PhD students in both Argentina and other countries in the Southern cone. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Systems biology definition of the core proteome of metabolism and expression is consistent with high-throughput data.

PubMed

Yang, Laurence; Tan, Justin; O'Brien, Edward J; Monk, Jonathan M; Kim, Donghyuk; Li, Howard J; Charusanti, Pep; Ebrahim, Ali; Lloyd, Colton J; Yurkovich, James T; Du, Bin; Dräger, Andreas; Thomas, Alex; Sun, Yuekai; Saunders, Michael A; Palsson, Bernhard O

2015-08-25

Finding the minimal set of gene functions needed to sustain life is of both fundamental and practical importance. Minimal gene lists have been proposed by using comparative genomics-based core proteome definitions. A definition of a core proteome that is supported by empirical data, is understood at the systems-level, and provides a basis for computing essential cell functions is lacking. Here, we use a systems biology-based genome-scale model of metabolism and expression to define a functional core proteome consisting of 356 gene products, accounting for 44% of the Escherichia coli proteome by mass based on proteomics data. This systems biology core proteome includes 212 genes not found in previous comparative genomics-based core proteome definitions, accounts for 65% of known essential genes in E. coli, and has 78% gene function overlap with minimal genomes (Buchnera aphidicola and Mycoplasma genitalium). Based on transcriptomics data across environmental and genetic backgrounds, the systems biology core proteome is significantly enriched in nondifferentially expressed genes and depleted in differentially expressed genes. Compared with the noncore, core gene expression levels are also similar across genetic backgrounds (two times higher Spearman rank correlation) and exhibit significantly more complex transcriptional and posttranscriptional regulatory features (40% more transcription start sites per gene, 22% longer 5'UTR). Thus, genome-scale systems biology approaches rigorously identify a functional core proteome needed to support growth. This framework, validated by using high-throughput datasets, facilitates a mechanistic understanding of systems-level core proteome function through in silico models; it de facto defines a paleome.
Molecular phenotype of zebrafish ovarian follicle by serial analysis of gene expression and proteomic profiling, and comparison with the transcriptomes of other animals

PubMed Central

Knoll-Gellida, Anja; André, Michèle; Gattegno, Tamar; Forgue, Jean; Admon, Arie; Babin, Patrick J

2006-01-01

Background The ability of an oocyte to develop into a viable embryo depends on the accumulation of specific maternal information and molecules, such as RNAs and proteins. A serial analysis of gene expression (SAGE) was carried out in parallel with proteomic analysis on fully-grown ovarian follicles from zebrafish (Danio rerio). The data obtained were compared with ovary/follicle/egg molecular phenotypes of other animals, published or available in public sequence databases. Results Sequencing of 27,486 SAGE tags identified 11,399 different ones, including 3,329 tags occurring more than once. Fifty-eight genes were expressed at over 0.15% of the total population and represented 17.34% of the mRNA population identified. The three most expressed transcripts were a rhamnose-binding lectin, beta-actin 2, and a transcribed locus similar to the H2B histone family. Comparison with the large-scale expressed sequence tags sequencing approach revealed highly expressed transcripts that were not previously known to be expressed at high levels in fish ovaries, like the short-sized polarized metallothionein 2 transcript. A higher sensitivity for the detection of transcripts with a characterized maternal genetic contribution was also demonstrated compared to large-scale sequencing of cDNA libraries. Ferritin heavy polypeptide 1, heat shock protein 90-beta, lactate dehydrogenase B4, beta-actin isoforms, tubulin beta 2, ATP synthase subunit 9, together with 40S ribosomal protein S27a, were common highly-expressed transcripts of vertebrate ovary/unfertilized egg. Comparison of transcriptome and proteome data revealed that transcript levels provide little predictive value with respect to the extent of protein abundance. All the proteins identified by proteomic analysis of fully-grown zebrafish follicles had at least one transcript counterpart, with two exceptions: eosinophil chemotactic cytokine and nothepsin.
Conclusion This study provides a complete sequence data set of maternal mRNA stored in zebrafish germ cells at the end of oogenesis. This catalogue contains highly-expressed transcripts that are part of a vertebrate ovarian expressed gene signature. Comparison of transcriptome and proteome data identified downregulated transcripts or proteins potentially incorporated in the oocyte by endocytosis. The molecular phenotype described provides groundwork for future experimental approaches aimed at identifying functionally important stored maternal transcripts and proteins involved in oogenesis and early stages of embryo development. PMID:16526958
Proteome-scale human interactomics

PubMed Central

Luck, Katja; Sheynkman, Gloria M.; Zhang, Ivy; Vidal, Marc

2017-01-01

Cellular functions are mediated by complex interactome networks of physical, biochemical, and functional interactions between DNA sequences, RNA molecules, proteins, lipids, and small metabolites. A thorough understanding of cellular organization requires accurate and relatively complete models of interactome networks at proteome-scale. The recent publication of four human protein-protein interaction (PPI) maps represents a technological breakthrough and an unprecedented resource for the scientific community, heralding a new era of proteome-scale human interactomics. Our knowledge gained from these and complementary studies provides fresh insights into the opportunities and challenges when analyzing systematically generated interactome data, defines a clear roadmap towards the generation of a first reference interactome, and reveals new perspectives on the organization of cellular life. PMID:28284537
The evolution of surface magnetic fields in young solar-type stars II: the early main sequence (250-650 Myr)

NASA Astrophysics Data System (ADS)

Folsom, C. P.; Bouvier, J.; Petit, P.; Lèbre, A.; Amard, L.; Palacios, A.; Morin, J.; Donati, J.-F.; Vidotto, A. A.

2018-03-01

There is a large change in surface rotation rates of sun-like stars on the pre-main sequence and early main sequence. Since these stars have dynamo-driven magnetic fields, this implies a strong evolution of their magnetic properties over this time period. The spin-down of these stars is controlled by interactions between stellar winds and magnetic fields, thus magnetic evolution in turn plays an important role in rotational evolution. We present here the second part of a study investigating the evolution of large-scale surface magnetic fields in this critical time period. We observed stars in open clusters and stellar associations with known ages between 120 and 650 Myr, and used spectropolarimetry and Zeeman Doppler Imaging to characterize their large-scale magnetic field strength and geometry. We report 15 stars with magnetic detections here. These stars have masses from 0.8 to 0.95 M⊙, rotation periods from 0.326 to 10.6 d, and we find large-scale magnetic field strengths from 8.5 to 195 G with a wide range of geometries. We find a clear trend towards decreasing magnetic field strength with age, and a power law decrease in magnetic field strength with Rossby number. There is some tentative evidence for saturation of the large-scale magnetic field strength at Rossby numbers below 0.1, although the saturation point is not yet well defined. Comparing to younger classical T Tauri stars, we support the hypothesis that differences in internal structure produce large differences in observed magnetic fields; however, for weak-lined T Tauri stars this is less clear.
Diversity in the origins of proteostasis networks - a driver for protein function in evolution

PubMed Central

Powers, Evan T.; Balch, William E.

2013-01-01

Although a protein’s primary sequence largely determines its function, proteins can adopt different folding states in response to changes in the environment, some of which may be deleterious to the organism. All organisms, including Bacteria, Archaea and Eukarya, have evolved a protein homeostasis network, or proteostasis network, that consists of chaperones and folding factors, degradation components, signalling pathways and specialized compartmentalized modules that manage protein folding in response to environmental stimuli and variation. Surveying the origins of proteostasis networks reveals that they have co-evolved with the proteome to regulate the physiological state of the cell, reflecting the unique stresses that different cells or organisms experience, and that they have a key role in driving evolution by closely managing the link between the phenotype and the genotype. PMID:23463216
Functional Module Search in Protein Networks based on Semantic Similarity Improves the Analysis of Proteomics Data

PubMed Central

Boyanova, Desislava; Nilla, Santosh; Klau, Gunnar W.; Dandekar, Thomas; Müller, Tobias; Dittrich, Marcus

2014-01-01

The continuously evolving field of proteomics produces increasing amounts of data while improving the quality of protein identifications. Although quantitative measurements are becoming more popular, many proteomic studies are still based on non-quantitative methods for protein identification. These studies result in potentially large sets of identified proteins, where the biological interpretation of proteins can be challenging. Systems biology develops innovative network-based methods, which allow an integrated analysis of these data. Here we present a novel approach, which combines prior knowledge of protein-protein interactions (PPI) with proteomics data using functional similarity measurements of interacting proteins. This integrated network analysis exactly identifies network modules with a maximal consistent functional similarity reflecting biological processes of the investigated cells. We validated our approach on small (H9N2 virus-infected gastric cells) and large (blood constituents) proteomic data sets. Using this novel algorithm, we identified characteristic functional modules in virus-infected cells, comprising key signaling proteins (e.g. the stress-related kinase RAF1), and demonstrate that this method allows a module-based functional characterization of cell types. Analysis of a large proteome data set of blood constituents resulted in clear separation of blood cells according to their developmental origin. A detailed investigation of the T-cell proteome further illustrates how the algorithm partitions large networks into functional subnetworks, each representing specific cellular functions. These results demonstrate that the integrated network approach not only allows a detailed analysis of proteome networks but also yields a functional decomposition of complex proteomic data sets and thereby provides deeper insights into the underlying cellular processes of the investigated system. PMID:24807868
Design and analysis issues in quantitative proteomics studies.

PubMed

Karp, Natasha A; Lilley, Kathryn S

2007-09-01

Quantitative proteomics is the comparison of distinct proteomes which enables the identification of protein species which exhibit changes in expression or post-translational state in response to a given stimulus. Many different quantitative techniques are being utilized and generate large datasets. Independent of the technique used, these large datasets need robust data analysis to ensure valid conclusions are drawn from such studies. Approaches to address the problems that arise with large datasets are discussed to give insight into the types of statistical analyses of data appropriate for the various experimental strategies that can be employed by quantitative proteomic studies. This review also highlights the importance of employing a robust experimental design and highlights various issues surrounding the design of experiments. The concepts and examples discussed within will show how robust design and analysis will lead to confident results that will ensure quantitative proteomics delivers.
Global proteomic profiling in multistep hepatocarcinogenesis and identification of PARP1 as a novel molecular marker in hepatocellular carcinoma

PubMed Central

Wang, Jianguo; Xie, Haiyang; Li, Jie; Cao, Jili; Zhou, Lin; Zheng, Shusen

2016-01-01

More accurate biomarkers have long been desired for hepatocellular carcinoma (HCC). Here, we characterized global large-scale proteomics of multistep hepatocarcinogenesis in an attempt to identify novel biomarkers for HCC. Quantitative data of 37874 sequences and 3017 proteins during hepatocarcinogenesis were obtained in cohort 1 of 75 samples (5 pooled groups: normal livers, hepatitis livers, cirrhotic livers, peritumoral livers, and HCC tissues) by iTRAQ 2D LC-MS/MS. The diagnostic performance of the top six most upregulated proteins in the HCC group, with HSP70 as a reference, was subsequently validated in cohort 2 of 114 samples (hepatocarcinogenesis from normal livers to HCC) using immunohistochemistry. Of the seven candidate protein markers, PARP1, GS and NDRG1 showed the best diagnostic performance for HCC. PARP1, as a novel marker, showed diagnostic performance comparable to that of the classic markers GS and NDRG1 in HCC (AUCs = 0.872, 0.856 and 0.792, respectively). A significantly higher AUC of 0.945 was achieved when the three markers were combined. For diagnosis of HCC, the sensitivity and specificity were 88.2% and 81.0% when at least two of the markers were positive. Similar diagnostic values of PARP1, GS and NDRG1 were confirmed by immunohistochemistry in cohort 3 of 180 HCC patients. Further analysis indicated that PARP1 and NDRG1 were associated with some clinicopathological features and were independent prognostic factors for HCC patients. Overall, global large-scale proteomic data across the spectrum of multistep hepatocarcinogenesis were obtained. PARP1 is a promising novel diagnostic/prognostic marker for HCC, and a three-marker panel (PARP1, GS and NDRG1) with excellent diagnostic performance for HCC was established. PMID:26883192
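The "at least two of the markers positive" decision rule in the abstract above can be written down in a few lines. The following is a sketch of how sensitivity and specificity would be computed for such a rule; the function names and the sample layout are assumptions for illustration, and no data from the study are reproduced.

```python
def panel_positive(markers, min_positive=2):
    """Call a sample positive when at least min_positive markers are positive.

    markers is a sequence of truthy/falsy values, for example the
    (PARP1, GS, NDRG1) staining calls for one sample.
    """
    return sum(bool(m) for m in markers) >= min_positive


def sensitivity_specificity(samples, min_positive=2):
    """Return (sensitivity, specificity) for the panel rule.

    samples is an iterable of (markers, is_hcc) pairs. A rate is NaN when
    its denominator is zero (no diseased or no non-diseased samples).
    """
    tp = fn = tn = fp = 0
    for markers, is_hcc in samples:
        called = panel_positive(markers, min_positive)
        if is_hcc and called:
            tp += 1
        elif is_hcc:
            fn += 1
        elif called:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

Raising `min_positive` to 3 would typically trade sensitivity for specificity, while lowering it to 1 does the opposite, which is why a two-of-three threshold is a common middle ground for small marker panels.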
Microgravity-driven remodeling of the proteome reveals insights into molecular mechanisms and signal networks involved in response to the space flight environment.

PubMed

Rea, Giuseppina; Cristofaro, Francesco; Pani, Giuseppe; Pascucci, Barbara; Ghuge, Sandip A; Corsetto, Paola Antonia; Imbriani, Marcello; Visai, Livia; Rizzo, Angela M

2016-03-30

Space is a hostile environment characterized by high vacuum, extreme temperatures, meteoroids, space debris, ionospheric plasma, microgravity and space radiation, which all represent risks for human health. A deep understanding of the biological consequences of exposure to the space environment is required to design efficient countermeasures to minimize their negative impact on human health. Recently, proteomic approaches have received a significant amount of attention in the effort to further study microgravity-induced physiological changes. In this review, we summarize the current knowledge about the effects of microgravity on microorganisms (in particular Cupriavidus metallidurans CH34, Bacillus cereus and Rhodospirillum rubrum S1H), plants (whole plants, organs, and cell cultures), mammalian cells (endothelial cells, bone cells, chondrocytes, muscle cells, thyroid cancer cells, immune system cells) and animals (invertebrates, vertebrates and mammals). Herein, we describe their proteome's response to microgravity, focusing on proteomic discoveries and their future potential applications in space research. Space experiments and operational flight experience have identified detrimental effects on human health and performance because of exposure to weightlessness, even when currently available countermeasures are implemented. Many experimental tools and methods have been developed to study microgravity-induced physiological changes. Recently, genomic and proteomic approaches have received a significant amount of attention. This review summarizes the recent research studies of the proteome response to microgravity in microorganisms, plants, mammalian cells and animals. Current proteomic tools allow large-scale, high-throughput analyses for the detection, identification, and functional investigation of all proteomes. Understanding gene and/or protein expression is the key to unlocking the mechanisms behind microgravity-induced problems and to finding effective countermeasures to spaceflight-induced alterations, but also for the study of diseases on Earth. Future perspectives are also highlighted. Copyright © 2015 Elsevier B.V. All rights reserved.
Assessing signal-to-noise in quantitative proteomics: multivariate statistical analysis in DIGE experiments.

PubMed

Friedman, David B

2012-01-01

All quantitative proteomics experiments measure variation between samples. When performing large-scale experiments that involve multiple conditions or treatments, the experimental design should include the appropriate number of individual biological replicates from each condition so that a relevant biological signal can be distinguished from technical noise. Multivariate statistical analyses, such as principal component analysis (PCA), provide a global perspective on experimental variation, thereby enabling the assessment of whether the variation describes the expected biological signal or the unanticipated technical/biological noise inherent in the system. Examples will be shown from high-resolution multivariable DIGE experiments where PCA was instrumental in demonstrating biologically significant variation as well as sample outliers, fouled samples, and overriding technical variation that would not be readily observed using standard univariate tests.
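As a rough illustration of how PCA gives a global view of replicate variation, the sketch below projects a simulated abundance matrix onto its first two principal components using an SVD. The data are synthetic (four control and four treated replicates, 50 features, with a shift applied to 10 features), and the function name `pca_scores` is hypothetical; none of this comes from the DIGE experiments in the abstract.

```python
import numpy as np


def pca_scores(X: np.ndarray, n_components: int = 2):
    """Return sample scores and explained-variance ratios (rows = samples)."""
    Xc = X - X.mean(axis=0)  # center each feature
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, explained[:n_components]


# Synthetic abundances: technical noise plus a condition effect.
rng = np.random.default_rng(0)
n_features = 50
control = rng.normal(0.0, 0.1, size=(4, n_features))
treated = rng.normal(0.0, 0.1, size=(4, n_features))
treated[:, :10] += 1.0  # simulated biological signal on 10 features

X = np.vstack([control, treated])
scores, explained = pca_scores(X)
for label, row in zip(["ctrl"] * 4 + ["trt"] * 4, scores):
    print(label, np.round(row, 2))
print("explained variance ratio:", np.round(explained, 2))
```

If the replicates of each condition cluster together along the first component, the dominant variation is the intended signal; a replicate that sits far from its group would be a candidate outlier worth inspecting.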
Free Flow Zonal Electrophoresis for Fractionation of Plant Membrane Compartments Prior to Proteomic Analysis.

PubMed

Barkla, Bronwyn J

2018-01-01

Free flow zonal electrophoresis (FFZE) is a versatile, reproducible, and potentially high-throughput technique for the separation of plant organelles and membranes by differences in membrane surface charge. It offers considerable benefits over traditional fractionation techniques, such as density gradient centrifugation and two-phase partitioning, as it is relatively fast, sample recovery is high, and the method provides unparalleled sample purity. It has been used successfully to purify chloroplasts and mitochondria from plants and also to obtain highly pure fractions of plasma membrane, tonoplast, ER, Golgi, and thylakoid membranes. Application of the technique can significantly improve protein coverage in large-scale proteomics studies by decreasing sample complexity. Here, we describe the method for the fractionation of plant cellular membranes from leaves by FFZE.
Evolution of Large-Scale Magnetic Fields and State Transitions in Black Hole X-Ray Binaries

NASA Astrophysics Data System (ADS)

Wang, Ding-Xiong; Huang, Chang-Yin; Wang, Jiu-Zhou

2010-04-01

The state transitions of black hole (BH) X-ray binaries are discussed based on the evolution of large-scale magnetic fields, in which a combination of three energy mechanisms is involved: (1) the Blandford-Znajek (BZ) process related to the open field lines connecting a rotating BH with remote astrophysical loads, (2) the magnetic coupling (MC) process related to the closed field lines connecting the BH with its surrounding accretion disk, and (3) the Blandford-Payne (BP) process related to the open field lines connecting the disk with remote astrophysical loads. It turns out that each spectral state of the BH binaries corresponds to a configuration of the magnetic field in the BH magnetosphere, and the main characteristics of the low/hard (LH) state, hard intermediate (HIM) state and steep power law (SPL) state are roughly fitted based on the evolution of large-scale magnetic fields associated with disk accretion.
Reductive evolution of architectural repertoires in proteomes and the birth of the tripartite world

PubMed Central

Wang, Minglei; Yafremava, Liudmila S.; Caetano-Anollés, Derek; Mittenthal, Jay E.; Caetano-Anollés, Gustavo

2007-01-01

The repertoire of protein architectures in proteomes is evolutionarily conserved and capable of preserving an accurate record of genomic history. Here we use a census of protein architecture in 185 genomes that have been fully sequenced to generate genome-based phylogenies that describe the evolution of the protein world at fold (F) and fold superfamily (FSF) levels. The patterns of representation of F and FSF architectures over evolutionary history suggest three epochs in the evolution of the protein world: (1) architectural diversification, where members of an architecturally rich ancestral community diversified their protein repertoire; (2) superkingdom specification, where superkingdoms Archaea, Bacteria, and Eukarya were specified; and (3) organismal diversification, where F and FSF specific to relatively small sets of organisms appeared as the result of diversification of organismal lineages. Functional annotation of FSF along these architectural chronologies revealed patterns of discovery of biological function. Most importantly, the analysis identified an early and extensive differential loss of architectures occurring primarily in Archaea that segregates the archaeal lineage from the ancient community of organisms and establishes the first organismal divide. Reconstruction of phylogenomic trees of proteomes reflects the timeline of architectural diversification in the emerging lineages. Thus, Archaea undertook a minimalist strategy using only a small subset of the full architectural repertoire and then crystallized into a diversified superkingdom late in evolution. Our analysis also suggests a communal ancestor to all life that was molecularly complex and adopted genomic strategies currently present in Eukarya. PMID:17908824
Proteomic profiling of developing cotton fibers from wild and domesticated Gossypium barbadense.

PubMed

Hu, Guanjing; Koh, Jin; Yoo, Mi-Jeong; Grupp, Kara; Chen, Sixue; Wendel, Jonathan F

2013-10-01

Pima cotton (Gossypium barbadense) is widely cultivated because of its long, strong seed trichomes ('fibers') used for premium textiles. These agronomically advanced fibers were derived following domestication and thousands of years of human-mediated crop improvement. To gain an insight into fiber development and evolution, we conducted comparative proteomic and transcriptomic profiling of developing fiber from an elite cultivar and a wild accession. Analyses using isobaric tag for relative and absolute quantification (iTRAQ) LC-MS/MS technology identified 1317 proteins in fiber. Of these, 205 were differentially expressed across developmental stages, and 190 showed differential expression between wild and cultivated forms, 14.4% of the proteome sampled. Human selection may have shifted the timing of developmental modules, such that some occur earlier in domesticated than in wild cotton. A novel approach was used to detect possible biased expression of homoeologous copies of proteins. Results indicate a significant partitioning of duplicate gene expression at the protein level, but an approximately equal degree of bias for each of the two constituent genomes of allopolyploid cotton. Our results demonstrate the power of complementary transcriptomic and proteomic approaches for the study of the domestication process. They also provide a rich database for mining for functional analyses of cotton improvement or evolution. © 2013 The Authors. New Phytologist © 2013 New Phytologist Trust.
Polycyclic aromatic hydrocarbon metabolic network in Mycobacterium vanbaalenii PYR-1.

PubMed

Kweon, Ohgew; Kim, Seong-Jae; Holland, Ricky D; Chen, Hongyan; Kim, Dae-Wi; Gao, Yuan; Yu, Li-Rong; Baek, Songjoon; Baek, Dong-Heon; Ahn, Hongsik; Cerniglia, Carl E

2011-09-01

This study investigated a metabolic network (MN) from Mycobacterium vanbaalenii PYR-1 for polycyclic aromatic hydrocarbons (PAHs) from the perspective of structure, behavior, and evolution, in which multilayer omics data are integrated. Initially, we utilized a high-throughput proteomic analysis to assess the protein expression response of M. vanbaalenii PYR-1 to seven different aromatic compounds. A total of 3,431 proteins (57.38% of the genome-predicted proteins) were identified, which included 160 proteins that seemed to be involved in the degradation of aromatic hydrocarbons. Based on the proteomic data and the previous metabolic, biochemical, physiological, and genomic information, we reconstructed an experiment-based system-level PAH-MN. The structure of PAH-MN, with 183 metabolic compounds and 224 chemical reactions, has a typical scale-free nature. The behavior and evolution of the PAH-MN reveals a hierarchical modularity with funnel effects in structure/function and intimate association with evolutionary modules of the functional modules, which are the ring cleavage process (RCP), side chain process (SCP), and central aromatic process (CAP). The 189 commonly upregulated proteins in all aromatic hydrocarbon treatments provide insights into the global adaptation to facilitate the PAH metabolism. Taken together, the findings of our study provide the hierarchical viewpoint from genes/proteins/metabolites to the network via functional modules of the PAH-MN equipped with the engineering-driven approaches of modularization and rationalization, which may expand our understanding of the metabolic potential of M. vanbaalenii PYR-1 for bioremediation applications.

Evolution of Scaling Emergence in Large-Scale Spatial Epidemic Spreading

PubMed Central

Wang, Lin; Li, Xiang; Zhang, Yi-Qing; Zhang, Yan; Zhang, Kan

2011-01-01

Background: Zipf's law and Heaps' law are two representatives of the scaling concepts, which play a significant role in the study of complexity science. The coexistence of the Zipf's law and the Heaps' law motivates different understandings on the dependence between these two scalings, which has still hardly been clarified. Methodology/Principal Findings: In this article, we observe an evolution process of the scalings: the Zipf's law and the Heaps' law are naturally shaped to coexist at the initial time, while the crossover comes with the emergence of their inconsistency at the larger time before reaching a stable state, where the Heaps' law still exists with the disappearance of strict Zipf's law. Such findings are illustrated with a scenario of large-scale spatial epidemic spreading, and the empirical results of pandemic disease support a universal analysis of the relation between the two laws regardless of the biological details of disease. Employing the United States domestic air transportation and demographic data to construct a metapopulation model for simulating the pandemic spread at the U.S. country level, we uncover that the broad heterogeneity of the infrastructure plays a key role in the evolution of scaling emergence. Conclusions/Significance: The analyses of large-scale spatial epidemic spreading help understand the temporal evolution of scalings, indicating the coexistence of the Zipf's law and the Heaps' law depends on the collective dynamics of epidemic processes, and the heterogeneity of epidemic spread indicates the significance of performing targeted containment strategies at the early time of a pandemic disease. PMID:21747932
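The two scalings named in this abstract can be made concrete with a small sketch: Zipf's law concerns the rank-ordered frequency of categories, while Heaps' law concerns how the number of distinct categories grows as more events are observed. The sketch assumes each event is a case report tagged with a region label; the labels and the function names `zipf_rank_frequency` and `heaps_curve` are hypothetical and do not reproduce the paper's metapopulation model.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def zipf_rank_frequency(events: Sequence[Hashable]) -> List[int]:
    """Counts per category, sorted from most to least frequent (rank order)."""
    return sorted(Counter(events).values(), reverse=True)


def heaps_curve(events: Sequence[Hashable]) -> List[int]:
    """Number of distinct categories seen after each successive event."""
    seen = set()
    curve = []
    for event in events:
        seen.add(event)
        curve.append(len(seen))
    return curve


# Hypothetical stream of case reports, each tagged with the affected region.
reports = ["A", "B", "A", "C", "A", "B", "D", "A"]

freqs = zipf_rank_frequency(reports)
growth = heaps_curve(reports)
print("rank-frequency:", freqs)
print("distinct regions over time:", growth)
```

On real data one would plot `freqs` against rank and `growth` against the event index on log-log axes and check whether each follows an approximate power law.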
A genome-wide structure-based survey of nucleotide binding proteins in M. tuberculosis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bhagavat, Raghu; Kim, Heung-Bok; Kim, Chang-Yub

Nucleoside tri-phosphates (NTP) form an important class of small molecule ligands that participate in, and are essential to a large number of biological processes. Here, we seek to identify the NTP binding proteome (NTPome) in M. tuberculosis (M.tb), a deadly pathogen. Identifying the NTPome is useful not only for gaining functional insights of the individual proteins but also for identifying useful drug targets. From an earlier study, we had structural models of M.tb at a proteome scale from which a set of 13,858 small molecule binding pockets were identified. We use a set of NTP binding sub-structural motifs derived from a previous study and scan the M.tb pocketome, and find that 1,768 proteins or 43% of the proteome can theoretically bind NTP ligands. Using an experimental proteomics approach involving dye-ligand affinity chromatography, we confirm NTP binding to 47 different proteins, of which 4 are hypothetical proteins. Our analysis also provides the precise list of binding site residues in each case, and the probable ligand binding pose. In conclusion, as the list includes a number of known and potential drug targets, the identification of NTP binding can directly facilitate structure-based drug design of these targets.
Proteomic Screening of Antigenic Proteins from the Hard Tick, Haemaphysalis longicornis (Acari: Ixodidae)

PubMed Central

Kim, Young-Ha; Islam, Mohammad Saiful; You, Myung-Jo

2015-01-01

Proteomic tools allow large-scale, high-throughput analyses for the detection, identification, and functional investigation of the proteome. For detection of antigens from Haemaphysalis longicornis, a 1-dimensional electrophoresis (1-DE) quantitative immunoblotting technique combined with 2-dimensional electrophoresis (2-DE) immunoblotting was used for whole body proteins from unfed and partially fed female ticks. Reactivity bands and 2-DE immunoblotting were performed following 2-DE electrophoresis to identify protein spots. The proteome of the partially fed female had a larger number of lower molecular weight proteins than that of the unfed female tick. The total number of detected spots was 818 for unfed and 670 for partially fed female ticks. The 2-DE immunoblotting identified 10 antigenic spots from unfed females and 8 antigenic spots from partially fed females. Matrix Assisted Laser Desorption Ionization-Time of Flight Mass Spectrometry (MALDI-TOF) of relevant spots identified calreticulin, putative secreted WC salivary protein, and a conserved hypothetical protein from the National Center for Biotechnology Information and Swiss Prot protein sequence databases. These findings indicate that most of the whole body components of these ticks are non-immunogenic. The data reported here will provide guidance in the identification of antigenic proteins to prevent infestation and diseases transmitted by H. longicornis. PMID:25748713
A genome-wide structure-based survey of nucleotide binding proteins in M. tuberculosis

DOE PAGES

Bhagavat, Raghu; Kim, Heung-Bok; Kim, Chang-Yub; ...

2017-10-02

Nucleoside tri-phosphates (NTP) form an important class of small molecule ligands that participate in, and are essential to a large number of biological processes. Here, we seek to identify the NTP binding proteome (NTPome) in M. tuberculosis (M.tb), a deadly pathogen. Identifying the NTPome is useful not only for gaining functional insights of the individual proteins but also for identifying useful drug targets. From an earlier study, we had structural models of M.tb at a proteome scale from which a set of 13,858 small molecule binding pockets were identified. We use a set of NTP binding sub-structural motifs derived from a previous study and scan the M.tb pocketome, and find that 1,768 proteins or 43% of the proteome can theoretically bind NTP ligands. Using an experimental proteomics approach involving dye-ligand affinity chromatography, we confirm NTP binding to 47 different proteins, of which 4 are hypothetical proteins. Our analysis also provides the precise list of binding site residues in each case, and the probable ligand binding pose. In conclusion, as the list includes a number of known and potential drug targets, the identification of NTP binding can directly facilitate structure-based drug design of these targets.
A flexible statistical model for alignment of label-free proteomics data – incorporating ion mobility and product ion information

PubMed Central

2013-01-01

Background: The goal of many proteomics experiments is to determine the abundance of proteins in biological samples, and the variation thereof in various physiological conditions. High-throughput quantitative proteomics, specifically label-free LC-MS/MS, allows rapid measurement of thousands of proteins, enabling large-scale studies of various biological systems. Prior to analyzing these information-rich datasets, raw data must undergo several computational processing steps. We present a method to address one of the essential steps in proteomics data processing - the matching of peptide measurements across samples. Results: We describe a novel method for label-free proteomics data alignment with the ability to incorporate previously unused aspects of the data, particularly ion mobility drift times and product ion information. We compare the results of our alignment method to PEPPeR and OpenMS, and compare alignment accuracy achieved by different versions of our method utilizing various data characteristics. Our method results in increased match recall rates and similar or improved mismatch rates compared to PEPPeR and OpenMS feature-based alignment. We also show that the inclusion of drift time and product ion information results in higher recall rates and more confident matches, without increases in error rates. Conclusions: Based on the results presented here, we argue that the incorporation of ion mobility drift time and product ion information are worthy pursuits. Alignment methods should be flexible enough to utilize all available data, particularly with recent advancements in experimental separation methods. PMID:24341404
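To show why an extra separation dimension such as ion mobility drift time can help when matching peptide features across runs, here is a deliberately simple sketch. It uses greedy tolerance-window matching, which is a stand-in technique and not the statistical model described in the abstract; the `Feature` class, the tolerance values, and the toy features are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Feature:
    mz: float     # mass-to-charge ratio
    rt: float     # retention time (minutes)
    drift: float  # ion mobility drift time (arbitrary units)


def match_features(
    run_a: Sequence[Feature],
    run_b: Sequence[Feature],
    mz_tol: float = 0.01,
    rt_tol: float = 0.5,
    drift_tol: Optional[float] = None,
) -> List[Tuple[int, int]]:
    """Greedily pair each feature in run_a with the closest-RT candidate in run_b.

    Candidates must fall within the m/z and RT windows and, if `drift_tol`
    is given, within the drift-time window as well.
    """
    matches, used = [], set()
    for i, a in enumerate(run_a):
        best_j, best_d = None, None
        for j, b in enumerate(run_b):
            if j in used:
                continue
            if abs(a.mz - b.mz) > mz_tol or abs(a.rt - b.rt) > rt_tol:
                continue
            if drift_tol is not None and abs(a.drift - b.drift) > drift_tol:
                continue
            d = abs(a.rt - b.rt)
            if best_d is None or d < best_d:
                best_j, best_d = j, d
        if best_j is not None:
            used.add(best_j)
            matches.append((i, best_j))
    return matches


# Toy runs: two candidates share m/z and similar RT but differ in drift time.
run_a = [Feature(500.25, 10.0, 30.0)]
run_b = [Feature(500.25, 10.1, 45.0), Feature(500.25, 10.3, 30.2)]

without_drift = match_features(run_a, run_b)
with_drift = match_features(run_a, run_b, drift_tol=1.0)
print(without_drift, with_drift)
```

In this toy case the drift-time window rejects the candidate that is closest in retention time but has a very different mobility, so the match changes once drift time is considered.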
A flexible statistical model for alignment of label-free proteomics data--incorporating ion mobility and product ion information.

PubMed

Benjamin, Ashlee M; Thompson, J Will; Soderblom, Erik J; Geromanos, Scott J; Henao, Ricardo; Kraus, Virginia B; Moseley, M Arthur; Lucas, Joseph E

2013-12-16

The goal of many proteomics experiments is to determine the abundance of proteins in biological samples, and the variation thereof in various physiological conditions. High-throughput quantitative proteomics, specifically label-free LC-MS/MS, allows rapid measurement of thousands of proteins, enabling large-scale studies of various biological systems. Prior to analyzing these information-rich datasets, raw data must undergo several computational processing steps. We present a method to address one of the essential steps in proteomics data processing--the matching of peptide measurements across samples. We describe a novel method for label-free proteomics data alignment with the ability to incorporate previously unused aspects of the data, particularly ion mobility drift times and product ion information. We compare the results of our alignment method to PEPPeR and OpenMS, and compare alignment accuracy achieved by different versions of our method utilizing various data characteristics. Our method results in increased match recall rates and similar or improved mismatch rates compared to PEPPeR and OpenMS feature-based alignment. We also show that the inclusion of drift time and product ion information results in higher recall rates and more confident matches, without increases in error rates. Based on the results presented here, we argue that the incorporation of ion mobility drift time and product ion information are worthy pursuits. Alignment methods should be flexible enough to utilize all available data, particularly with recent advancements in experimental separation methods.
Nanoliter-Scale Oil-Air-Droplet Chip-Based Single Cell Proteomic Analysis.

PubMed

Li, Zi-Yi; Huang, Min; Wang, Xiu-Kun; Zhu, Ying; Li, Jin-Song; Wong, Catherine C L; Fang, Qun

2018-04-17

Single cell proteomic analysis provides crucial information on cellular heterogeneity in biological systems. Herein, we describe a nanoliter-scale oil-air-droplet (OAD) chip for achieving multistep complex sample pretreatment and injection for single cell proteomic analysis in the shotgun mode. By using miniaturized stationary droplet microreaction and manipulation techniques, our system allows all sample pretreatment and injection procedures to be performed in a nanoliter-scale droplet with minimum sample loss and a high sample injection efficiency (>99%), thus substantially increasing the analytical sensitivity for single cell samples. We applied the present system in the proteomic analysis of 100 ± 10, 50 ± 5, 10, and 1 HeLa cell(s), and protein IDs of 1360, 612, 192, and 51 were identified, respectively. The OAD chip-based system was further applied in single mouse oocyte analysis, with 355 protein IDs identified at the single oocyte level, which demonstrated its special advantages of high enrichment of sequence coverage, hydrophobic proteins, and enzymatic digestion efficiency over the traditional in-tube system.
Proteome-Scale Human Interactomics.

PubMed

Luck, Katja; Sheynkman, Gloria M; Zhang, Ivy; Vidal, Marc

2017-05-01

Cellular functions are mediated by complex interactome networks of physical, biochemical, and functional interactions between DNA sequences, RNA molecules, proteins, lipids, and small metabolites. A thorough understanding of cellular organization requires accurate and relatively complete models of interactome networks at proteome scale. The recent publication of four human protein-protein interaction (PPI) maps represents a technological breakthrough and an unprecedented resource for the scientific community, heralding a new era of proteome-scale human interactomics. Our knowledge gained from these and complementary studies provides fresh insights into the opportunities and challenges when analyzing systematically generated interactome data, defines a clear roadmap towards the generation of a first reference interactome, and reveals new perspectives on the organization of cellular life. Copyright © 2017 Elsevier Ltd. All rights reserved.
Evolution of neuronal signalling: transmitters and receptors.

PubMed

Hoyle, Charles H V

2011-11-16

Evolution is a dynamic process during which the genome should not be regarded as a static entity. Molecular and morphological information yield insights into the evolution of species and their phylogenetic relationships, and molecular information in particular provides insight into the evolution of signalling processes. Many signalling systems have their origin in primitive, even unicellular, organisms. Through time, and as organismal complexity increased, certain molecules were employed as intercellular signal molecules. In the autonomic nervous system the basic unit of chemical transmission is a ligand and its cognate receptor. The general mechanisms underlying evolution of signal molecules and their cognate receptors have their basis in the alteration of the genome. In the past this has occurred in large-scale events, represented by two or more doublings of the whole genome, or large segments of the genome, early in the deuterostome lineage, after the emergence of urochordates and cephalochordates, and before the emergence of vertebrates. These duplications were followed by extensive remodelling involving subsequent small-scale changes, ranging from point mutations to exon duplication. Concurrent with these processes was multiple gene loss so that the modern genome contains roughly the same number of genes as in early deuterostomes despite the large-scale genomic duplications. In this review, the principles that underlie evolution that have led to large and small families of autonomic neurotransmitters and their receptors are discussed, with emphasis on G protein-coupled receptors. Copyright © 2010 Elsevier B.V. All rights reserved.
Real-time evolution of a large-scale relativistic jet

NASA Astrophysics Data System (ADS)

Martí, Josep; Luque-Escamilla, Pedro L.; Romero, Gustavo E.; Sánchez-Sutil, Juan R.; Muñoz-Arjonilla, Álvaro J.

2015-06-01

Context: Astrophysical jets are ubiquitous in the Universe on all scales, but their large-scale dynamics and evolution in time are hard to observe since they usually develop at a very slow pace. Aims: We aim to obtain the first observational proof of the expected large-scale evolution and interaction with the environment in an astrophysical jet. Only jets from microquasars offer a chance to witness the real-time, full-jet evolution within a human lifetime, since they combine a "short", few parsec length with relativistic velocities. Methods: The methodology of this work is based on a systematic recalibration of interferometric radio observations of microquasars available in public archives. In particular, radio observations of the microquasar GRS 1758-258 over less than two decades have provided the most striking results. Results: Significant morphological variations in the extended jet structure of GRS 1758-258 are reported here that were previously missed. Its northern radio lobe underwent a major morphological variation that rendered the hotspot undetectable in 2001, after which it reappeared in the following years. The reported changes confirm the Galactic nature of the source. We tentatively interpret them in terms of the growth of instabilities in the jet flow. There is also evidence of a surrounding cocoon. These results can provide a testbed for models accounting for the evolution of jets and their interaction with the environment.
REVIEWS OF TOPICAL PROBLEMS: The large-scale structure of the universe

NASA Astrophysics Data System (ADS)

Shandarin, S. F.; Doroshkevich, A. G.; Zel'dovich, Ya B.

1983-01-01

A survey is given of theories for the origin of large-scale structure in the universe: clusters and superclusters of galaxies, and vast black regions practically devoid of galaxies. Special attention is paid to the theory of a neutrino-dominated universe, a cosmology in which electron neutrinos with a rest mass of a few tens of electron volts would contribute the bulk of the mean density. The evolution of small perturbations is discussed, and estimates are made for the temperature anisotropy of the microwave background radiation on various angular scales. The nonlinear stage in the evolution of smooth irrotational perturbations in a low-pressure medium is described in detail. Numerical experiments simulating large-scale structure formation processes are discussed, as well as their interpretation in the context of catastrophe theory.
The Use of Proteomic Tools to Address Challenges Faced in Clonal Propagation of Tropical Crops through Somatic Embryogenesis.

PubMed

Chin, Chiew Foan; Tan, Hooi Sin

2018-05-04

In many tropical countries with agriculture as the mainstay of the economy, tropical crops are commonly cultivated at the plantation scale. The successful establishment of crop plantations depends on the availability of a large quantity of elite seedling plants. Many plantation companies establish plant tissue culture laboratories to supply planting materials for their plantations, and one of the most common applications of plant tissue culture is the mass propagation of true-to-type elite seedlings. However, problems encountered in tissue culture technology prevent its applications from being widely adopted. Proteomics can be a powerful tool for the analysis of cultures and for understanding the biological processes that take place at the cellular and molecular levels in order to address these problems. This mini review presents the tissue culture technologies commonly used in the propagation of tropical crops. It provides an outline of some of the genes and proteins isolated that are associated with somatic embryogenesis and of the use of proteomic technology in analysing tissue culture samples and processes in tropical crops.
Fast and Accurate Protein False Discovery Rates on Large-Scale Proteomics Data Sets with Percolator 3.0

NASA Astrophysics Data System (ADS)

The, Matthew; MacCoss, Michael J.; Noble, William S.; Käll, Lukas

2016-11-01

Percolator is a widely used software tool that increases yield in shotgun proteomics experiments and assigns reliable statistical confidence measures, such as q values and posterior error probabilities, to peptides and peptide-spectrum matches (PSMs) from such experiments. Percolator's processing speed has been sufficient for typical data sets consisting of hundreds of thousands of PSMs. With our new scalable approach, we can now also analyze millions of PSMs in a matter of minutes on a commodity computer. Furthermore, with the increasing awareness for the need for reliable statistics on the protein level, we compared several easy-to-understand protein inference methods and implemented the best-performing method—grouping proteins by their corresponding sets of theoretical peptides and then considering only the best-scoring peptide for each protein—in the Percolator package. We used Percolator 3.0 to analyze the data from a recent study of the draft human proteome containing 25 million spectra (PM:24870542). The source code and Ubuntu, Windows, MacOS, and Fedora binary packages are available from http://percolator.ms/ under an Apache 2.0 license.
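The protein inference idea summarized in this abstract (group proteins that share the same set of theoretical peptides, then score each group by its single best-scoring peptide) can be sketched in a few lines. This is an illustrative toy and not Percolator's implementation or API; the function names, protein and peptide identifiers, and scores are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Set, Tuple

ProteinGroup = Tuple[str, ...]


def group_proteins(
    protein_to_peptides: Dict[str, Set[str]]
) -> Dict[ProteinGroup, FrozenSet[str]]:
    """Merge proteins whose theoretical peptide sets are identical."""
    by_peptides = defaultdict(list)
    for protein, peptides in protein_to_peptides.items():
        by_peptides[frozenset(peptides)].append(protein)
    return {tuple(sorted(names)): peps for peps, names in by_peptides.items()}


def best_peptide_scores(
    groups: Dict[ProteinGroup, FrozenSet[str]],
    peptide_scores: Dict[str, float],
) -> Dict[ProteinGroup, float]:
    """Score each group by its best observed peptide (higher is better)."""
    result = {}
    for names, peptides in groups.items():
        observed = [peptide_scores[p] for p in peptides if p in peptide_scores]
        if observed:  # groups with no observed peptide get no score
            result[names] = max(observed)
    return result


# Hypothetical in-silico digest and peptide-level scores.
protein_to_peptides = {
    "P1": {"pepA", "pepB"},
    "P2": {"pepA", "pepB"},  # indistinguishable from P1
    "P3": {"pepC"},
    "P4": {"pepD"},          # no peptide observed
}
peptide_scores = {"pepA": 2.5, "pepB": 4.0, "pepC": 1.0}

groups = group_proteins(protein_to_peptides)
protein_scores = best_peptide_scores(groups, peptide_scores)
print(protein_scores)
```

In a full workflow the same scoring would typically be applied to decoy proteins as well, so that target and decoy scores can be compared to estimate protein-level false discovery rates.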
Recent advances in methods for the analysis of protein o-glycosylation at proteome level.

PubMed

You, Xin; Qin, Hongqiang; Ye, Mingliang

2018-01-01

O-Glycosylation, which refers to the glycosylation of the hydroxyl group of side chains of Serine/Threonine/Tyrosine residues, is one of the most common post-translational modifications. Compared with N-linked glycosylation, O-glycosylation is less explored because of its complex structure and relatively low abundance. Recently, O-glycosylation has drawn more and more attention for its various functions in many sophisticated biological processes. To obtain a deep understanding of O-glycosylation, much effort has been devoted to developing effective strategies to analyze the two most abundant types of O-glycosylation, i.e. O-N-acetylgalactosamine and O-N-acetylglucosamine glycosylation. In this review, we summarize the proteomics workflows to analyze these two types of O-glycosylation. For the large-scale analysis of mucin-type glycosylation, the glycan simplification strategies including the "SimpleCell" technology were introduced. A variety of enrichment methods including lectin affinity chromatography, hydrophilic interaction chromatography, hydrazide chemistry, and chemoenzymatic methods were introduced for the proteomics analysis of O-N-acetylgalactosamine and O-N-acetylglucosamine glycosylation. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Fast and Accurate Protein False Discovery Rates on Large-Scale Proteomics Data Sets with Percolator 3.0.

PubMed

The, Matthew; MacCoss, Michael J; Noble, William S; Käll, Lukas

2016-11-01

Percolator is a widely used software tool that increases yield in shotgun proteomics experiments and assigns reliable statistical confidence measures, such as q values and posterior error probabilities, to peptides and peptide-spectrum matches (PSMs) from such experiments. Percolator's processing speed has been sufficient for typical data sets consisting of hundreds of thousands of PSMs. With our new scalable approach, we can now also analyze millions of PSMs in a matter of minutes on a commodity computer. Furthermore, with the increasing awareness for the need for reliable statistics on the protein level, we compared several easy-to-understand protein inference methods and implemented the best-performing method (grouping proteins by their corresponding sets of theoretical peptides and then considering only the best-scoring peptide for each protein) in the Percolator package. We used Percolator 3.0 to analyze the data from a recent study of the draft human proteome containing 25 million spectra (PM:24870542). The source code and Ubuntu, Windows, MacOS, and Fedora binary packages are available from http://percolator.ms/ under an Apache 2.0 license.
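The protein inference rule described in this abstract can be illustrated with a minimal Python sketch. It is not Percolator's code: the function names `infer_proteins` and `q_values`, the `decoy_` prefix, and the toy peptides and scores are all assumptions made for illustration, and the q-value step is a plain target-decoy estimate (decoys divided by targets, made monotone) that is likely simpler than what the package does.

```python
from collections import defaultdict


def infer_proteins(peptide_scores, protein_peptides):
    """Group proteins by theoretical peptide set; score each group by its best peptide."""
    # Proteins with identical theoretical peptide sets cannot be told apart,
    # so they are merged into one group.
    groups = defaultdict(list)
    for protein, peptides in protein_peptides.items():
        groups[frozenset(peptides)].append(protein)

    ranked = []
    for peptides, proteins in groups.items():
        observed = [peptide_scores[p] for p in peptides if p in peptide_scores]
        if observed:
            # Only the best-scoring peptide counts for the group.
            ranked.append((tuple(sorted(proteins)), max(observed)))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


def q_values(ranked, decoy_prefix="decoy_"):
    """Simplified target-decoy q-values for a list ranked best-first."""
    fdrs = []
    targets = decoys = 0
    for proteins, _score in ranked:
        if all(name.startswith(decoy_prefix) for name in proteins):
            decoys += 1
        else:
            targets += 1
        fdrs.append(min(1.0, decoys / max(targets, 1)))
    # A q-value is the smallest FDR at which the entry would still be accepted,
    # so take the running minimum from the bottom of the list upwards.
    qvals = []
    running_min = 1.0
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvals.append(running_min)
    return qvals[::-1]


if __name__ == "__main__":
    # Hypothetical toy input: P1 and P2 share the same theoretical peptides.
    protein_peptides = {
        "P1": {"AAK", "CCR"},
        "P2": {"AAK", "CCR"},
        "P3": {"DDK"},
        "decoy_P4": {"EEK"},
    }
    peptide_scores = {"AAK": 2.5, "CCR": 1.0, "DDK": 1.8, "EEK": 0.3}
    ranked = infer_proteins(peptide_scores, protein_peptides)
    for (proteins, score), q in zip(ranked, q_values(ranked)):
        print(proteins, score, q)
```

In the toy input, P1 and P2 collapse into a single group scored by their best peptide, which shows how the grouping avoids counting indistinguishable proteins twice.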
A large scale Plasmodium vivax- Saimiri boliviensis trophozoite-schizont transition proteome

PubMed Central

Lapp, Stacey A.; Barnwell, John W.; Galinski, Mary R.

2017-01-01

Plasmodium vivax is a complex protozoan parasite with over 6,500 genes and stage-specific differential expression. Much of the unique biology of this pathogen remains unknown, including how it modifies and restructures the host reticulocyte. Using a recently published P. vivax reference genome, we report the proteome from two biological replicates of infected Saimiri boliviensis host reticulocytes undergoing transition from the late trophozoite to early schizont stages. Using five database search engines, we identified a total of 2000 P. vivax and 3487 S. boliviensis proteins, making this the most comprehensive P. vivax proteome to date. PlasmoDB GO-term enrichment analysis of proteins identified at least twice by a search engine highlighted core metabolic processes and molecular functions such as glycolysis, translation and protein folding, cell components such as ribosomes, proteasomes and the Golgi apparatus, and a number of vesicle and trafficking related clusters. Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.8 enriched functional annotation clusters of S. boliviensis proteins highlighted vesicle and trafficking-related clusters, elements of the cytoskeleton, oxidative processes and response to oxidative stress, macromolecular complexes such as the proteasome and ribosome, metabolism, translation, and cell death. Host and parasite proteins potentially involved in cell adhesion were also identified. Over 25% of the P. vivax proteins have no functional annotation; this group includes 45 VIR members of the large PIR family. A number of host and pathogen proteins contained highly oxidized or nitrated residues, extending prior trophozoite-enriched stage observations from S. boliviensis infections, and supporting the possibility of oxidative stress in relation to the disease. This proteome significantly expands the size and complexity of the known P. vivax and Saimiri host iRBC proteomes, and provides in-depth data that will be valuable for ongoing research on this parasite's biology and pathogenesis. PMID:28829774
Proteomics Analysis of the Causative Agent of Typhoid Fever

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ansong, Charles; Yoon, Hyunjin; Norbeck, Angela D.

2008-02-01

Typhoid fever is a potentially fatal disease caused by the bacterial pathogen Salmonella enterica serovar Typhi (S. typhi). S. typhi infection is a complex process that involves numerous bacterially-encoded virulence determinants, and these are thought to confer both stringent human host specificity and a high mortality rate. In the present study we used a liquid chromatography-mass spectrometry (LC-MS) based proteomics strategy to investigate the proteome of logarithmic, stationary phase, and low pH/low magnesium (MgM) S. typhi cultures. This represents the first large scale comprehensive characterization of the S. typhi proteome. Our analysis identified a total of 2066 S. typhi proteins. In an effort to identify putative S. typhi-specific virulence factors, we then compared our S. typhi results to those obtained in a previously published study of the S. typhimurium proteome under similar conditions (Adkins J.N. et al (2006) Mol Cell Prot). Comparative proteomic analysis of S. typhi (strain Ty2) and S. typhimurium (strains LT2 and 14028) revealed a subset of highly expressed proteins unique to S. typhi that were exclusively detected under conditions that mimic the infective state in macrophage cells. These proteins included CdtB, HlyE, and a conserved protein encoded by t1476. The differential expression of selected proteins was confirmed by Western blot analysis. Taken together with the current literature, our observations suggest that this subset of proteins may play a role in S. typhi pathogenesis and human host specificity. In addition, we observed products of the biotin (bio) operon displayed a higher abundance in the more virulent strains S. typhi-Ty2 and S. typhimurium-14028 compared to the virulence attenuated S. typhimurium strain LT2, suggesting bio proteins may contribute to Salmonella pathogenesis.
Rescuing discarded spectra: Full comprehensive analysis of a minimal proteome.

PubMed

Lluch-Senar, Maria; Mancuso, Francesco M; Climente-González, Héctor; Peña-Paz, Marcia I; Sabido, Eduard; Serrano, Luis

2016-02-01

A common problem encountered when performing large-scale MS proteome analysis is the loss of information due to the high percentage of unassigned spectra. To determine the causes behind this loss we have analyzed the proteome of one of the smallest living bacteria that can be grown axenically, Mycoplasma pneumoniae (729 ORFs). The proteome of M. pneumoniae cells, grown in defined media, was analyzed by MS. An initial search with both Mascot and a species-specific NCBInr database with common contaminants (NCBImpn), resulted in around 79% of the acquired spectra not having an assignment. The percentage of non-assigned spectra was reduced to 27% after re-analysis of the data with the PEAKS software, thereby increasing the proteome coverage of M. pneumoniae from the initial 60% to over 76%. Nonetheless, 33,413 spectra with assigned amino acid sequences could not be mapped to any NCBInr database protein sequence. Approximately, 1% of these unassigned peptides corresponded to PTMs and 4% to M. pneumoniae protein variants (deamidation and translation inaccuracies). The most abundant peptide sequence variants (Phe-Tyr and Ala-Ser) could be explained by alterations in the editing capacity of the corresponding tRNA synthases. About another 1% of the peptides not associated to any protein had repetitions of the same aromatic/hydrophobic amino acid at the N-terminus, or had Arg/Lys at the C-terminus. Thus, in a model system, we have maximized the number of assigned spectra to 73% (51,453 out of the 70,040 initial acquired spectra). All MS data have been deposited in the ProteomeXchange with identifier PXD002779 (http://proteomecentral.proteomexchange.org/dataset/PXD002779). © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
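The final assignment rate quoted in this abstract follows directly from the two spectrum counts it reports. A short Python check, using only those two numbers (the function name `assigned_percentage` is an assumption for illustration), is sketched below.

```python
def assigned_percentage(assigned, total):
    """Return the share of acquired spectra that received an assignment, in percent."""
    return 100.0 * assigned / total


if __name__ == "__main__":
    total_spectra = 70_040      # initial acquired spectra, from the abstract
    assigned_spectra = 51_453   # spectra assigned after re-analysis, from the abstract

    pct = assigned_percentage(assigned_spectra, total_spectra)
    unassigned = total_spectra - assigned_spectra
    print(f"assigned:   {assigned_spectra} ({pct:.1f}%)")
    print(f"unassigned: {unassigned} ({100.0 - pct:.1f}%)")
```

Rounded to whole percentages, the result agrees with the 73% assigned and 27% non-assigned figures stated in the abstract.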
Proteomics in pharmaceutical research and development.

PubMed

Cutler, Paul; Voshol, Hans

2015-08-01

In the 20 years since its inception, the evolution of proteomics in pharmaceutical industry has mirrored the developments within academia and indeed other industries. From initial enthusiasm and subsequent disappointment in global protein expression profiling, pharma research saw the biggest impact when relating to more focused approaches, such as those exploring the interaction between proteins and drugs. Nowadays, proteomics technologies have been integrated in many areas of pharmaceutical R&D, ranging from the analysis of therapeutic proteins to the monitoring of clinical trials. Here, we review the development of proteomics in the drug discovery process, placing it in a historical context as well as reviewing the current status in light of the contributions to this special issue, which reflect some of the diverse demands of the drug and biomarker pipelines. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Single-cell-type quantitative proteomic and ionomic analysis of epidermal bladder cells from the halophyte model plant Mesembryanthemum crystallinum to identify salt-responsive proteins.

PubMed

Barkla, Bronwyn J; Vera-Estrella, Rosario; Raymond, Carolyn

2016-05-10

Epidermal bladder cells (EBC) are large single-celled, specialized, and modified trichomes found on the aerial parts of the halophyte Mesembryanthemum crystallinum. Recent development of a simple but high throughput technique to extract the contents from these cells has provided an opportunity to conduct detailed single-cell-type analyses of their molecular characteristics at high resolution to gain insight into the role of these cells in the salt tolerance of the plant. In this study, we carry out large-scale complementary quantitative proteomic studies using both a label (DIGE) and label-free (GeLC-MS) approach to identify salt-responsive proteins in the EBC extract. Additionally we perform an ionomics analysis (ICP-MS) to follow changes in the amounts of 27 different elements. Using these methods, we were able to identify 54 proteins and nine elements that showed statistically significant changes in the EBC from salt-treated plants. GO enrichment analysis identified a large number of transport proteins but also proteins involved in photosynthesis, primary metabolism and Crassulacean acid metabolism (CAM). Validation of results by western blot, confocal microscopy and enzyme analysis helped to strengthen findings and further our understanding into the role of these specialized cells. As expected EBC accumulated large quantities of sodium, however, the most abundant element was chloride suggesting the sequestration of this ion into the EBC vacuole is just as important for salt tolerance. This single-cell type omics approach shows that epidermal bladder cells of M. crystallinum are metabolically active modified trichomes, with primary metabolism supporting cell growth, ion accumulation, compatible solute synthesis and CAM. Data are available via ProteomeXchange with identifier PXD004045.

Phosphoproteomic analysis of the non-seed vascular plant model Selaginella moellendorffii

PubMed Central

2014-01-01

Background: Selaginella (Selaginella moellendorffii) is a lycophyte which diverged from other vascular plants approximately 410 million years ago. As the first reported non-seed vascular plant genome, Selaginella genome data allow comparative analysis of genetic changes that may be associated with land plant evolution. Proteomics investigations on this lycophyte model have not been extensively reported. Phosphorylation is the most common post-translational modification and a ubiquitous regulatory mechanism controlling the functional expression of proteins inside living organisms. Results: In this study, polyethylene glycol fractionation and immobilized metal ion affinity chromatography were employed to isolate phosphopeptides from wild-growing Selaginella. Using liquid chromatography-tandem mass spectrometry analysis, 1593 unique phosphopeptides spanning 1104 non-redundant phosphosites with confirmed localization on 716 phosphoproteins were identified. Analysis of the Selaginella dataset revealed features that are consistent with other plant phosphoproteomes, such as the relative proportions of phosphorylated Ser, Thr, and Tyr residues, the highest occurrence of phosphosites in the C-terminal regions of proteins, and the localization of phosphorylation events outside protein domains. In addition, a total of 97 highly conserved phosphosites in evolutionary conserved proteins were identified, indicating the conservation of phosphorylation-dependent regulatory mechanisms in phylogenetically distinct plant species. On the other hand, close examination of proteins involved in photosynthesis revealed phosphorylation events which may be unique to Selaginella evolution. Furthermore, phosphorylation motif analyses identified Pro-directed, acidic, and basic signatures which are recognized by typical protein kinases in plants. A group of Selaginella-specific phosphoproteins were found to be enriched in the Pro-directed motif class.
Conclusions: Our work provides the first large-scale atlas of phosphoproteins in Selaginella, which occupies a unique position in the evolution of terrestrial plants. Future research into the functional roles of Selaginella-specific phosphorylation events in photosynthesis and other processes may offer insight into the molecular mechanisms leading to the distinct evolution of lycophytes. PMID:24628833
A Matter of Time: Faster Percolator Analysis via Efficient SVM Learning for Large-Scale Proteomics.

PubMed

Halloran, John T; Rocke, David M

2018-05-04

Percolator is an important tool for greatly improving the results of a database search and subsequent downstream analysis. Using support vector machines (SVMs), Percolator recalibrates peptide-spectrum matches based on the learned decision boundary between targets and decoys. To improve analysis time for large-scale data sets, we update Percolator's SVM learning engine through software and algorithmic optimizations rather than heuristic approaches that necessitate the careful study of their impact on learned parameters across different search settings and data sets. We show that by optimizing Percolator's original learning algorithm, l2-SVM-MFN, large-scale SVM learning requires only about a third of the original runtime. Furthermore, we show that by employing the widely used Trust Region Newton (TRON) algorithm instead of l2-SVM-MFN, large-scale Percolator SVM learning is reduced to only about a fifth of the original runtime. Importantly, these speedups only affect the speed at which Percolator converges to a global solution and do not alter recalibration performance. The upgraded versions of both l2-SVM-MFN and TRON are optimized within the Percolator codebase for multithreaded and single-thread use and are available under Apache license at bitbucket.org/jthalloran/percolator_upgrade.
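The idea of learning a linear decision boundary between targets and decoys with an L2-loss (squared hinge) SVM can be sketched in a few lines of Python. This is only an illustration under stated assumptions: plain gradient descent is swapped in for the Newton-type solvers named in the abstract, and the function names, hyperparameters, and two-feature toy data are hypothetical rather than taken from Percolator.

```python
def score(w, b, x):
    """Linear score of one PSM feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_l2_svm(features, labels, reg=0.01, lr=0.05, epochs=2000):
    """Minimise reg/2 * ||w||^2 + mean(max(0, 1 - y * score)^2) by gradient descent.

    labels are +1 for targets and -1 for decoys.
    """
    n = len(features)
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [reg * wi for wi in w]  # gradient of the regulariser
        grad_b = 0.0
        for x, y in zip(features, labels):
            slack = 1.0 - y * score(w, b, x)
            if slack > 0:  # only margin violators contribute to the loss
                coeff = -2.0 * slack * y / n
                for j, xj in enumerate(x):
                    grad_w[j] += coeff * xj
                grad_b += coeff
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


if __name__ == "__main__":
    # Hypothetical PSM features: [search-engine score, some second feature].
    features = [[2.0, 1.0], [1.5, 0.8], [2.2, 1.1],   # targets
                [0.2, 0.9], [0.1, 1.2], [0.4, 1.0]]   # decoys
    labels = [1, 1, 1, -1, -1, -1]
    w, b = train_l2_svm(features, labels)
    for x, y in zip(features, labels):
        print(y, round(score(w, b, x), 3))
```

Because the objective is convex, a faster solver reaches the same minimiser as a slower one, which is consistent with the abstract's statement that the speedups do not alter recalibration performance.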
Comparative Study of Human and Mouse Postsynaptic Proteomes Finds High Compositional Conservation and Abundance Differences for Key Synaptic Proteins

PubMed Central

Bayés, Àlex; Collins, Mark O.; Croning, Mike D. R.; van de Lagemaat, Louie N.; Choudhary, Jyoti S.; Grant, Seth G. N.

2012-01-01

Direct comparison of protein components from human and mouse excitatory synapses is important for determining the suitability of mice as models of human brain disease and to understand the evolution of the mammalian brain. The postsynaptic density is a highly complex set of proteins organized into molecular networks that play a central role in behavior and disease. We report the first direct comparison of the proteome of triplicate isolates of mouse and human cortical postsynaptic densities. The mouse postsynaptic density comprised 1556 proteins and the human one 1461. A large compositional overlap was observed; more than 70% of human postsynaptic density proteins were also observed in the mouse postsynaptic density. Quantitative analysis of postsynaptic density components in both species indicates a broadly similar profile of abundance but also shows that there is higher abundance variation between species than within species. Well known components of this synaptic structure are generally more abundant in the mouse postsynaptic density. Significant inter-species abundance differences exist in some families of key postsynaptic density proteins including glutamatergic neurotransmitter receptors and adaptor proteins. Furthermore, we have identified a closely interacting set of molecules enriched in the human postsynaptic density that could be involved in dendrite and spine structural plasticity. Understanding synapse proteome diversity within and between species will be important to further our understanding of brain complexity and disease. PMID:23071613
Evolutionary conservation of the polyproline II conformation surrounding intrinsically disordered phosphorylation sites.

PubMed

Elam, W Austin; Schrank, Travis P; Campagnolo, Andrew J; Hilser, Vincent J

2013-04-01

Intrinsically disordered (ID) proteins function in the absence of a unique stable structure and appear to challenge the classic structure-function paradigm. The extent to which ID proteins take advantage of subtle conformational biases to perform functions, and whether signals for such mechanism can be identified in proteome-wide studies is not well understood. Of particular interest is the polyproline II (PII) conformation, suggested to be highly populated in unfolded proteins. We experimentally determine a complete calorimetric propensity scale for the PII conformation. Projection of the scale into representative eukaryotic proteomes reveals significant PII bias in regions coding for ID proteins. Importantly, enrichment of PII in ID proteins, or protein segments, is also captured by other PII scales, indicating that this enrichment is robustly encoded and universally detectable regardless of the method of PII propensity determination. Gene ontology (GO) terms obtained using our PII scale and other scales demonstrate a consensus for molecular functions performed by high PII proteins across the proteome. Perhaps the most striking result of the GO analysis is conserved enrichment (P < 10^-8) of phosphorylation sites in high PII regions found by all PII scales. Subsequent conformational analysis reveals a phosphorylation-dependent modulation of PII, suggestive of a conserved "tunability" within these regions. In summary, the application of an experimentally determined polyproline II (PII) propensity scale to proteome-wide sequence analysis and gene ontology reveals an enrichment of PII bias near disordered phosphorylation sites that is conserved throughout eukaryotes. Copyright © 2013 The Protein Society.
Just enough inflation: power spectrum modifications at large scales

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cicoli, Michele; Downes, Sean; Dutta, Bhaskar

2014-12-01

We show that models of 'just enough' inflation, where the slow-roll evolution lasted only 50-60 e-foldings, feature modifications of the CMB power spectrum at large angular scales. We perform a systematic analytic analysis in the limit of a sudden transition between any possible non-slow-roll background evolution and the final stage of slow-roll inflation. We find a high degree of universality since most common backgrounds like fast-roll evolution, matter or radiation-dominance give rise to a power loss at large angular scales and a peak together with an oscillatory behaviour at scales around the value of the Hubble parameter at the beginning of slow-roll inflation. Depending on the value of the equation of state parameter, different pre-inflationary epochs lead instead to an enhancement of power at low ℓ, and so seem disfavoured by recent observational hints for a lack of CMB power at ℓ ≲ 40. We also comment on the importance of initial conditions and the possibility to have multiple pre-inflationary stages.
Combinatorial depletion analysis to assemble the network architecture of the SAGA and ADA chromatin remodeling complexes.

PubMed

Lee, Kenneth K; Sardiu, Mihaela E; Swanson, Selene K; Gilmore, Joshua M; Torok, Michael; Grant, Patrick A; Florens, Laurence; Workman, Jerry L; Washburn, Michael P

2011-07-05

Despite the availability of several large-scale proteomics studies aiming to identify protein interactions on a global scale, little is known about how proteins interact and are organized within macromolecular complexes. Here, we describe a technique that consists of a combination of biochemistry approaches, quantitative proteomics and computational methods using wild-type and deletion strains to investigate the organization of proteins within macromolecular protein complexes. We applied this technique to determine the organization of two well-studied complexes, Spt-Ada-Gcn5 histone acetyltransferase (SAGA) and ADA, for which no comprehensive high-resolution structures exist. This approach revealed that SAGA/ADA is composed of five distinct functional modules, which can persist separately. Furthermore, we identified a novel subunit of the ADA complex, termed Ahc2, and characterized Sgf29 as an ADA family protein present in all Gcn5 histone acetyltransferase complexes. Finally, we propose a model for the architecture of the SAGA and ADA complexes, which predicts novel functional associations within the SAGA complex and provides mechanistic insights into phenotypical observations in SAGA mutants.
Combinatorial depletion analysis to assemble the network architecture of the SAGA and ADA chromatin remodeling complexes

PubMed Central

Lee, Kenneth K; Sardiu, Mihaela E; Swanson, Selene K; Gilmore, Joshua M; Torok, Michael; Grant, Patrick A; Florens, Laurence; Workman, Jerry L; Washburn, Michael P

2011-01-01

Despite the availability of several large-scale proteomics studies aiming to identify protein interactions on a global scale, little is known about how proteins interact and are organized within macromolecular complexes. Here, we describe a technique that consists of a combination of biochemistry approaches, quantitative proteomics and computational methods using wild-type and deletion strains to investigate the organization of proteins within macromolecular protein complexes. We applied this technique to determine the organization of two well-studied complexes, Spt–Ada–Gcn5 histone acetyltransferase (SAGA) and ADA, for which no comprehensive high-resolution structures exist. This approach revealed that SAGA/ADA is composed of five distinct functional modules, which can persist separately. Furthermore, we identified a novel subunit of the ADA complex, termed Ahc2, and characterized Sgf29 as an ADA family protein present in all Gcn5 histone acetyltransferase complexes. Finally, we propose a model for the architecture of the SAGA and ADA complexes, which predicts novel functional associations within the SAGA complex and provides mechanistic insights into phenotypical observations in SAGA mutants. PMID:21734642
Advances in Proteomics Data Analysis and Display Using an Accurate Mass and Time Tag Approach

PubMed Central

Zimmer, Jennifer S.D.; Monroe, Matthew E.; Qian, Wei-Jun; Smith, Richard D.

2007-01-01

Proteomics has recently demonstrated utility in understanding cellular processes on the molecular level as a component of systems biology approaches and for identifying potential biomarkers of various disease states. The large amount of data generated by utilizing high efficiency (e.g., chromatographic) separations coupled to high mass accuracy mass spectrometry for high-throughput proteomics analyses presents challenges related to data processing, analysis, and display. This review focuses on recent advances in nanoLC-FTICR-MS-based proteomics approaches and the accompanying data processing tools that have been developed to display and interpret the large volumes of data being produced. PMID:16429408
Supporting Source Code Comprehension during Software Evolution and Maintenance

ERIC Educational Resources Information Center

Alhindawi, Nouh

2013-01-01

This dissertation addresses the problems of program comprehension to support the evolution of large-scale software systems. The research concerns how software engineers locate features and concepts along with categorizing changes within very large bodies of source code along with their versioned histories. More specifically, advanced Information…
Proteomic analysis reveals O-GlcNAc modification on proteins with key regulatory functions in Arabidopsis.

PubMed

Xu, Shou-Ling; Chalkley, Robert J; Maynard, Jason C; Wang, Wenfei; Ni, Weimin; Jiang, Xiaoyue; Shin, Kihye; Cheng, Ling; Savage, Dasha; Hühmer, Andreas F R; Burlingame, Alma L; Wang, Zhi-Yong

2017-02-21

Genetic studies have shown essential functions of O-linked N-acetylglucosamine (O-GlcNAc) modification in plants. However, the proteins and sites subject to this posttranslational modification are largely unknown. Here, we report a large-scale proteomic identification of O-GlcNAc-modified proteins and sites in the model plant Arabidopsis thaliana. Using lectin weak affinity chromatography to enrich modified peptides, followed by mass spectrometry, we identified 971 O-GlcNAc-modified peptides belonging to 262 proteins. The modified proteins are involved in cellular regulatory processes, including transcription, translation, epigenetic gene regulation, and signal transduction. Many proteins have functions in developmental and physiological processes specific to plants, such as hormone responses and flower development. Mass spectrometric analysis of phosphopeptides from the same samples showed that a large number of peptides could be modified by either O-GlcNAcylation or phosphorylation, but cooccurrence of the two modifications in the same peptide molecule was rare. Our study generates a snapshot of the O-GlcNAc modification landscape in plants, indicating functions in many cellular regulation pathways and providing a powerful resource for further dissecting these functions at the molecular level.
MIGHTEE: The MeerKAT International GHz Tiered Extragalactic Exploration

NASA Astrophysics Data System (ADS)

Taylor, A. Russ; Jarvis, Matt

2017-05-01

The MeerKAT telescope is the precursor of the Square Kilometre Array mid-frequency dish array to be deployed later this decade on the African continent. MIGHTEE is one of the MeerKAT large survey projects designed to pathfind SKA key science in cosmology and galaxy evolution. Through a tiered radio continuum deep imaging project including several fields totaling 20 square degrees to microJy sensitivities and an ultra-deep image of a single 1 square degree field of view, MIGHTEE will explore dark matter and large scale structure, the evolution of galaxies, including AGN activity and star formation as a function of cosmic time and environment, the emergence and evolution of magnetic fields in galaxies, and the magnetic counterpart to large scale structure of the universe.
A Proteomic Workflow Using High-Throughput De Novo Sequencing Towards Complementation of Genome Information for Improved Comparative Crop Science.

PubMed

Turetschek, Reinhard; Lyon, David; Desalegn, Getinet; Kaul, Hans-Peter; Wienkoop, Stefanie

2016-01-01

The proteomic study of non-model organisms, such as many crop plants, is challenging due to the lack of comprehensive genome information. Changing environmental conditions require the study and selection of adapted cultivars. Mutations, inherent to cultivars, hamper protein identification and thus considerably complicate the qualitative and quantitative comparison in large-scale systems biology approaches. With this workflow, cultivar-specific mutations are detected from high-throughput comparative MS analyses, by extracting sequence polymorphisms with de novo sequencing. Stringent criteria are suggested to filter for high-confidence mutations. Subsequently, these polymorphisms complement the initially used database, which is ready to use with any preferred database search algorithm. In our example, we thereby identified 26 specific mutations in two cultivars of Pisum sativum and achieved an increased number (17%) of peptide spectrum matches.
Culture rather than genes provides greater scope for the evolution of large-scale human prosociality

PubMed Central

Bell, Adrian V.; Richerson, Peter J.; McElreath, Richard

2009-01-01

Whether competition among large groups played an important role in human social evolution is dependent on how variation, whether cultural or genetic, is maintained between groups. Comparisons between genetic and cultural differentiation between neighboring groups show how natural selection on large groups is more plausible on cultural rather than genetic variation. PMID:19822753
Quantitative proteomics reveals a role of JAZ7 in plant defense response to Pseudomonas syringae DC3000.

PubMed

Zhang, Tong; Meng, Li; Kong, Wenwen; Yin, Zepeng; Wang, Yang; Schneider, Jacqueline D; Chen, Sixue

2018-03-20

Jasmonate ZIM-domain (JAZ) proteins are key transcriptional repressors regulating various biological processes. Although many studies have studied JAZ proteins by genetic and biochemical analyses, little is known about JAZ7-associated global protein networks and how JAZ7 contributes to bacterial pathogen defense. In this study, we aim to fill this knowledge gap by conducting unbiased large-scale quantitative proteomics using tandem mass tags (TMT). We compared the proteomes of a JAZ7 knock-out line, a JAZ7 overexpression line, as well as the wild type Arabidopsis plants in the presence and absence of Pseudomonas syringae DC3000 infection. Both pairwise comparison and multi-factor analysis of variance reveal that differential proteins are enriched in biological processes such as primary and secondary metabolism, redox regulation, and response to stress. The differential regulation in these pathways may account for the alterations in plant size, redox homeostasis and accumulation of glucosinolates. In addition, possible interplay between genotype and environment is suggested as the abundance of seven proteins is influenced by the interaction of the two factors. Collectively, we demonstrate a role of JAZ7 in pathogen defense and provide a list of proteins that are uniquely responsive to genetic disruption, pathogen infection, or the interaction between genotypes and environmental factors. We report proteomic changes as a result of genetic perturbation of JAZ7, and the contribution of JAZ7 in plant immunity. Specifically, the similarity between the proteomes of a JAZ7 knockout mutant and the wild type plants confirmed the functional redundancy of JAZs. In contrast, JAZ7 overexpression plants were much different, and proteomic analysis of the JAZ7 overexpression plants under Pst DC3000 infection revealed that JAZ7 may regulate plant immunity via ROS modulation, energy balance and glucosinolate biosynthesis. 
Multiple variate analysis for this two-factor proteomics experiment suggests that protein abundance is determined by genotype, environment and the interaction between them. Copyright © 2018 Elsevier B.V. All rights reserved.
Annotation of Protein Domains Reveals Remarkable Conservation in the Functional Make up of Proteomes Across Superkingdoms

PubMed Central

Nasir, Arshan; Naeem, Aisha; Khan, Muhammad Jawad; Lopez-Nicora, Horacio D.; Caetano-Anollés, Gustavo

2011-01-01

The functional repertoire of a cell is largely embodied in its proteome, the collection of proteins encoded in the genome of an organism. The molecular functions of proteins are the direct consequence of their structure and structure can be inferred from sequence using hidden Markov models of structural recognition. Here we analyze the functional annotation of protein domain structures in almost a thousand sequenced genomes, exploring the functional and structural diversity of proteomes. We find there is a remarkable conservation in the distribution of domains with respect to the molecular functions they perform in the three superkingdoms of life. In general, most of the protein repertoire is spent in functions related to metabolic processes but there are significant differences in the usage of domains for regulatory and extra-cellular processes both within and between superkingdoms. Our results support the hypotheses that the proteomes of superkingdom Eukarya evolved via genome expansion mechanisms that were directed towards innovating new domain architectures for regulatory and extra/intracellular process functions needed for example to maintain the integrity of multicellular structure or to interact with environmental biotic and abiotic factors (e.g., cell signaling and adhesion, immune responses, and toxin production). Proteomes of microbial superkingdoms Archaea and Bacteria retained fewer numbers of domains and maintained simple and smaller protein repertoires. Viruses appear to play an important role in the evolution of superkingdoms. We finally identify few genomic outliers that deviate significantly from the conserved functional design. These include Nanoarchaeum equitans, proteobacterial symbionts of insects with extremely reduced genomes, Tenericutes and Guillardia theta. 
These organisms spend most of their domains on information functions, including translation and transcription, rather than on metabolism and harbor a domain repertoire characteristic of parasitic organisms. In contrast, the functional repertoire of the proteomes of the Planctomycetes-Verrucomicrobia-Chlamydiae superphylum was no different than the rest of bacteria, failing to support claims of them representing a separate superkingdom. In turn, Protista and Bacteria shared similar functional distribution patterns suggesting an ancestral evolutionary link between these groups. PMID:24710297
Bioinformatics strategies in life sciences: from data processing and data warehousing to biological knowledge extraction.

PubMed

Thiele, Herbert; Glandorf, Jörg; Hufnagel, Peter

2010-05-27

With the large variety of Proteomics workflows, as well as the large variety of instruments and data-analysis software available, researchers today face major challenges validating and comparing their Proteomics data. Here we present a new generation of the ProteinScape bioinformatics platform, now enabling researchers to manage Proteomics data from the generation and data warehousing to a central data repository with a strong focus on the improved accuracy, reproducibility and comparability demanded by many researchers in the field. It addresses scientists' current needs in proteomics identification, quantification and validation. But producing large protein lists is not the end point in Proteomics, where one ultimately aims to answer specific questions about the biological condition or disease model of the analyzed sample. In this context, a new tool has been developed at the Spanish Centro Nacional de Biotecnologia Proteomics Facility termed PIKE (Protein information and Knowledge Extractor) that allows researchers to control, filter and access specific information from genomics and proteomic databases, to understand the role and relationships of the proteins identified in the experiments. Additionally, an EU funded project, ProDac, has coordinated systematic data collection in public standards-compliant repositories like PRIDE. This will cover all aspects from generating MS data in the laboratory, assembling the whole annotation information and storing it together with identifications in a standardised format.
Proteomics research in India: an update.

PubMed

Reddy, Panga Jaipal; Atak, Apurva; Ghantasala, Saicharan; Kumar, Saurabh; Gupta, Shabarni; Prasad, T S Keshava; Zingde, Surekha M; Srivastava, Sanjeeva

2015-09-08

After a successful completion of the Human Genome Project, deciphering the mystery surrounding the human proteome posed a major challenge. Despite not being largely involved in the Human Genome Project, the Indian scientific community contributed towards proteomic research along with the global community. Currently, more than 76 research/academic institutes and nearly 145 research labs are involved in core proteomic research across India. The Indian researchers have been major contributors in drafting the "human proteome map" along with international efforts. In addition to this, virtual proteomics labs, proteomics courses and remote triggered proteomics labs have helped to overcome the limitations of proteomics education posed due to expensive lab infrastructure. The establishment of Proteomics Society, India (PSI) has created a platform for the Indian proteomic researchers to share ideas, research collaborations and conduct annual conferences and workshops. Indian proteomic research is really moving forward with the global proteomics community in a quest to solve the mysteries of proteomics. A draft map of the human proteome enhances the enthusiasm among intellectuals to promote proteomic research in India to the world. This article is part of a Special Issue entitled: Proteomics in India. Copyright © 2015 Elsevier B.V. All rights reserved.
A pursuit of lineage-specific and niche-specific proteome features in the world of archaea

PubMed Central

2012-01-01

Background: Archaea evoke interest among researchers for two enigmatic characteristics: a combination of bacterial and eukaryotic components in their molecular architectures and an enormous diversity in their life-style and metabolic capabilities. Despite considerable research efforts, lineage-specific/niche-specific molecular features of the whole archaeal world are yet to be fully unveiled. The study offers the first large-scale in silico proteome analysis of all archaeal species of known genome sequences with a special emphasis on methanogenic and sulphur-metabolising archaea. Results: Overall amino acid usage in archaea is dominated by GC-bias. But the environmental factors like oxygen requirement or thermal adaptation seem to play important roles in selection of residues with no GC-bias at the codon level. All methanogens, irrespective of their thermal/salt adaptation, show higher usage of Cys and have relatively acidic proteomes, while the proteomes of sulphur-metabolisers have higher aromaticity and more positive charges. Despite exhibiting a thermophilic life-style, korarchaeota possesses an acidic proteome. Among the distinct trends prevailing in COGs (Cluster of Orthologous Groups of proteins) distribution profiles, crenarchaeal organisms display higher intra-order variations in COGs repertoire, especially in the metabolic ones, as compared to euryarchaea. All methanogens are characterised by a presence of 22 exclusive COGs. Conclusions: Divergences in amino acid usage, aromaticity/charge profiles and COG repertoire among methanogens and sulphur-metabolisers, aerobic and anaerobic archaea or korarchaeota and nanoarchaeota, as elucidated in the present study, point towards the presence of distinct molecular strategies for niche specialization in the archaeal world. PMID:22691113
A pursuit of lineage-specific and niche-specific proteome features in the world of archaea.

PubMed

Roy Chowdhury, Anindya; Dutta, Chitra

2012-06-12

Archaea evoke interest among researchers for two enigmatic characteristics: a combination of bacterial and eukaryotic components in their molecular architectures and an enormous diversity in their life-style and metabolic capabilities. Despite considerable research efforts, lineage-specific/niche-specific molecular features of the whole archaeal world are yet to be fully unveiled. The study offers the first large-scale in silico proteome analysis of all archaeal species of known genome sequences with a special emphasis on methanogenic and sulphur-metabolising archaea. Overall amino acid usage in archaea is dominated by GC-bias. But the environmental factors like oxygen requirement or thermal adaptation seem to play important roles in selection of residues with no GC-bias at the codon level. All methanogens, irrespective of their thermal/salt adaptation, show higher usage of Cys and have relatively acidic proteomes, while the proteomes of sulphur-metabolisers have higher aromaticity and more positive charges. Despite exhibiting a thermophilic life-style, korarchaeota possesses an acidic proteome. Among the distinct trends prevailing in COGs (Cluster of Orthologous Groups of proteins) distribution profiles, crenarchaeal organisms display higher intra-order variations in COGs repertoire, especially in the metabolic ones, as compared to euryarchaea. All methanogens are characterised by a presence of 22 exclusive COGs. Divergences in amino acid usage, aromaticity/charge profiles and COG repertoire among methanogens and sulphur-metabolisers, aerobic and anaerobic archaea or korarchaeota and nanoarchaeota, as elucidated in the present study, point towards the presence of distinct molecular strategies for niche specialization in the archaeal world.
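The proteome-level quantities named in this abstract (Cys usage, aromaticity, charge) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `composition_summary` is hypothetical, aromaticity is assumed to be the combined fraction of Phe, Trp and Tyr, and the charge measure is a crude proxy, (Lys + Arg) minus (Asp + Glu) per residue, that ignores His and pH.

```python
from collections import Counter

# Residue classes used by this sketch (assumptions, not taken from the paper).
AROMATIC = "FWY"   # Phe, Trp, Tyr
POSITIVE = "KR"    # Lys, Arg
NEGATIVE = "DE"    # Asp, Glu


def composition_summary(proteins):
    """Summarise pooled amino acid usage over an iterable of protein sequences."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq.upper())
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no residues supplied")

    def frac(residues):
        # Fraction of all residues that belong to the given class.
        return sum(counts[r] for r in residues) / total

    return {
        "cys": frac("C"),
        "aromaticity": frac(AROMATIC),
        # Negative values indicate a relatively acidic proteome under this proxy.
        "net_charge_per_residue": frac(POSITIVE) - frac(NEGATIVE),
    }


if __name__ == "__main__":
    # Toy input; a real analysis would read every protein of a genome from FASTA.
    print(composition_summary(["CDEK", "FWYR"]))
```

Comparing these three numbers between groups of genomes (for example methanogens versus sulphur-metabolisers) is the kind of contrast the abstract describes; any thresholds or statistics would have to come from the paper itself.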
A Large-Scale Quantitative Proteomic Approach to Identifying Sulfur Mustard-Induced Protein Phosphorylation Cascades

DTIC Science & Technology

2010-01-01

snapshot of SM-induced toxicity. Over the past few years, innovations in systems biology and biotechnology have led to important advances in our under...perturbations. SILAC has been used to study tumor metastasis (3, 4), focal adhesion-associated proteins, growth factor signaling, and insulin regulation (5...stained with colloidal Coomassie blue. After it was destained, the gel lane was excised into six regions, and each region was cut into 1 mm cubes

A Proteomic Analysis of Eccrine Sweat: Implications for the Discovery of Schizophrenia Biomarker Proteins

PubMed Central

Raiszadeh, Michelle M.; Ross, Mark M.; Russo, Paul S.; Schaepper, Mary Ann H.; Zhou, Weidong; Deng, Jianghong; Ng, Daniel; Dickson, April; Dickson, Cindy; Strom, Monica; Osorio, Carolina; Soeprono, Thomas; Wulfkuhle, Julia D.; Kabbani, Nadine; Petricoin, Emanuel F.; Liotta, Lance A.; Kirsch, Wolff M.

2012-01-01

Liquid chromatography tandem mass spectrometry (LC-MS/MS) and multiple reaction monitoring mass spectrometry (MRM-MS) proteomics analyses were performed on eccrine sweat of healthy controls, and the results were compared with those from individuals diagnosed with schizophrenia (SZ). This is the first large scale study of the sweat proteome. First, we performed LC-MS/MS on pooled SZ samples and pooled control samples for global proteomics analysis. Results revealed a high abundance of diverse proteins and peptides in eccrine sweat. Most of the proteins identified from sweat samples were found to be different than the most abundant proteins from serum, which indicates that eccrine sweat is not simply a plasma transudate, and may thereby be a source of unique disease-associated biomolecules. A second independent set of patient and control sweat samples were analyzed by LC-MS/MS and spectral counting to determine qualitative protein differential abundances between the control and disease groups. Differential abundances of selected proteins, initially determined by spectral counting, were verified by MRM-MS analyses. Seventeen proteins showed a differential abundance of approximately two-fold or greater between the SZ pooled sample and the control pooled sample. This study demonstrates the utility of LC-MS/MS and MRM-MS as a viable strategy for the discovery and verification of potential sweat protein disease biomarkers. PMID:22256890
Global analysis of the rat and human platelet proteome – the molecular blueprint for illustrating multi-functional platelets and cross-species function evolution

PubMed Central

Yu, Yanbao; Leng, Taohua; Yun, Dong; Liu, Na; Yao, Jun; Dai, Ying; Yang, Pengyuan; Chen, Xian

2013-01-01

Emerging evidence indicates that blood platelets function in multiple biological processes including immune response, bone metastasis and liver regeneration in addition to their known roles in hemostasis and thrombosis. Global elucidation of the platelet proteome will provide the molecular basis of these platelet functions. Here, we set up a high-throughput platform for maximum exploration of the rat/human platelet proteome using integrated proteomics technologies, and then applied it to identify the largest number of proteins expressed in both rat and human platelets. After stringent statistical filtration, a total of 837 unique proteins matched with at least two unique peptides were precisely identified, making it the first comprehensive protein database so far for rat platelets. Meanwhile, quantitative analyses of the thrombin-stimulated platelets offered great insights into the biological functions of platelet proteins and therefore confirmed our global profiling data. A comparative proteomic analysis between rat and human platelets was also conducted, which revealed not only a significant similarity, but also a cross-species evolutionary link in which the orthologous proteins represent the ‘core proteome’, and the ‘evolutionary proteome’ is actually a relatively static proteome. PMID:20443191
An Investigation of the Large Scale Evolution and Topology of Coronal Mass Ejections in the Solar Wind

NASA Technical Reports Server (NTRS)

Riley, Peter

2000-01-01

This investigation is concerned with the large-scale evolution and topology of coronal mass ejections (CMEs) in the solar wind. During this reporting period we have focused on several aspects of CME properties, their identification and their evolution in the solar wind. The work included both analysis of Ulysses and ACE observations as well as fluid and magnetohydrodynamic simulations. In addition, we analyzed a series of "density holes" observed in the solar wind, that bear many similarities with CMEs. Finally, this work was communicated to the scientific community at three meetings and has led to three scientific papers that are in various stages of review.
The Observations of Redshift Evolution in Large Scale Environments (ORELSE) Survey

NASA Astrophysics Data System (ADS)

Squires, Gordon K.; Lubin, L. M.; Gal, R. R.

2007-05-01

We present the motivation, design, and latest results from the Observations of Redshift Evolution in Large Scale Environments (ORELSE) Survey, a systematic search for structure on scales greater than 10 Mpc around 20 known galaxy clusters at z > 0.6. When complete, the survey will cover nearly 5 square degrees, all targeted at high-density regions, making it complementary and comparable to field surveys such as DEEP2, GOODS, and COSMOS. For the survey, we are using the Large Format Camera on the Palomar 5-m and SuPRIME-Cam on the Subaru 8-m to obtain optical/near-infrared imaging of an approximately 30 arcmin region around previously studied high-redshift clusters. Colors are used to identify likely member galaxies which are targeted for follow-up spectroscopy with the DEep Imaging Multi-Object Spectrograph on the Keck 10-m. This technique has been used to identify successfully the Cl 1604 supercluster at z = 0.9, a large scale structure containing at least eight clusters (Gal & Lubin 2004; Gal, Lubin & Squires 2005). We present the most recent structures to be photometrically and spectroscopically confirmed through this program, discuss the properties of the member galaxies as a function of environment, and describe our planned multi-wavelength (radio, mid-IR, and X-ray) observations of these systems. The goal of this survey is to identify and examine a statistical sample of large scale structures during an active period in the assembly history of the most massive clusters. With such a sample, we can begin to constrain large scale cluster dynamics and determine the effect of the larger environment on galaxy evolution.
Computational clustering for viral reference proteomes

PubMed Central

Chen, Chuming; Huang, Hongzhan; Mazumder, Raja; Natale, Darren A.; McGarvey, Peter B.; Zhang, Jian; Polson, Shawn W.; Wang, Yuqi; Wu, Cathy H.

2016-01-01

Motivation: The enormous number of redundant sequenced genomes has hindered efforts to analyze and functionally annotate proteins. As the taxonomy of viruses is not uniformly defined, viral proteomes pose special challenges in this regard. Grouping viruses based on the similarity of their proteins at proteome scale can normalize against potential taxonomic nomenclature anomalies. Results: We present Viral Reference Proteomes (Viral RPs), which are computed from complete virus proteomes within UniProtKB. Viral RPs based on 95, 75, 55, 35 and 15% co-membership in proteome similarity based clusters are provided. Comparison of our computational Viral RPs with UniProt’s curator-selected Reference Proteomes indicates that the two sets are consistent and complementary. Furthermore, each Viral RP represents a cluster of virus proteomes that was consistent with virus or host taxonomy. We provide BLASTP search and FTP download of Viral RP protein sequences, and a browser to facilitate the visualization of Viral RPs. Availability and implementation: http://proteininformationresource.org/rps/viruses/ Contact: chenc@udel.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153712
Noumeavirus replication relies on a transient remote control of the host nucleus

PubMed Central

Fabre, Elisabeth; Jeudy, Sandra; Santini, Sébastien; Legendre, Matthieu; Trauchessec, Mathieu; Couté, Yohann; Claverie, Jean-Michel; Abergel, Chantal

2017-01-01

Acanthamoeba are infected by a remarkable diversity of large dsDNA viruses, the infectious cycles of which have been characterized using genomics, transcriptomics and electron microscopy. Given their gene content and the persistence of the host nucleus throughout their infectious cycle, the Marseilleviridae were initially assumed to fully replicate in the cytoplasm. Unexpectedly, we find that their virions do not incorporate the virus-encoded transcription machinery, making their replication nucleus-dependent. However, instead of delivering their DNA to the nucleus, the Marseilleviridae initiate their replication by transiently recruiting the nuclear transcription machinery to their cytoplasmic viral factory. The nucleus recovers its integrity after becoming leaky at an early stage. This work highlights the importance of virion proteomic analyses to complement genome sequencing in the elucidation of the replication scheme and evolution of large dsDNA viruses. PMID:28429720
Evaluation of a Genome-Scale In Silico Metabolic Model for Geobacter metallireducens Using Proteomic Data from a Field Biostimulation Experiment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fang, Yilin; Wilkins, Michael J.; Yabusaki, Steven B.

2012-12-12

Biomass and shotgun global proteomics data that reflected relative protein abundances from samples collected during the 2008 experiment at the U.S. Department of Energy Integrated Field-Scale Subsurface Research Challenge site in Rifle, Colorado, provided an unprecedented opportunity to validate a genome-scale metabolic model of Geobacter metallireducens and assess its performance with respect to prediction of metal reduction, biomass yield, and growth rate under dynamic field conditions. Reconstructed from annotated genomic sequence, biochemical, and physiological data, the constraint-based in silico model of G. metallireducens relates an annotated genome sequence to the physiological functions with 697 reactions controlled by 747 enzyme-coding genes. Proteomic analysis showed that 180 of the 637 G. metallireducens proteins detected during the 2008 experiment were associated with specific metabolic reactions in the in silico model. When the field-calibrated Fe(III) terminal electron acceptor process reaction in a reactive transport model for the field experiments was replaced with the genome-scale model, the model predicted that the largest metabolic fluxes through the in silico model reactions generally correspond to the highest abundances of proteins that catalyze those reactions. Central metabolism predicted by the model agrees well with protein abundance profiles inferred from proteomic analysis. Model discrepancies with the proteomic data, such as the relatively low fluxes through amino acid transport and metabolism, revealed pathways or flux constraints in the in silico model that could be updated to more accurately predict metabolic processes that occur in the subsurface environment.
TRDistiller: a rapid filter for enrichment of sequence datasets with proteins containing tandem repeats.

PubMed

Richard, François D; Kajava, Andrey V

2014-06-01

The dramatic growth of sequencing data evokes an urgent need to improve bioinformatics tools for large-scale proteome analysis. Over the last two decades, the foremost efforts of computer scientists were devoted to proteins with aperiodic sequences having globular 3D structures. However, a large portion of proteins contain periodic sequences representing arrays of repeats that are directly adjacent to each other (so called tandem repeats or TRs). These proteins frequently fold into elongated fibrous structures carrying different fundamental functions. Algorithms specific to the analysis of these regions are urgently required since the conventional approaches developed for globular domains have had limited success when applied to the TR regions. The protein TRs are frequently not perfect, containing a number of mutations, and some of them cannot be easily identified. To detect such "hidden" repeats several algorithms have been developed. However, the most sensitive among them are time-consuming and, therefore, inappropriate for large scale proteome analysis. To speed up the TR detection we developed a rapid filter that is based on the comparison of composition and order of short strings in the adjacent sequence motifs. Tests show that our filter discards up to 22.5% of proteins which are known to be without TRs while keeping almost all (99.2%) TR-containing sequences. Thus, we are able to decrease the size of the initial sequence dataset enriching it with TR-containing proteins which allows a faster subsequent TR detection by other methods. The program is available upon request. Copyright © 2014 Elsevier Inc. All rights reserved.
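The idea of a cheap pre-filter that flags sequences in which short strings recur close together can be sketched as follows. This is only an illustration of the general principle, not the TRDistiller algorithm, which the abstract does not specify in detail; the function names and the parameters `k`, `max_gap` and `min_hits` are hypothetical choices made for this sketch.

```python
def has_nearby_repeat(seq, k=3, max_gap=50, min_hits=3):
    """Return True if at least `min_hits` k-mers recur within `max_gap` residues.

    A recurrence counts only when the two occurrences do not overlap
    (distance >= k), so homopolymer runs alone do not trigger the filter.
    """
    last_seen = {}  # k-mer -> start index of its most recent occurrence
    hits = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        prev = last_seen.get(kmer)
        if prev is not None and k <= i - prev <= max_gap:
            hits += 1
            if hits >= min_hits:
                return True
        last_seen[kmer] = i
    return False


def prefilter(records, **kwargs):
    """Keep the names of (name, sequence) records that pass the repeat check."""
    return [name for name, seq in records if has_nearby_repeat(seq, **kwargs)]


if __name__ == "__main__":
    toy = [
        ("repeat_like", "ACDEFGHIKL" * 3),        # three adjacent copies of a 10-mer
        ("no_repeat", "ACDEFGHIKLMNPQRSTVWY"),   # every 3-mer occurs once
    ]
    print(prefilter(toy))
```

A single linear pass with a dictionary keeps the cost proportional to sequence length, which is the property that makes such a filter usable before slower, more sensitive repeat detectors; the retained set would then be handed to those tools.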
Compositional complexity of the mitochondrial proteome of a unicellular eukaryote (Acanthamoeba castellanii, supergroup Amoebozoa) rivals that of animals, fungi, and plants.

PubMed

Gawryluk, Ryan M R; Chisholm, Kenneth A; Pinto, Devanand M; Gray, Michael W

2014-09-23

We present a combined proteomic and bioinformatic investigation of mitochondrial proteins from the amoeboid protist Acanthamoeba castellanii, the first such comprehensive investigation in a free-living member of the supergroup Amoebozoa. This protist was chosen both for its phylogenetic position (as a sister to animals and fungi) and its ecological ubiquity and physiological flexibility. We report 1033 A. castellanii mitochondrial protein sequences, 709 supported by mass spectrometry data (676 nucleus-encoded and 33 mitochondrion-encoded), including two previously unannotated mtDNA-encoded proteins, which we identify as highly divergent mitochondrial ribosomal proteins. Other notable findings include duplicate proteins for all of the enzymes of the tricarboxylic acid (TCA) cycle-which, along with the identification of a mitochondrial malate synthase-isocitrate lyase fusion protein, suggests the interesting possibility that the glyoxylate cycle operates in A. castellanii mitochondria. Additionally, the A. castellanii genome encodes an unusually high number (at least 29) of mitochondrion-targeted pentatricopeptide repeat (PPR) proteins, organellar RNA metabolism factors in other organisms. We discuss several key mitochondrial pathways, including DNA replication, transcription and translation, protein degradation, protein import and Fe-S cluster biosynthesis, highlighting similarities and differences in these pathways in other eukaryotes. In compositional and functional complexity, the mitochondrial proteome of A. castellanii rivals that of multicellular eukaryotes. Comprehensive proteomic surveys of mitochondria have been undertaken in a limited number of predominantly multicellular eukaryotes. This phylogenetically narrow perspective constrains and biases our insights into mitochondrial function and evolution, as it neglects protists, which account for most of the evolutionary and functional diversity within eukaryotes. 
We report here the first comprehensive investigation of the mitochondrial proteome in a member (A. castellanii) of the eukaryotic supergroup Amoebozoa. Through a combination of tandem mass spectrometry (MS/MS) and in silico data mining, we have retrieved 1033 candidate mitochondrial protein sequences, 709 having MS support. These data were used to reconstruct the metabolic pathways and protein complexes of A. castellanii mitochondria, and were integrated with data from other characterized mitochondrial proteomes to augment our understanding of mitochondrial proteome evolution. Our results demonstrate the power of combining direct proteomic and bioinformatic approaches in the discovery of novel mitochondrial proteins, both nucleus-encoded and mitochondrion-encoded, and highlight the compositional complexity of the A. castellanii mitochondrial proteome, which rivals that of animals, fungi and plants. Copyright © 2014 Elsevier B.V. All rights reserved.
pyQms enables universal and accurate quantification of mass spectrometry data.

PubMed

Leufken, Johannes; Niehues, Anna; Sarin, L Peter; Wessel, Florian; Hippler, Michael; Leidel, Sebastian A; Fufezan, Christian

2017-10-01

Quantitative mass spectrometry (MS) is a key technique in many research areas (1), including proteomics, metabolomics, glycomics, and lipidomics. Because all of the corresponding molecules can be described by chemical formulas, universal quantification tools are highly desirable. Here, we present pyQms, an open-source software for accurate quantification of all types of molecules measurable by MS. pyQms uses isotope pattern matching that offers an accurate quality assessment of all quantifications and the ability to directly incorporate mass spectrometer accuracy. pyQms is, due to its universal design, applicable to every research field, labeling strategy, and acquisition technique. This opens ultimate flexibility for researchers to design experiments employing innovative and hitherto unexplored labeling strategies. Importantly, pyQms performs very well to accurately quantify partially labeled proteomes in large scale and high throughput, the most challenging task for a quantification algorithm. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
Ribosome display: next-generation display technologies for production of antibodies in vitro.

PubMed

He, Mingyue; Khan, Farid

2005-06-01

Antibodies represent an important and growing class of biologic research reagents and biopharmaceutical products. They can be used as therapeutics in a variety of diseases. With the rapid expansion of proteomic studies and biomarker discovery, there is a need for the generation of highly specific binding reagents to study the vast number of proteins encoded by the genome. Display technologies provide powerful tools for obtaining antibodies. Aside from the preservation of natural antibody repertoires, they are capable of exploiting diversity by DNA recombination to create very large libraries for selection of novel molecules. In contrast to in vivo immunization processes, display technologies allow selection of antibodies under in vitro-defined selection condition(s), resulting in enrichment of antibodies with desired properties from large populations. In addition, in vitro selection enables the isolation of antibodies against difficult antigens including self-antigens, and this can be applied to the generation of human antibodies against human targets. Display technologies can also be combined with DNA mutagenesis for antibody evolution in vitro. Some methods are amenable to automation, permitting high-throughput generation of antibodies. Ribosome display is considered as representative of the next generation of display technologies since it overcomes the limitations of cell-based display methods by using a cell-free system, offering advantages of screening larger libraries and continuously expanding new diversity during selection. Production of display-derived antibodies can be achieved by choosing one of a variety of prokaryotic and eukaryotic cell-based expression systems. In the near future, cell-free protein synthesis may be developed as an alternative for large-scale generation of antibodies.
Serial isoelectric focusing as an effective and economic way to obtain maximal resolution and high-throughput in 2D-based comparative proteomics of scarce samples: proof-of-principle.

PubMed

Farhoud, Murtada H; Wessels, Hans J C T; Wevers, Ron A; van Engelen, Baziel G; van den Heuvel, Lambert P; Smeitink, Jan A

2005-01-01

In 2D-based comparative proteomics of scarce samples, such as limited patient material, established methods for prefractionation and subsequent use of different narrow range IPG strips to increase overall resolution are difficult to apply. Also, a high number of samples, a prerequisite for drawing meaningful conclusions when pathological and control samples are considered, will increase the associated amount of work almost exponentially. Here, we introduce a novel, effective, and economic method designed to obtain maximum 2D resolution while maintaining the high throughput necessary to perform large-scale comparative proteomics studies. The method is based on connecting different IPG strips serially head-to-tail so that a complete line of different IPG strips with sequential pH regions can be focused in the same experiment. We show that when 3 IPG strips (covering together the pH range of 3-11) are connected head-to-tail an optimal resolution is achieved along the whole pH range. Sample consumption, time required, and associated costs are reduced by almost 70%, and the workload is reduced significantly.
The Explorer of Diffuse Galactic Emission (EDGE): Determination of Large-Scale Structure Evolution from Measurement of the Anisotropy of the Cosmic Infrared Background

NASA Technical Reports Server (NTRS)

Silverberg, R. F.; Cheng, E. S.; Cottingham, D. A.; Fixsen, D. J.; Meyer, S. S.; Wilson, G. W.

2004-01-01

The formation of the first objects, stars and galaxies and their subsequent evolution remain a cosmological unknown. Few observational probes of these processes exist. The Cosmic Infrared Background (CIB) originates from this era, and can provide information to test models of both galaxy evolution and the growth of primordial structure. The Explorer of Diffuse Galactic Emission (EDGE) is a proposed balloon-borne mission designed to measure the spatial fluctuations in the CIB from 200 micrometers to 1 millimeter on 6' to 3 degree scales with 2 microKelvin sensitivity/resolution element. Such measurements would provide a sensitive probe of the large-scale variation in protogalaxy density at redshifts approximately 0.5-3. In this paper, we present the scientific justification for the mission and show a concept for the instrument and observations.
Cosmological Ohm's law and dynamics of non-minimal electromagnetism

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hollenstein, Lukas; Jain, Rajeev Kumar; Urban, Federico R., E-mail: lukas.hollenstein@cea.fr, E-mail: jain@cp3.dias.sdu.dk, E-mail: furban@ulb.ac.be

2013-01-01

The origin of large-scale magnetic fields in cosmic structures and the intergalactic medium is still poorly understood. We explore the effects of non-minimal couplings of electromagnetism on the cosmological evolution of currents and magnetic fields. In this context, we revisit the mildly non-linear plasma dynamics around recombination that are known to generate weak magnetic fields. We use the covariant approach to obtain a fully general and non-linear evolution equation for the plasma currents and derive a generalised Ohm law valid on large scales as well as in the presence of non-minimal couplings to cosmological (pseudo-)scalar fields. Due to the sizeable conductivity of the plasma and the stringent observational bounds on such couplings, we conclude that modifications of the standard (adiabatic) evolution of magnetic fields are severely limited in these scenarios. Even at scales well beyond a Mpc, any departure from flux freezing behaviour is inhibited.
The Role of Small-Scale Processes in Solar Active Region Decay

NASA Astrophysics Data System (ADS)

Meyer, Karen; Mackay, Duncan

2017-08-01

Active regions are locations of intense magnetic activity on the Sun, whose evolution can result in highly energetic eruptive phenomena such as solar flares and coronal mass ejections (CMEs). Therefore, fast and accurate simulation of their evolution and decay is essential in the prediction of Space Weather events. In this talk we present initial results from our new model for the photospheric evolution of active region magnetic fields. Observations show that small-scale processes appear to play a role in the dispersal and decay of solar active regions, for example through cancellation at the boundary of sunspot outflows and erosion of flux by surrounding convective cells. Our active region model is coupled to our existing model for the evolution of small-scale photospheric magnetic features. Focusing first on the active region decay phase, we consider the evolution of its magnetic field due to both large-scale (e.g. differential rotation) and small-scale processes, such as its interaction with surrounding small-scale magnetic features and convective flows. This project is funded by The Carnegie Trust for the Universities of Scotland, through their Research Incentives Grant scheme.
Chemical composition and the potential for proteomic transformation in cancer, hypoxia, and hyperosmotic stress

PubMed Central

2017-01-01

The changes of protein expression that are monitored in proteomic experiments are a type of biological transformation that also involves changes in chemical composition. Accompanying the myriad molecular-level interactions that underlie any proteomic transformation, there is an overall thermodynamic potential that is sensitive to microenvironmental conditions, including local oxidation and hydration potential. Here, up- and down-expressed proteins identified in 71 comparative proteomics studies were analyzed using the average oxidation state of carbon (ZC) and water demand per residue ($\overline{n}_{\mathrm{H_2O}}$), calculated using elemental abundances and stoichiometric reactions to form proteins from basis species. Experimental lowering of oxygen availability (hypoxia) or water activity (hyperosmotic stress) generally results in decreased ZC or $\overline{n}_{\mathrm{H_2O}}$ of up-expressed compared to down-expressed proteins. This correspondence of chemical composition with experimental conditions provides evidence for attraction of the proteomes to a low-energy state. An opposite compositional change, toward higher average oxidation or hydration state, is found for proteomic transformations in colorectal and pancreatic cancer, and in two experiments for adipose-derived stem cells.
Calculations of chemical affinity were used to estimate the thermodynamic potentials for proteomic transformations as a function of fugacity of O2 and activity of H2O, which serve as scales of oxidation and hydration potential. Diagrams summarizing the relative potential for formation of up- and down-expressed proteins have predicted equipotential lines that cluster around particular values of oxygen fugacity and water activity for similar datasets. The changes in chemical composition of proteomes are likely linked with reactions among other cellular molecules. A redox balance calculation indicates that an increase in the lipid to protein ratio in cancer cells by 20% over hypoxic cells would generate a large enough electron sink for oxidation of the cancer proteomes. The datasets and computer code used here are made available in a new R package, canprot. PMID:28603672
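As a rough illustration of how an average oxidation state of carbon can be obtained from elemental abundances alone, the sketch below uses the commonly cited relation ZC = (Z - h + 3n + 2o + 2s) / c for a species with formula C_c H_h N_n O_o S_s and net charge Z. The abstract does not spell out this formula, so treat it as an assumption; the function name and the example molecules are illustrative and are not taken from the canprot package.

```python
def average_carbon_oxidation_state(c, h, n, o, s, charge=0):
    """Return Z_C for a species C_c H_h N_n O_o S_s with the given net charge.

    Assumed relation (not stated in the abstract):
        Z_C = (charge - h + 3*n + 2*o + 2*s) / c
    """
    if c <= 0:
        raise ValueError("Z_C is undefined for a species without carbon")
    return (charge - h + 3 * n + 2 * o + 2 * s) / c


# Illustrative small molecules (hypothetical inputs, not study data).
glycine_zc = average_carbon_oxidation_state(c=2, h=5, n=1, o=2, s=0)  # C2H5NO2
alanine_zc = average_carbon_oxidation_state(c=3, h=7, n=1, o=2, s=0)  # C3H7NO2

print(glycine_zc)  # 1.0
print(alanine_zc)  # 0.0
```

For a protein, the same function could be applied to the summed elemental formula of its residues; a lower value indicates more reduced carbon.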
Imaging and Molecular Markers for Patients with Lung Cancer: Approaches with Molecular Targets, Complementary/Innovative Treatment, and Therapeutic Modalities

DTIC Science & Technology

2011-02-01

Thrombocytopenia, Grade 3 in 1 patient • Hypomagnesemia, Grade 3 in 1 patient • Hypokalemia, Grade 3 in 2 patients • Pneumonia, Grade 3 in 7 patients...urgently needed. While the molecular events involved in lung cancer pathogenesis are being unraveled by ongoing large scale genomics, proteomics, and...tumor initiation, progression and metastasis are an important first step leading to the development of new prognostic markers and targets for therapy
Impact phenomena as factors in the evolution of the Earth

NASA Technical Reports Server (NTRS)

Grieve, R. A. F.; Parmentier, E. M.

1984-01-01

It is estimated that 30 to 200 large impact basins could have been formed on the early Earth. These large impacts may have resulted in extensive volcanism and enhanced endogenic geologic activity over large areas. Initial modelling of the thermal and subsidence history of large terrestrial basins indicates that they created geologic and thermal anomalies which lasted for geologically significant times. The role of large-scale impact in the biological evolution of the Earth has been highlighted by the discovery of siderophile anomalies at the Cretaceous-Tertiary boundary and associated with North American microtektites. Although in neither case has an associated crater been identified, the observations are consistent with the deposition of projectile-contaminated high-speed ejecta from major impact events. Consideration of impact processes reveals a number of mechanisms by which large-scale impact may induce extinctions.
Thermosensitivity of growth is determined by chaperone-mediated proteome reallocation

PubMed Central

Chen, Ke; Gao, Ye; Mih, Nathan; O’Brien, Edward J.; Yang, Laurence; Palsson, Bernhard O.

2017-01-01

Maintenance of a properly folded proteome is critical for bacterial survival at notably different growth temperatures. Understanding the molecular basis of thermoadaptation has progressed in two main directions, the sequence and structural basis of protein thermostability and the mechanistic principles of protein quality control assisted by chaperones. Yet we do not fully understand how structural integrity of the entire proteome is maintained under stress and how it affects cellular fitness. To address this challenge, we reconstruct a genome-scale protein-folding network for Escherichia coli and formulate a computational model, FoldME, that provides statistical descriptions of multiscale cellular response consistent with many datasets. FoldME simulations show (i) that the chaperones act as a system when they respond to unfolding stress rather than achieving efficient folding of any single component of the proteome, (ii) how the proteome is globally balanced between chaperones for folding and the complex machinery synthesizing the proteins in response to perturbation, (iii) how this balancing determines growth rate dependence on temperature and is achieved through nonspecific regulation, and (iv) how thermal instability of the individual protein affects the overall functional state of the proteome. Overall, these results expand our view of cellular regulation, from targeted specific control mechanisms to global regulation through a web of nonspecific competing interactions that modulate the optimal reallocation of cellular resources. The methodology developed in this study enables genome-scale integration of environment-dependent protein properties and a proteome-wide study of cellular stress responses. PMID:29073085
Extreme diversity of scorpion venom peptides and proteins revealed by transcriptomic analysis: implication for proteome evolution of scorpion venom arsenal.

PubMed

Ma, Yibao; He, Yawen; Zhao, Ruiming; Wu, Yingliang; Li, Wenxin; Cao, Zhijian

2012-02-16

Venom is an important genetic development crucial to the survival of scorpions for over 400 million years. We studied the evolution of the scorpion venom arsenal by means of comparative transcriptome analysis of venom glands and phylogenetic analysis of shared types of venom peptides and proteins between buthids and euscorpiids. Fifteen types of venom peptides and proteins were sequenced during the venom gland transcriptome analyses of two Buthidae species (Lychas mucronatus and Isometrus maculatus) and one Euscorpiidae species (Scorpiops margerisonae). Great diversity has been observed in translated amino acid sequences of these transcripts for venom peptides and proteins. Seven types of venom peptides and proteins were shared between buthids and euscorpiids. Molecular phylogenetic analysis revealed that at least five of the seven common types of venom peptides and proteins were likely recruited into the scorpion venom proteome before the lineage split between Buthidae and Euscorpiidae with their corresponding genes undergoing individual or multiple gene duplication events. These are α-KTxs, βKSPNs (β-KTxs and scorpines), anionic peptides, La1-like peptides, and SPSVs (serine proteases from scorpion venom). Multiple types of venom peptides and proteins were demonstrated to be continuously recruited into the venom proteome during the evolution process of individual scorpion lineages. Our results provide an insight into the recruitment pattern of the scorpion venom arsenal for the first time. Copyright © 2011 Elsevier B.V. All rights reserved.

Current Progress in Tonoplast Proteomics Reveals Insights into the Function of the Large Central Vacuole

PubMed Central

Trentmann, Oliver; Haferkamp, Ilka

2013-01-01

Vacuoles of plants fulfill various biologically important functions, like turgor generation and maintenance, detoxification, solute sequestration, or protein storage. Different types of plant vacuoles (lytic versus protein storage) are characterized by different functional properties apparently caused by a different composition/abundance and regulation of transport proteins in the surrounding membrane, the tonoplast. Proteome analyses allow the identification of vacuolar proteins and provide an informative basis for assigning observed transport processes to specific carriers or channels. This review summarizes techniques required for vacuolar proteome analyses, e.g., isolation of the large central vacuole or tonoplast membrane purification. Moreover, an overview of diverse published vacuolar proteome studies is provided. It becomes evident that qualitative proteomes from different plant species represent just the tip of the iceberg. During the past few years, mass spectrometry has improved immensely in accuracy, sensitivity, and range of application. As a consequence, modern tonoplast proteome approaches are suited for detecting alterations in membrane protein abundance in response to changing environmental/physiological conditions and help to clarify the regulation of tonoplast transport processes. PMID:23459586
Proteomics of gliomas: Initial biomarker discovery and evolution of technology

PubMed Central

Kalinina, Juliya; Peng, Junmin; Ritchie, James C.; Van Meir, Erwin G.

2011-01-01

Gliomas are a group of aggressive brain tumors that diffusely infiltrate adjacent brain tissues, rendering them largely incurable, even with multiple treatment modalities and agents. Mostly asymptomatic at early stages, they present in several subtypes with astrocytic or oligodendrocytic features and invariably progress to malignant forms. Gliomas are difficult to classify precisely because of interobserver variability during histopathologic grading. Identifying biological signatures of each glioma subtype through protein biomarker profiling of tumor or tumor-proximal fluids is therefore of high priority. Such profiling not only may provide clues regarding tumor classification but may identify clinical biomarkers and pathologic targets for the development of personalized treatments. In the past decade, differential proteomic profiling techniques have utilized tumor, cerebrospinal fluid, and plasma from glioma patients to identify the first candidate diagnostic, prognostic, predictive, and therapeutic response markers, highlighting the potential for glioma biomarker discovery. The number of markers identified, however, has been limited, their reproducibility between studies is unclear, and none have been validated for clinical use. Recent technological advancements in methodologies for high-throughput profiling, which provide easy access, rapid screening, low sample consumption, and accurate protein identification, are anticipated to accelerate brain tumor biomarker discovery. Reliable tools for biomarker verification forecast translation of the biomarkers into clinical diagnostics in the foreseeable future. Herein we update the reader on the recent trends and directions in glioma proteomics, including key findings and established and emerging technologies for analysis, together with challenges we are still facing in identifying and verifying potential glioma biomarkers. PMID:21852429
New Genes and Functional Innovation in Mammals.

PubMed

Luis Villanueva-Cañas, José; Ruiz-Orera, Jorge; Agea, M Isabel; Gallo, Maria; Andreu, David; Albà, M Mar

2017-07-01

The birth of genes that encode new protein sequences is a major source of evolutionary innovation. However, we still understand relatively little about how these genes come into being and which functions they are selected for. To address these questions, we have obtained a large collection of mammalian-specific gene families that lack homologues in other eukaryotic groups. We have combined gene annotations and de novo transcript assemblies from 30 different mammalian species, obtaining ∼6,000 gene families. In general, the proteins in mammalian-specific gene families tend to be short and depleted in aromatic and negatively charged residues. Proteins which arose early in mammalian evolution include milk and skin polypeptides, immune response components, and proteins involved in reproduction. In contrast, the functions of proteins which have a more recent origin remain largely unknown, despite the fact that these proteins also have extensive proteomics support. We identify several previously described cases of genes originated de novo from noncoding genomic regions, supporting the idea that this mechanism frequently underlies the evolution of new protein-coding genes in mammals. Finally, we show that most young mammalian genes are preferentially expressed in testis, suggesting that sexual selection plays an important role in the emergence of new functional genes. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
The Cosmic Century

NASA Astrophysics Data System (ADS)

Longair, Malcolm S.

2013-04-01

Part I. Stars and Stellar Evolution up to the Second World War: 1. The legacy of the nineteenth century; 2. The classification of stellar spectra; 3. Stellar structure and evolution; 4. The end points of stellar evolution; Part II. The Large-Scale Structure of the Universe, 1900-1939: 5. The Galaxy and the nature of spiral nebulae; 6. The origins of astrophysical cosmology; Part III. The Opening up of the Electromagnetic Spectrum: 7. The opening up of the electromagnetic spectrum and the new astronomies; Part IV. The Astrophysics of Stars and Galaxies since 1945: 8. Stars and stellar evolution; 9. The physics of the interstellar medium; 10. The physics of galaxies and clusters of galaxies; 11. High-energy astrophysics; Part V. Astrophysical Cosmology since 1945: 12. Astrophysical cosmology; 13. The determination of cosmological parameters; 14. The evolution of galaxies and active galaxies with cosmic epoch; 15. The origin of galaxies and the large-scale structure of the Universe; 16. The very early Universe; References; Name index; Object index; Subject index.
Large-scale proteome analysis of abscisic acid and ABSCISIC ACID INSENSITIVE3-dependent proteins related to desiccation tolerance in Physcomitrella patens

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yotsui, Izumi, E-mail: izumi.yotsui@riken.jp; Serada, Satoshi, E-mail: serada@nibiohn.go.jp; Naka, Tetsuji, E-mail: tnaka@nibiohn.go.jp

2016-03-18

Desiccation tolerance is an ancestral feature of land plants and is still retained in non-vascular plants such as bryophytes and some vascular plants. However, except for seeds and spores, this trait is absent in vegetative tissues of vascular plants. Although many studies have focused on understanding the molecular basis underlying desiccation tolerance using transcriptome and proteome approaches, the critical molecular differences between desiccation tolerant plants and non-desiccation plants are still not clear. The moss Physcomitrella patens cannot survive rapid desiccation under laboratory conditions, but if cells of the protonemata are treated by the phytohormone abscisic acid (ABA) prior to desiccation, it can survive 24 h exposure to desiccation and regrow after rehydration. The desiccation tolerance induced by ABA (AiDT) is specific to this hormone, but also depends on a plant transcription factor ABSCISIC ACID INSENSITIVE3 (ABI3). Here we report the comparative proteomic analysis of AiDT between wild type and ABI3 deleted mutant (Δabi3) of P. patens using iTRAQ (Isobaric Tags for Relative and Absolute Quantification). From a total of 1980 unique proteins that we identified, only 16 proteins are significantly altered in Δabi3 compared to wild type after desiccation following ABA treatment. Among this group, three of the four proteins that were severely affected in Δabi3 tissue were Arabidopsis orthologous genes, which were expressed in maturing seeds under the regulation of ABI3. These included a Group 1 late embryogenesis abundant (LEA) protein, a short-chain dehydrogenase, and a desiccation-related protein. Our results suggest that at least three of these proteins expressed in desiccation tolerant cells of both Arabidopsis and the moss are very likely to play important roles in acquisition of desiccation tolerance in land plants.
Furthermore, our results suggest that the regulatory machinery of ABA- and ABI3-mediated gene expression for desiccation tolerance might have evolved in ancestral land plants before the separation of bryophytes and vascular plants. - Highlights: • Large-scale proteomics highlighted proteins related to plant desiccation tolerance. • The proteins were regulated by both the phytohormone ABA and ABI3. • The proteins accumulated in desiccation tolerant cells of both Arabidopsis and moss. • Evolutionary origin of regulatory machinery for desiccation tolerance is proposed.
Evaluation of a genome-scale in silico metabolic model for Geobacter metallireducens by using proteomic data from a field biostimulation experiment.

PubMed

Fang, Yilin; Wilkins, Michael J; Yabusaki, Steven B; Lipton, Mary S; Long, Philip E

2012-12-01

Accurately predicting the interactions between microbial metabolism and the physical subsurface environment is necessary to enhance subsurface energy development, soil and groundwater cleanup, and carbon management. This study was an initial attempt to confirm the metabolic functional roles within an in silico model using environmental proteomic data collected during field experiments. Shotgun global proteomics data collected during a subsurface biostimulation experiment were used to validate a genome-scale metabolic model of Geobacter metallireducens, specifically the ability of the metabolic model to predict metal reduction, biomass yield, and growth rate under dynamic field conditions. The constraint-based in silico model of G. metallireducens relates an annotated genome sequence to the physiological functions with 697 reactions controlled by 747 enzyme-coding genes. Proteomic analysis showed that 180 of the 637 G. metallireducens proteins detected during the 2008 experiment were associated with specific metabolic reactions in the in silico model. When the field-calibrated Fe(III) terminal electron acceptor process reaction in a reactive transport model for the field experiments was replaced with the genome-scale model, the model predicted that the largest metabolic fluxes through the in silico model reactions generally correspond to the highest abundances of proteins that catalyze those reactions. Central metabolism predicted by the model agrees well with protein abundance profiles inferred from proteomic analysis. Model discrepancies with the proteomic data, such as the relatively low abundances of proteins associated with amino acid transport and metabolism, revealed pathways or flux constraints in the in silico model that could be updated to more accurately predict metabolic processes that occur in the subsurface environment.
Nano-LC FTICR tandem mass spectrometry for top-down proteomics: routine baseline unit mass resolution of whole cell lysate proteins up to 72 kDa.

PubMed

Tipton, Jeremiah D; Tran, John C; Catherman, Adam D; Ahlf, Dorothy R; Durbin, Kenneth R; Lee, Ji Eun; Kellie, John F; Kelleher, Neil L; Hendrickson, Christopher L; Marshall, Alan G

2012-03-06

Current high-throughput top-down proteomic platforms provide routine identification of proteins less than 25 kDa with 4-D separations. This short communication reports the application of technological developments over the past few years that improve protein identification and characterization for masses greater than 25 kDa. Advances in separation science have allowed increased numbers of proteins to be identified, especially by nanoliquid chromatography (nLC) prior to mass spectrometry (MS) analysis. Further, a goal of high-throughput top-down proteomics is to extend the mass range for routine nLC MS analysis up to 80 kDa because gene sequence analysis predicts that ~70% of the human proteome is transcribed to be less than 80 kDa. Normally, large proteins greater than 50 kDa are identified and characterized by top-down proteomics through fraction collection and direct infusion at relatively low throughput. Further, other MS-based techniques provide top-down protein characterization, however at low resolution for intact mass measurement. Here, we present analysis of standard (up to 78 kDa) and whole cell lysate proteins by Fourier transform ion cyclotron resonance mass spectrometry (nLC electrospray ionization (ESI) FTICR MS). The separation platform reduced the complexity of the protein matrix so that, at 14.5 T, proteins from whole cell lysate up to 72 kDa are baseline mass resolved on a nano-LC chromatographic time scale. Further, the results document routine identification of proteins at improved throughput based on accurate mass measurement (less than 10 ppm mass error) of precursor and fragment ions for proteins up to 50 kDa.
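The 10 ppm criterion quoted above can be made concrete with a short calculation. The sketch below uses the standard definition of relative mass error in parts per million; the function names and the example masses are illustrative assumptions and do not come from the study.

```python
def ppm_error(observed_mass, theoretical_mass):
    """Signed relative mass error in parts per million (ppm)."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def within_tolerance(observed_mass, theoretical_mass, tol_ppm=10.0):
    """True if the absolute mass error is below tol_ppm."""
    return abs(ppm_error(observed_mass, theoretical_mass)) < tol_ppm


# Hypothetical 50 kDa protein measured 0.4 Da too heavy: about 8 ppm.
print(round(ppm_error(50000.4, 50000.0), 3))
print(within_tolerance(50000.4, 50000.0))
```

At 50 kDa, a 10 ppm window corresponds to only 0.5 Da, which indicates why high resolving power matters for intact proteins in this mass range.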
Systematic Analysis of Compositional Order of Proteins Reveals New Characteristics of Biological Functions and a Universal Correlate of Macroevolution

PubMed Central

Persi, Erez; Horn, David

2013-01-01

We present a novel analysis of compositional order (CO) based on the occurrence of Frequent amino-acid Triplets (FTs) that appear much more than random in protein sequences. The method captures all types of proteomic compositional order including single amino-acid runs, tandem repeats, periodic structure of motifs and otherwise low complexity amino-acid regions. We introduce new order measures, distinguishing between ‘regularity’, ‘periodicity’ and ‘vocabulary’, to quantify these phenomena and to facilitate the identification of evolutionary effects. Detailed analysis of representative species across the tree-of-life demonstrates that CO proteins exhibit numerous functional enrichments, including a wide repertoire of particular patterns of dependencies on regularity and periodicity. Comparison between human and mouse proteomes further reveals the interplay of CO with evolutionary trends, such as faster substitution rate in mouse leading to decrease of periodicity, while innovation along the human lineage leads to larger regularity. Large-scale analysis of 94 proteomes leads to systematic ordering of all major taxonomic groups according to FT-vocabulary size. This is measured by the count of Different Frequent Triplets (DFT) in proteomes. The latter provides a clear hierarchical delineation of vertebrates, invertebrates, plants, fungi and prokaryotes, with thermophiles showing the lowest level of FT-vocabulary. Among eukaryotes, this ordering correlates with phylogenetic proximity. Interestingly, in all kingdoms CO accumulation in the proteome has universal characteristics. We suggest that CO is a genomic-information correlate of both macroevolution and various protein functions. The results indicate a mechanism of genomic ‘innovation’ at the peptide level, involved in protein elongation, shaped in a universal manner by mutational and selective forces. PMID:24278003
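The counting step behind a triplet-based analysis can be sketched briefly. The code below tallies overlapping amino-acid triplets and keeps those reaching a minimum count. The abstract defines frequent triplets relative to random expectation and does not give the statistical criterion, so the fixed `min_count` threshold, the function names, and the toy sequences are hypothetical stand-ins.

```python
from collections import Counter


def count_triplets(sequence):
    """Count overlapping amino-acid triplets in one protein sequence."""
    seq = sequence.upper()
    return Counter(seq[i:i + 3] for i in range(len(seq) - 2))


def frequent_triplets(sequences, min_count):
    """Return triplets whose total count across sequences is >= min_count.

    min_count is a placeholder for the (unspecified) significance test
    against random expectation used in the original analysis.
    """
    totals = Counter()
    for seq in sequences:
        totals.update(count_triplets(seq))
    return {triplet: k for triplet, k in totals.items() if k >= min_count}


# Toy input: a single amino-acid run and a shorter run in a second sequence.
print(frequent_triplets(["QQQQQ", "AQQQA"], min_count=3))  # {'QQQ': 4}
```

The number of distinct keys returned for a whole proteome would be loosely analogous to the vocabulary size (DFT) described above, under the stated simplification.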
A Comprehensive Proteomics Analysis of the Human Iris Tissue: Ready to Embrace Postgenomics Precision Medicine in Ophthalmology?

PubMed

Murthy, Krishna R; Dammalli, Manjunath; Pinto, Sneha M; Murthy, Kalpana Babu; Nirujogi, Raja Sekhar; Madugundu, Anil K; Dey, Gourav; Subbannayya, Yashwanth; Mishra, Uttam Kumar; Nair, Bipin; Gowda, Harsha; Prasad, T S Keshava

2016-09-01

The annual economic burden of visual disorders in the United States was estimated at $139 billion. Ophthalmology is therefore one of the salient application fields of postgenomics biotechnologies such as proteomics in the pursuit of global precision medicine. Interestingly, the protein composition of the human iris tissue still remains largely unexplored. In this context, the uveal tract constitutes the vascular middle coat of the eye and is formed by the choroid, ciliary body, and iris. The iris forms the anterior most part of the uvea. It is a thin muscular diaphragm with a central perforation called pupil. Inflammation of the uvea is termed uveitis and causes reduced vision or blindness. However, the pathogenesis of the spectrum of diseases causing uveitis is still not very well understood. We investigated the proteome of the iris tissue harvested from healthy donor eyes that were enucleated within 6 h of death using high-resolution Fourier transform mass spectrometry. A total of 4959 nonredundant proteins were identified in the human iris, which included proteins involved in signaling, cell communication, metabolism, immune response, and transport. This study is the first attempt to comprehensively profile the global proteome of the human iris tissue and, thus, offers the potential to facilitate biomedical research into pathological diseases of the uvea such as Behcet's disease, Vogt Koyonagi Harada's disease, and juvenile rheumatoid arthritis. Finally, we make a call to the broader visual health and ophthalmology community that proteomics offers a veritable prospect to obtain a systems scale, functional, and dynamic picture of the eye tissue in health and disease. This knowledge is ultimately pertinent for precision medicine diagnostics and therapeutics innovation to address the pressing needs of the 21st century visual health.
Proteome Analysis of Liver Cells Expressing a Full-Length Hepatitis C Virus (HCV) Replicon and Biopsy Specimens of Posttransplantation Liver from HCV-Infected Patients

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jacobs, Jon M.; Diamond, Deborah L.; Chan, Eric Y.

2005-06-01

The development of a reproducible model system for the study of Hepatitis C virus (HCV) infection has the potential to significantly enhance the study of virus-host interactions and provide future direction for modeling the pathogenesis of HCV. While there are studies describing global gene expression changes associated with HCV infection, changes in the proteome have not been characterized. We report the first large scale proteome analysis of the highly permissive Huh-7.5 cell line containing a full length HCV replicon. We detected > 4,400 proteins in this cell line, including HCV replicon proteins, using multidimensional liquid chromatographic (LC) separations coupled to mass spectrometry (MS). The set of Huh-7.5 proteins confidently identified is, to our knowledge, the most comprehensive yet reported for a human cell line. Consistent with the literature, a comparison of Huh-7.5 cells (+) and (-) the HCV replicon identified expression changes of proteins involved in lipid metabolism. We extended these analyses to liver biopsy material from HCV-infected patients where > 1,500 proteins were detected from 2 µg protein lysate using the Huh-7.5 protein database and the accurate mass and time (AMT) tag strategy. These findings demonstrate the utility of multidimensional proteome analysis of the HCV replicon model system for assisting the determination of proteins/pathways affected by HCV infection. Our ability to extend these analyses to the highly complex proteome of small liver biopsies with limiting protein yields offers the unique opportunity to begin evaluating the clinical significance of protein expression changes associated with HCV infection.
Analysis of passive scalar advection in parallel shear flows: Sorting of modes at intermediate time scales

NASA Astrophysics Data System (ADS)

Camassa, Roberto; McLaughlin, Richard M.; Viotti, Claudio

2010-11-01

The time evolution of a passive scalar advected by parallel shear flows is studied for a class of rapidly varying initial data. Such situations are of practical importance in a wide range of applications from microfluidics to geophysics. In these contexts, it is well-known that the long-time evolution of the tracer concentration is governed by Taylor's asymptotic theory of dispersion. In contrast, we focus here on the evolution of the tracer at intermediate time scales. We show how intermediate regimes can be identified before Taylor's, and in particular, how the Taylor regime can be delayed indefinitely by properly manufactured initial data. A complete characterization of the sorting of these time scales and their associated spatial structures is presented. These analytical predictions are compared with highly resolved numerical simulations. Specifically, this comparison is carried out for the case of periodic variations in the streamwise direction on the short scale with envelope modulations on the long scales, and shows how this structure can lead to "anomalously" diffusive transients in the evolution of the scalar onto the ultimate regime governed by Taylor dispersion. Mathematically, the occurrence of these transients can be viewed as a competition in the asymptotic dominance between large Péclet (Pe) numbers and the long/short scale aspect ratios (L_Vel/L_Tracer ≡ k), two independent nondimensional parameters of the problem. We provide analytical predictions of the associated time scales by a modal analysis of the eigenvalue problem arising in the separation of variables of the governing advection-diffusion equation. The anomalous time scale in the asymptotic limit of large k Pe is derived for the short scale periodic structure of the scalar's initial data, for both exactly solvable cases and in general with WKBJ analysis.
In particular, the exactly solvable sawtooth flow is especially important in that it provides a short cut to the exact solution to the eigenvalue problem for the physically relevant vanishing Neumann boundary conditions in linear-shear channel flow. We show that the life of the corresponding modes at large Pe for this case is shorter than the ones arising from shear free zones in the fluid's interior. A WKBJ study of the latter modes provides a longer intermediate time evolution. This part of the analysis is technical, as the corresponding spectrum is dominated by asymptotically coalescing turning points in the limit of large Pe numbers. When large scale initial data components are present, the transient regime of the WKBJ (anomalous) modes evolves into one governed by Taylor dispersion. This is studied by a regular perturbation expansion of the spectrum in the small wavenumber regimes.
[ProteoCat: a tool for planning of proteomic experiments].

PubMed

Skvortsov, V S; Alekseychuk, N N; Khudyakov, D V; Mikurova, A V; Rybina, A V; Novikova, S E; Tikhonova, O V

2015-01-01

ProteoCat is a computer program designed to help researchers plan large-scale proteomic experiments. The central part of the program is a hydrolysis simulation subprogram that supports 4 proteases (trypsin, lysine C, endoproteinases AspN and GluC). For the peptides obtained after virtual hydrolysis or loaded from a data file, a number of properties important in mass-spectrometric experiments can be calculated or predicted. The data can be analyzed or filtered to reduce the set of peptides. The program uses new and improved modifications of our methods developed to predict pI and the probability of peptide detection; pI can also be predicted for a number of popular pKa scales proposed by other investigators. The algorithm for prediction of peptide retention time was implemented similarly to the algorithm used in the program SSRCalc. ProteoCat can estimate the coverage of amino acid sequences of proteins under defined limitations on peptide detection, as well as the possibility of assembling peptide fragments with a user-defined size of "sticky" ends. The program has a graphical user interface, is written in Java, and is available at http://www.ibmc.msk.ru/LPCIT/ProteoCat.
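A virtual hydrolysis step of the kind described above can be sketched for trypsin. The code below applies the commonly used rule that trypsin cleaves on the C-terminal side of lysine (K) or arginine (R) unless the next residue is proline (P). ProteoCat's actual cleavage rules are not given in the abstract, so this rule, the `missed_cleavages` parameter, and the example sequence are assumptions for illustration only.

```python
def trypsin_digest(sequence, missed_cleavages=0):
    """Return peptides from an in silico tryptic digest.

    Assumed rule: cut after K or R, except when followed by P.
    Peptides spanning up to `missed_cleavages` skipped sites are included.
    """
    seq = sequence.upper()
    if not seq:
        return []
    # Positions (exclusive end indices) at which the chain is cut.
    cut_points = [i + 1 for i in range(len(seq) - 1)
                  if seq[i] in "KR" and seq[i + 1] != "P"]
    bounds = [0] + cut_points + [len(seq)]
    peptides = []
    for start in range(len(bounds) - 1):
        for extra in range(missed_cleavages + 1):
            end = start + 1 + extra
            if end >= len(bounds):
                break
            peptides.append(seq[bounds[start]:bounds[end]])
    return peptides


# Hypothetical sequence; the R before P is not cleaved.
print(trypsin_digest("AAKRPGGRTTK"))  # ['AAK', 'RPGGR', 'TTK']
```

Filtering the resulting peptides by length or predicted properties would then correspond to the reduction step mentioned in the abstract.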
Demonstrating the feasibility of large-scale development of standardized assays to quantify human proteins

PubMed Central

Kennedy, Jacob J.; Abbatiello, Susan E.; Kim, Kyunggon; Yan, Ping; Whiteaker, Jeffrey R.; Lin, Chenwei; Kim, Jun Seok; Zhang, Yuzheng; Wang, Xianlong; Ivey, Richard G.; Zhao, Lei; Min, Hophil; Lee, Youngju; Yu, Myeong-Hee; Yang, Eun Gyeong; Lee, Cheolju; Wang, Pei; Rodriguez, Henry; Kim, Youngsoo; Carr, Steven A.; Paulovich, Amanda G.

2014-01-01

The successful application of MRM in biological specimens raises the exciting possibility that assays can be configured to measure all human proteins, resulting in an assay resource that would promote advances in biomedical research. We report the results of a pilot study designed to test the feasibility of a large-scale, international effort in MRM assay generation. We have configured, validated across three laboratories, and made publicly available as a resource to the community 645 novel MRM assays representing 319 proteins expressed in human breast cancer. Assays were multiplexed in groups of >150 peptides and deployed to quantify endogenous analyte in a panel of breast cancer-related cell lines. Median assay precision was 5.4%, with high inter-laboratory correlation (R² > 0.96). Peptide measurements in breast cancer cell lines were able to discriminate amongst molecular subtypes and identify genome-driven changes in the cancer proteome. These results establish the feasibility of a scaled, international effort. PMID:24317253
Global proteomics profiling improves drug sensitivity prediction: results from a multi-omics, pan-cancer modeling approach.

PubMed

Ali, Mehreen; Khan, Suleiman A; Wennerberg, Krister; Aittokallio, Tero

2018-04-15

Proteomics profiling is increasingly being used for molecular stratification of cancer patients and cell-line panels. However, systematic assessment of the predictive power of large-scale proteomic technologies across various drug classes and cancer types is currently lacking. To that end, we carried out the first pan-cancer, multi-omics comparative analysis of the relative performance of two proteomic technologies, targeted reverse phase protein array (RPPA) and global mass spectrometry (MS), in terms of their accuracy for predicting the sensitivity of cancer cells to both cytotoxic chemotherapeutics and molecularly targeted anticancer compounds. Our results in two cell-line panels demonstrate how MS profiling improves drug response predictions beyond that of the RPPA or the other omics profiles when used alone. However, frequent missing MS data values complicate its use in predictive modeling and required additional filtering, such as focusing on completely measured or known oncoproteins, to obtain maximal predictive performance. Rather strikingly, the two proteomics profiles provided complementary predictive signal both for the cytotoxic and targeted compounds. Further, information about the cellular-abundance of primary target proteins was found critical for predicting the response of targeted compounds, although the non-target features also contributed significantly to the predictive power. The clinical relevance of the selected protein markers was confirmed in cancer patient data. These results provide novel insights into the relative performance and optimal use of the widely applied proteomic technologies, MS and RPPA, which should prove useful in translational applications, such as defining the best combination of omics technologies and marker panels for understanding and predicting drug sensitivities in cancer patients. Processed datasets, R as well as Matlab implementations of the methods are available at https://github.com/mehr-een/bemkl-rbps. 
Contact: mehreen.ali@helsinki.fi or tero.aittokallio@fimm.fi. Supplementary data are available at Bioinformatics online.
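The missing-value filtering mentioned in this record (for example, restricting the analysis to completely measured proteins) can be sketched as follows. This is an illustrative assumption, not the authors' implementation, which is available at the repository cited in the abstract; the function name, the threshold parameter, and the protein identifiers are hypothetical.

```python
def filter_by_completeness(profiles, max_missing_fraction=0.0):
    """Keep proteins whose fraction of missing values (None) does not exceed the threshold.

    profiles: dict mapping a protein identifier to a list of abundance values
    across cell lines, with None marking a missing measurement.
    """
    kept = {}
    for protein, values in profiles.items():
        missing = sum(v is None for v in values)
        if values and missing / len(values) <= max_missing_fraction:
            kept[protein] = values
    return kept
```

With the default threshold of 0.0 only completely measured proteins survive; a looser threshold trades completeness for a larger feature set.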
Unifying expression scale for peptide hydrophobicity in proteomic reversed phase high-pressure liquid chromatography experiments.

PubMed

Grigoryan, Marine; Shamshurin, Dmitry; Spicer, Victor; Krokhin, Oleg V

2013-11-19

As an initial step in our efforts to unify the expression of peptide retention times in proteomic liquid chromatography-mass spectrometry (LC-MS) experiments, we aligned the chromatographic properties of a number of peptide retention standards against a collection of peptides commonly observed in proteomic experiments. The standard peptide mixtures and tryptic digests of samples of different origins were separated under the identical chromatographic condition most commonly employed in proteomics: 100 Å C18 sorbent with 0.1% formic acid as an ion-pairing modifier. Following our original approach (Krokhin, O. V.; Spicer, V. Anal. Chem. 2009, 81, 9522-9530) the retention characteristics of these standards and collection of tryptic peptides were mapped into hydrophobicity index (HI) or acetonitrile percentage units. This scale allows for direct visualization of the chromatographic outcome of LC-MS acquisitions, monitors the performance of the gradient LC system, and simplifies method development and interlaboratory data alignment. Wide adoption of this approach would significantly aid understanding the basic principles of gradient peptide RP-HPLC and solidify our collective efforts in acquiring confident peptide retention libraries, a key component in the development of targeted proteomic approaches.
Mantle flow influence on subduction evolution

NASA Astrophysics Data System (ADS)

Chertova, Maria V.; Spakman, Wim; Steinberger, Bernhard

2018-05-01

The impact of remotely forced mantle flow on regional subduction evolution is largely unexplored. Here we investigate this by means of 3D thermo-mechanical numerical modeling using a regional modeling domain. We start with simplified models consisting of a 600 km (or 1400 km) wide subducting plate surrounded by other plates. Mantle inflow of ∼3 cm/yr is prescribed during 25 Myr of slab evolution on a subset of the domain boundaries while the other side boundaries are open. Our experiments show that the influence of imposed mantle flow on subduction evolution is the least for trench-perpendicular mantle inflow from either the back or front of the slab leading to 10-50 km changes in slab morphology and trench position while no strong slab dip changes were observed, as compared to a reference model with no imposed mantle inflow. In experiments with trench-oblique mantle inflow we notice larger effects of slab bending and slab translation of the order of 100-200 km. Lastly, we investigate how subduction in the western Mediterranean region is influenced by remotely excited mantle flow that is computed by back-advection of a temperature and density model scaled from a global seismic tomography model. After 35 Myr of subduction evolution we find 10-50 km changes in slab position and slab morphology and a slight change in overall slab tilt. Our study shows that remotely forced mantle flow leads to secondary effects on slab evolution as compared to slab buoyancy and plate motion. Still these secondary effects occur on scales, 10-50 km, typical for the large-scale deformation of the overlying crust and thus may still be of large importance for understanding geological evolution.
Evolution of IPv6 Internet topology with unusual sudden changes

NASA Astrophysics Data System (ADS)

Ai, Jun; Zhao, Hai; Kathleen, M. Carley; Su, Zhan; Li, Hui

2013-07-01

The evolution of Internet topology is not always smooth; it sometimes undergoes unusual sudden changes. Consequently, identifying patterns of unusual topology evolution is critical for Internet topology modeling and simulation. We analyze IPv6 Internet topology evolution in the IP-level graph to demonstrate how it changes in uncommon ways to restructure the Internet. After evaluating the changes in average degree, average path length, and other metrics over time, we find that during large-scale growth the Internet becomes more robust, whereas during a top-bottom connection enhancement the Internet maintains its efficiency while the number of links decreases substantially.
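Two of the metrics named in this record, average degree and average path length, can be computed for a small undirected graph as in the sketch below. It is a generic illustration under stated assumptions (an adjacency dictionary of sets, unweighted links, averaging only over reachable node pairs) and is not the authors' measurement pipeline.

```python
from collections import deque

def average_degree(adj):
    """Mean number of neighbors per node in an undirected graph given as {node: set_of_neighbors}."""
    return sum(len(neighbors) for neighbors in adj.values()) / len(adj)

def average_path_length(adj):
    """Mean shortest-path length in hops over all ordered pairs of distinct, mutually reachable nodes."""
    total, pairs = 0, 0
    for source in adj:
        # Breadth-first search gives shortest hop counts from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adj[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs  # assumes the graph has at least one link
```

Tracking both values across successive topology snapshots is one way to spot the kind of sudden change the abstract describes: growth changes the average degree, while efficiency is reflected in the average path length.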
Advanced proteomic liquid chromatography

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xie, Fang; Smith, Richard D.; Shen, Yufeng

2012-10-26

Liquid chromatography coupled with mass spectrometry is the predominant platform used to analyze proteomics samples consisting of large numbers of proteins and their proteolytic products (e.g., truncated polypeptides) and spanning a wide range of relative concentrations. This review provides an overview of advanced capillary liquid chromatography techniques and methodologies that greatly improve separation resolving power and proteomics analysis coverage, sensitivity, and throughput.
Investigation of the Large Scale Evolution and Topology of Coronal Mass Ejections in the Solar Wind

NASA Technical Reports Server (NTRS)

Riley, Peter

1999-01-01

This investigation is concerned with the large-scale evolution and topology of Coronal Mass Ejections (CMEs) in the solar wind. During this reporting period we have analyzed a series of low density intervals in the ACE (Advanced Composition Explorer) plasma data set that bear many similarities to CMEs. We have begun a series of 3D, MHD (Magnetohydrodynamics) coronal models to probe potential causes of these events. We also edited two manuscripts concerning the properties of CMEs in the solar wind. One was re-submitted to the Journal of Geophysical Research.
An automated method for detecting alternatively spliced protein domains.

PubMed

Coelho, Vitor; Sammeth, Michael

2018-06-01

Alternative splicing (AS) has been demonstrated to play a role in shaping eukaryotic gene diversity at the transcriptional level. However, the impact of AS on the proteome is still controversial. Studies that seek to explore the effect of AS at the proteomic level are hampered by technical difficulties in the cumbersome process of converting back and forth between genome, transcriptome and proteome space coordinates, and the naïve prediction of protein domains in the presence of AS suffers from many redundant sequence scans that arise from constitutively spliced regions shared between alternative products of a gene. We developed the AstaFunk pipeline, which computes for every generic transcriptome all domains that are altered by AS events in a systematic and efficient manner. In a nutshell, our method employs Viterbi dynamic programming, which is guaranteed to find all score-optimal hits of the domains under consideration, while complementary optimisations at different levels avoid redundant and other irrelevant computations. We evaluate AstaFunk qualitatively and quantitatively using RNAseq in well-studied genes with AS, and at large scale using entire transcriptomes. Our study confirms complementary reports that the effect of most AS events on the proteome seems to be rather limited, but our results also pinpoint several cases where AS could have a major impact on the function of a protein domain. The Java implementation of AstaFunk is available as an open source project at http://astafunk.sammeth.net. Contact: micha@sammeth.net. Supplementary data are available at Bioinformatics online.

Epigenetics and Proteomics Join Transcriptomics in the Quest for Tuberculosis Biomarkers

PubMed Central

Esterhuyse, Maria M.; Weiner, January; Caron, Etienne; Loxton, Andre G.; Iannaccone, Marco; Wagman, Chandre; Saikali, Philippe; Stanley, Kim; Wolski, Witold E.; Mollenkopf, Hans-Joachim; Schick, Matthias; Aebersold, Ruedi; Linhart, Heinz; Walzl, Gerhard

2015-01-01

An estimated one-third of the world’s population is currently latently infected with Mycobacterium tuberculosis. Latent M. tuberculosis infection (LTBI) progresses into active tuberculosis (TB) disease in ~5 to 10% of infected individuals. Diagnostic and prognostic biomarkers to monitor disease progression are urgently needed to ensure better care for TB patients and to decrease the spread of TB. Biomarker development is primarily based on transcriptomics. Our understanding of biology combined with evolving technical advances in high-throughput techniques led us to investigate the possibility of additional platforms (epigenetics and proteomics) in the quest to (i) understand the biology of the TB host response and (ii) search for multiplatform biosignatures in TB. We engaged in a pilot study to interrogate the DNA methylome, transcriptome, and proteome in selected monocytes and granulocytes from TB patients and healthy LTBI participants. Our study provides first insights into the levels and sources of diversity in the epigenome and proteome among TB patients and LTBI controls, despite limitations due to small sample size. Functionally the differences between the infection phenotypes (LTBI versus active TB) observed in the different platforms were congruent, thereby suggesting regulation of function not only at the transcriptional level but also by DNA methylation and microRNA. Thus, our data argue for the development of a large-scale study of the DNA methylome, with particular attention to study design in accounting for variation based on gender, age, and cell type. PMID:26374119
Genome-scale prediction of proteins with long intrinsically disordered regions.

PubMed

Peng, Zhenling; Mizianty, Marcin J; Kurgan, Lukasz

2014-01-01

Proteins with long disordered regions (LDRs), defined as having 30 or more consecutive disordered residues, are abundant in eukaryotes, and these regions are recognized as a distinct class of biologically functional domains. LDRs facilitate various cellular functions and are important for target selection in structural genomics. Motivated by the lack of methods that directly predict proteins with LDRs, we designed Super-fast predictor of proteins with Long Intrinsically DisordERed regions (SLIDER). SLIDER utilizes logistic regression that takes an empirically chosen set of numerical features, which consider selected physicochemical properties of amino acids, sequence complexity, and amino acid composition, as its inputs. Empirical tests show that SLIDER offers competitive predictive performance combined with low computational cost. It outperforms, by at least a modest margin, a comprehensive set of modern disorder predictors (that can indirectly predict LDRs) and is 16 times faster compared to the best currently available disorder predictor. Utilizing our time-efficient predictor, we characterized abundance and functional roles of proteins with LDRs over 110 eukaryotic proteomes. Similar to related studies, we found that eukaryotes have many (on average 30.3%) proteins with LDRs with majority of proteomes having between 25 and 40%, where higher abundance is characteristic to proteomes that have larger proteins. Our first-of-its-kind large-scale functional analysis shows that these proteins are enriched in a number of cellular functions and processes including certain binding events, regulation of catalytic activities, cellular component organization, biogenesis, biological regulation, and some metabolic and developmental processes. A webserver that implements SLIDER is available at http://biomine.ece.ualberta.ca/SLIDER/. Copyright © 2013 Wiley Periodicals, Inc.
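The definition of a long disordered region used in this record (30 or more consecutive disordered residues) can be made concrete with a short sketch. Note the assumption: SLIDER itself predicts at the protein level with logistic regression on sequence-derived features, whereas the code below only applies the definition to per-residue disorder labels, which are a hypothetical input here, as are the function names.

```python
def has_long_disordered_region(disorder_flags, min_length=30):
    """True if the per-residue flags contain a run of at least `min_length` disordered residues."""
    run = 0
    for flag in disorder_flags:
        run = run + 1 if flag else 0  # extend the current run or reset it at an ordered residue
        if run >= min_length:
            return True
    return False

def fraction_with_ldr(proteome, min_length=30):
    """Fraction of proteins in {protein_id: disorder_flags} that contain at least one LDR."""
    hits = sum(has_long_disordered_region(flags, min_length) for flags in proteome.values())
    return hits / len(proteome)  # assumes a non-empty proteome
```

Applied to a whole proteome, `fraction_with_ldr` gives the kind of per-proteome abundance figure the abstract reports, although the numbers quoted there come from SLIDER's own predictions.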
Evolutionary Proteomics Uncovers Ancient Associations of Cilia with Signaling Pathways.

PubMed

Sigg, Monika Abedin; Menchen, Tabea; Lee, Chanjae; Johnson, Jeffery; Jungnickel, Melissa K; Choksi, Semil P; Garcia, Galo; Busengdal, Henriette; Dougherty, Gerard W; Pennekamp, Petra; Werner, Claudius; Rentzsch, Fabian; Florman, Harvey M; Krogan, Nevan; Wallingford, John B; Omran, Heymut; Reiter, Jeremy F

2017-12-18

Cilia are organelles specialized for movement and signaling. To infer when during evolution signaling pathways became associated with cilia, we characterized the proteomes of cilia from sea urchins, sea anemones, and choanoflagellates. We identified 437 high-confidence ciliary candidate proteins conserved in mammals and discovered that Hedgehog and G-protein-coupled receptor pathways were linked to cilia before the origin of bilateria and transient receptor potential (TRP) channels before the origin of animals. We demonstrated that candidates not previously implicated in ciliary biology localized to cilia and further investigated ENKUR, a TRP channel-interacting protein identified in the cilia of all three organisms. ENKUR localizes to motile cilia and is required for patterning the left-right axis in vertebrates. Moreover, mutation of ENKUR causes situs inversus in humans. Thus, proteomic profiling of cilia from diverse eukaryotes defines a conserved ciliary proteome, reveals ancient connections to signaling, and uncovers a ciliary protein that underlies development and human disease. Copyright © 2017 Elsevier Inc. All rights reserved.
Evolution of wealth in a non-conservative economy driven by local Nash equilibria.

PubMed

Degond, Pierre; Liu, Jian-Guo; Ringhofer, Christian

2014-11-13

We develop a model for the evolution of wealth in a non-conservative economic environment, extending a theory developed in Degond et al. (2014 J. Stat. Phys. 154, 751-780 (doi:10.1007/s10955-013-0888-4)). The model considers a system of rational agents interacting in a game-theoretical framework. This evolution drives the dynamics of the agents in both wealth and economic configuration variables. The cost function is chosen to represent a risk-averse strategy of each agent. That is, the agent is more likely to interact with the market, the more predictable the market, and therefore the smaller its individual risk. This yields a kinetic equation for an effective single particle agent density with a Nash equilibrium serving as the local thermodynamic equilibrium. We consider a regime of scale separation where the large-scale dynamics is given by a hydrodynamic closure with this local equilibrium. A class of generalized collision invariants is developed to overcome the difficulty of the non-conservative property in the hydrodynamic closure derivation of the large-scale dynamics for the evolution of wealth distribution. The result is a system of gas dynamics-type equations for the density and average wealth of the agents on large scales. We recover the inverse Gamma distribution, which has been previously considered in the literature, as a local equilibrium for particular choices of the cost function. © 2014 The Author(s) Published by the Royal Society. All rights reserved.
The recent breakup of an asteroid in the main-belt region.

PubMed

Nesvorný, David; Bottke, William F; Dones, Luke; Levison, Harold F

2002-06-13

The present population of asteroids in the main belt is largely the result of many past collisions. Ideally, the asteroid fragments resulting from each impact event could help us understand the large-scale collisions that shaped the planets during early epochs. Most known asteroid fragment families, however, are very old and have therefore undergone significant collisional and dynamical evolution since their formation. This evolution has masked the properties of the original collisions. Here we report the discovery of a family of asteroids that formed in a disruption event only 5.8 +/- 0.2 million years ago, and which has subsequently undergone little dynamical and collisional evolution. We identified 39 fragments, two of which are large and comparable in size (diameters of approximately 19 and approximately 14 km), with the remainder exhibiting a continuum of sizes in the range 2-7 km. The low measured ejection velocities suggest that gravitational re-accumulation after a collision may be a common feature of asteroid evolution. Moreover, these data can be used to check numerical models of larger-scale collisions.
Trade-off between Transcriptome Plasticity and Genome Evolution in Cephalopods.

PubMed

Liscovitch-Brauer, Noa; Alon, Shahar; Porath, Hagit T; Elstein, Boaz; Unger, Ron; Ziv, Tamar; Admon, Arie; Levanon, Erez Y; Rosenthal, Joshua J C; Eisenberg, Eli

2017-04-06

RNA editing, a post-transcriptional process, allows the diversification of proteomes beyond the genomic blueprint; however it is infrequently used among animals for this purpose. Recent reports suggesting increased levels of RNA editing in squids thus raise the question of the nature and effects of these events. We here show that RNA editing is particularly common in behaviorally sophisticated coleoid cephalopods, with tens of thousands of evolutionarily conserved sites. Editing is enriched in the nervous system, affecting molecules pertinent for excitability and neuronal morphology. The genomic sequence flanking editing sites is highly conserved, suggesting that the process confers a selective advantage. Due to the large number of sites, the surrounding conservation greatly reduces the number of mutations and genomic polymorphisms in protein-coding regions. This trade-off between genome evolution and transcriptome plasticity highlights the importance of RNA recoding as a strategy for diversifying proteins, particularly those associated with neural function. Copyright © 2017 Elsevier Inc. All rights reserved.
BioPlex Display: An Interactive Suite for Large-Scale AP-MS Protein-Protein Interaction Data.

PubMed

Schweppe, Devin K; Huttlin, Edward L; Harper, J Wade; Gygi, Steven P

2018-01-05

The development of large-scale data sets requires new means to display and disseminate research studies to large audiences. Knowledge of protein-protein interaction (PPI) networks has become a principal interest of many groups within the field of proteomics. At the confluence of technologies, such as cross-linking mass spectrometry, yeast two-hybrid, protein cofractionation, and affinity purification mass spectrometry (AP-MS), detection of PPIs can uncover novel biological inferences at high throughput. Thus new platforms to provide community access to large data sets are necessary. To this end, we have developed a web application that enables exploration and dissemination of the growing BioPlex interaction network. BioPlex is a large-scale interactome data set based on AP-MS of baits from the human ORFeome. The latest BioPlex data set release (BioPlex 2.0) contains 56 553 interactions from 5891 AP-MS experiments. To improve community access to this vast compendium of interactions, we developed BioPlex Display, which integrates individual protein querying, access to empirical data, and on-the-fly annotation of networks within an easy-to-use and mobile web application. BioPlex Display enables rapid acquisition of data from BioPlex and development of hypotheses based on protein interactions.
Limits on transverse momentum dependent evolution from semi-inclusive deep inelastic scattering at moderate Q

NASA Astrophysics Data System (ADS)

Aidala, C. A.; Field, B.; Gamberg, L. P.; Rogers, T. C.

2014-05-01

In the QCD evolution of transverse momentum dependent parton distribution and fragmentation functions, the Collins-Soper evolution kernel includes both a perturbative short-distance contribution and a large-distance nonperturbative, but strongly universal, contribution. In the past, global fits, based mainly on larger Q Drell-Yan-like processes, have found substantial contributions from nonperturbative regions in the Collins-Soper evolution kernel. In this article, we investigate semi-inclusive deep inelastic scattering measurements in the region of relatively small Q, of the order of a few GeV, where sensitivity to nonperturbative transverse momentum dependence may become more important or even dominate the evolution. Using recently available deep inelastic scattering data from the COMPASS experiment, we provide estimates of the regions of coordinate space that dominate in transverse momentum dependent (TMD) processes when the hard scale is of the order of only a few GeV. We find that distance scales that are much larger than those commonly probed in large Q measurements become important, suggesting that the details of nonperturbative effects in TMD evolution are especially significant in the region of intermediate Q. We highlight the strongly universal nature of the nonperturbative component of evolution and its potential to be tightly constrained by fits from a wide variety of observables that include both large and moderate Q. On this basis, we recommend detailed treatments of the nonperturbative component of the Collins-Soper evolution kernel for future TMD studies.
Resources for Functional Genomics Studies in Drosophila melanogaster

PubMed Central

Mohr, Stephanie E.; Hu, Yanhui; Kim, Kevin; Housden, Benjamin E.; Perrimon, Norbert

2014-01-01

Drosophila melanogaster has become a system of choice for functional genomic studies. Many resources, including online databases and software tools, are now available to support design or identification of relevant fly stocks and reagents or analysis and mining of existing functional genomic, transcriptomic, proteomic, etc. datasets. These include large community collections of fly stocks and plasmid clones, “meta” information sites like FlyBase and FlyMine, and an increasing number of more specialized reagents, databases, and online tools. Here, we introduce key resources useful to plan large-scale functional genomics studies in Drosophila and to analyze, integrate, and mine the results of those studies in ways that facilitate identification of highest-confidence results and generation of new hypotheses. We also discuss ways in which existing resources can be used and might be improved and suggest a few areas of future development that would further support large- and small-scale studies in Drosophila and facilitate use of Drosophila information by the research community more generally. PMID:24653003
Spatial and temporal dynamics of the cardiac mitochondrial proteome.

PubMed

Lau, Edward; Huang, Derrick; Cao, Quan; Dincer, T Umut; Black, Caitie M; Lin, Amanda J; Lee, Jessica M; Wang, Ding; Liem, David A; Lam, Maggie P Y; Ping, Peipei

2015-04-01

Mitochondrial proteins alter in their composition and quantity drastically through time and space in correspondence to changing energy demands and cellular signaling events. The integrity and permutations of this dynamism are increasingly recognized to impact the functions of the cardiac proteome in health and disease. This article provides an overview on recent advances in defining the spatial and temporal dynamics of mitochondrial proteins in the heart. Proteomics techniques to characterize dynamics on a proteome scale are reviewed and the physiological consequences of altered mitochondrial protein dynamics are discussed. Lastly, we offer our perspectives on the unmet challenges in translating mitochondrial dynamics markers into the clinic.
ICPC Pilots International Student Training, Paving a Path for Tomorrow’s Cancer Researchers | Office of Cancer Clinical Proteomics Research

Cancer.gov

The Office of Cancer Clinical Proteomics Research at the National Cancer Institute, part of the United States National Institutes of Health, is spearheading the preparation and training of the proteogenomic research workforce on an international scale.
Elucidating the fungal stress response by proteomics.

PubMed

Kroll, Kristin; Pähtz, Vera; Kniemeyer, Olaf

2014-01-31

Fungal species need to cope with stress, both in the natural environment and during interaction of human- or plant pathogenic fungi with their host. Many regulatory circuits governing the fungal stress response have already been discovered. However, there are still large gaps in the knowledge concerning the changes of the proteome during adaptation to environmental stress conditions. With the application of proteomic methods, particularly 2D-gel and gel-free, LC/MS-based methods, first insights into the composition and dynamic changes of the fungal stress proteome could be obtained. Here, we review the recent proteome data generated for filamentous fungi and yeasts. This article is part of a Special Issue entitled: Trends in Microbial Proteomics. Copyright © 2013 Elsevier B.V. All rights reserved.
Supermassive Black Holes and Galaxy Evolution

NASA Technical Reports Server (NTRS)

Merritt, D.

2004-01-01

Supermassive black holes appear to be generic components of galactic nuclei. The formation and growth of black holes is intimately connected with the evolution of galaxies on a wide range of scales. For instance, mergers between galaxies containing nuclear black holes would produce supermassive binaries which eventually coalesce via the emission of gravitational radiation. The formation and decay of these binaries is expected to produce a number of observable signatures in the stellar distribution. Black holes can also affect the large-scale structure of galaxies by perturbing the orbits of stars that pass through the nucleus. Large-scale N-body simulations are beginning to generate testable predictions about these processes which will allow us to draw inferences about the formation history of supermassive black holes.
Evolution of the Busbar Structure in Large-Scale Aluminum Reduction Cells

NASA Astrophysics Data System (ADS)

Zhang, Hongliang; Liang, Jinding; Li, Jie; Sun, Kena; Xiao, Jin

2017-02-01

Studies of the magnetic field and magnetohydrodynamics are regarded as the foundation for the development of large-scale aluminum reduction cells; because busbar configuration is directly related to magnetic compensation, the key issue in practice is the configuration of the busbar. As the line current has increased from 160 kA to 600 kA, busbar configurations have become more complex. To summarize and explore the evolution of busbar configuration in aluminum reduction cells, this paper reviews various representative busbar structures for large-scale pre-baked aluminum reduction cells, such as end-to-end potlines, side-by-side potlines, and external compensation current. The advantages and disadvantages in terms of magnetic distribution and technical specifications are also introduced separately, especially for the configurations of the mainstream 400-kA potlines. Finally, the development trends of busbar configuration are discussed, based on the recent successful applications of super-scale cell busbar structures in China (500-600 kA).
Large-scale structure in a texture-seeded cold dark matter cosmogony

NASA Technical Reports Server (NTRS)

Park, Changbom; Spergel, David N.; Turok, Nail

1991-01-01

This paper studies the formation of large-scale structure by global texture in a flat universe dominated by cold dark matter. A code for evolution of the texture fields was combined with an N-body code for evolving the dark matter. The results indicate some promising aspects: with only one free parameter, the observed galaxy-galaxy correlation function is reproduced, clusters of galaxies are found to be significantly clustered on a scale of 20-50/h Mpc, and coherent structures of over 50/h Mpc in the galaxy distribution were found. The large-scale streaming motions observed are in good agreement with the observations: the average magnitude of the velocity field smoothed over 30/h Mpc is 430 km/sec. Global texture produces a cosmic Mach number that is compatible with observation. Also, significant evolution of clusters at low redshift was seen. Possible problems for the theory include too high velocity dispersions in clusters, and voids which are not as empty as those observed.
Plant Aquaporins: Genome-Wide Identification, Transcriptomics, Proteomics, and Advanced Analytical Tools.

PubMed

Deshmukh, Rupesh K; Sonah, Humira; Bélanger, Richard R

2016-01-01

Aquaporins (AQPs) are channel-forming integral membrane proteins that facilitate the movement of water and many other small molecules. Compared to animals, plants contain a much higher number of AQPs in their genome. Homology-based identification of AQPs in sequenced species is feasible because of the high level of conservation of protein sequences across plant species. Genome-wide characterization of AQPs has highlighted several important aspects such as distribution, genetic organization, evolution and conserved features governing solute specificity. From a functional point of view, the understanding of AQP transport system has expanded rapidly with the help of transcriptomics and proteomics data. The efficient analysis of enormous amounts of data generated through omic scale studies has been facilitated through computational advancements. Prediction of protein tertiary structures, pore architecture, cavities, phosphorylation sites, heterodimerization, and co-expression networks has become more sophisticated and accurate with increasing computational tools and pipelines. However, the effectiveness of computational approaches is based on the understanding of physiological and biochemical properties, transport kinetics, solute specificity, molecular interactions, sequence variations, phylogeny and evolution of aquaporins. For this purpose, tools like Xenopus oocyte assays, yeast expression systems, artificial proteoliposomes, and lipid membranes have been efficiently exploited to study the many facets that influence solute transport by AQPs. In the present review, we discuss genome-wide identification of AQPs in plants in relation with recent advancements in analytical tools, and their availability and technological challenges as they apply to AQPs. An exhaustive review of omics resources available for AQP research is also provided in order to optimize their efficient utilization. Finally, a detailed catalog of computational tools and analytical pipelines is offered as a resource for AQP research.
Plant Aquaporins: Genome-Wide Identification, Transcriptomics, Proteomics, and Advanced Analytical Tools

PubMed Central

Deshmukh, Rupesh K.; Sonah, Humira; Bélanger, Richard R.

2016-01-01

Aquaporins (AQPs) are channel-forming integral membrane proteins that facilitate the movement of water and many other small molecules. Compared to animals, plants contain a much higher number of AQPs in their genome. Homology-based identification of AQPs in sequenced species is feasible because of the high level of conservation of protein sequences across plant species. Genome-wide characterization of AQPs has highlighted several important aspects such as distribution, genetic organization, evolution and conserved features governing solute specificity. From a functional point of view, the understanding of AQP transport system has expanded rapidly with the help of transcriptomics and proteomics data. The efficient analysis of enormous amounts of data generated through omic scale studies has been facilitated through computational advancements. Prediction of protein tertiary structures, pore architecture, cavities, phosphorylation sites, heterodimerization, and co-expression networks has become more sophisticated and accurate with increasing computational tools and pipelines. However, the effectiveness of computational approaches is based on the understanding of physiological and biochemical properties, transport kinetics, solute specificity, molecular interactions, sequence variations, phylogeny and evolution of aquaporins. For this purpose, tools like Xenopus oocyte assays, yeast expression systems, artificial proteoliposomes, and lipid membranes have been efficiently exploited to study the many facets that influence solute transport by AQPs. In the present review, we discuss genome-wide identification of AQPs in plants in relation with recent advancements in analytical tools, and their availability and technological challenges as they apply to AQPs. An exhaustive review of omics resources available for AQP research is also provided in order to optimize their efficient utilization. 
Finally, a detailed catalog of computational tools and analytical pipelines is offered as a resource for AQP research. PMID:28066459
Biogeoscience from a Metallomic and Proteomic Perspective

NASA Astrophysics Data System (ADS)

Anbar, A. D.; Shock, E.

2004-12-01

In the wake of the genomics revolution, life scientists are expanding their focus from the genome to the "proteome" - the assemblage of all proteins in a cell - and the "metallome" - the distribution of inorganic species in a cell. The proteome and metallome are tightly connected because proteins and protein products are intimately involved in the transport and homeostasis of inorganic elements, and because many enzymes depend on inorganic elements for catalytic activity. Together, they are at the heart of metabolic function. Unlike the relatively static genome, the proteome and metallome are extremely dynamic, changing rapidly in response to environmental cues. They are substantially more complex than the genome; for example, in humans, some 30,000 genes code for approximately 500,000 proteins. Metaphorically, the proteome and metallome constitute the complex, dynamic "language" by which the genome and the environment communicate. Therefore biogeochemists, like life scientists, are moving beyond a strictly genomic perspective. Research guided by proteomic and metallomic perspectives and methodologies should provide new insights into the connections between life and the inorganic Earth in modern environments, and the evolution of these connections through time. For example, biogeochemical research in modern environments, such as Yellowstone hot springs, is hindered by the gap between genomic determinations of metabolic potential in ecosystems and geochemical characterizations of the energetic boundary conditions faced by these ecosystems; genomics tells us "who is there" and geochemistry tells us "what they might be doing", but neither genomics nor geochemistry easily provide quantitative information about which metabolisms are actually active or a framework for understanding why ecosystems do not fully exploit the energy available in their surroundings. 
Such questions are fundamentally kinetic rather than thermodynamic and therefore demand that we characterize and understand the proteins and inorganic elements used by organisms to catalyze reactions and capture energy from their surroundings. Similar challenges are faced when attempting to map the evolutionary relationships inferred from phylogenetic analyses of genomes to ecological histories determined by geochemists and paleobiologists - for example, ongoing efforts to understand the evolutionary history of eukaryotes and metazoa - because the driving forces for the evolution and ecological radiation of organisms lie at the intersection of metabolism and environment, and hence in the gap between genomes and geochemistry. Future progress in understanding the biogeochemistry of modern and ancient environments will be spurred by integrating proteomic and metallomic methods and perspectives.
Cloud-based solution to identify statistically significant MS peaks differentiating sample categories.

PubMed

Ji, Jun; Ling, Jeffrey; Jiang, Helen; Wen, Qiaojun; Whitin, John C; Tian, Lu; Cohen, Harvey J; Ling, Xuefeng B

2013-03-23

Mass spectrometry (MS) has evolved to become the primary high-throughput tool for proteomics-based biomarker discovery. Multiple challenges in protein MS data analysis remain: management of large-scale, complex data sets; MS peak identification and indexing; and high-dimensional differential peak analysis with false discovery rate (FDR) control across the concurrent statistical tests. "Turnkey" solutions are needed for biomarker investigations to rapidly process MS data sets and identify statistically significant peaks for subsequent validation. Here we present an efficient and effective solution, which gives experimental biologists easy access to "cloud" computing capabilities to analyze MS data. The web portal can be accessed at http://transmed.stanford.edu/ssa/. The web application supports online uploading and analysis of large-scale MS data through a simple user interface. This bioinformatic tool will facilitate the discovery of potential protein biomarkers using MS.
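The FDR step mentioned in this abstract can be illustrated with a generic procedure. The abstract does not say which FDR method the portal uses, so the sketch below assumes the widely used Benjamini-Hochberg step-up procedure; the function name and the example p-values are hypothetical and do not come from the paper.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Flag which tests pass Benjamini-Hochberg FDR control at level alpha.

    Assumption: one p-value per MS peak, e.g. from a two-group test.
    Returns a list of booleans in the same order as p_values.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank

    # Step-up rule: every test ranked at or below max_rank is significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            significant[idx] = True
    return significant


if __name__ == "__main__":
    peak_p_values = [0.01, 0.04, 0.03, 0.5]  # hypothetical per-peak p-values
    print(benjamini_hochberg(peak_p_values))
```

Because the procedure is step-up, a peak whose p-value misses its own rank threshold can still be declared significant if a higher-ranked peak meets its threshold.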
Proteomic Characterization of Differential Abundant Proteins Accumulated between Lower and Upper Epidermises of Fleshy Scales in Onion (Allium cepa L.) Bulbs

PubMed Central

Wu, Xiaolin

2016-01-01

The onion (Allium cepa L.) is widely planted worldwide as a valuable vegetable crop. The scales of an onion bulb are a modified type of leaf. The one-layer-cell epidermis of onion scales is commonly used as a model experimental material in botany and molecular biology. The lower epidermis (LE) and upper epidermis (UE) of onion scales display obvious differences in microscopic structure, cell differentiation and pigment synthesis; however, associated proteomic differences are unclear. LE and UE can be easily sampled as single-layer-cell tissues for comparative proteomic analysis. In this study, a proteomic approach based on 2-DE and mass spectrometry (MS) was applied to compare LE and UE of fleshy scales from yellow and red onions. We identified 47 differential abundant protein spots (representing 31 unique proteins) between LE and UE in red and yellow onions. These proteins are mainly involved in pigment synthesis, stress response, and cell division. Particularly, the differentially accumulated chalcone-flavanone isomerase and flavone O-methyltransferase 1-like in LE may result in the differences in the onion scale color between red and yellow onions. Moreover, stress-related proteins abundantly accumulated in both LE and UE. In addition, the differential accumulation of UDP-arabinopyranose mutase 1-like protein and β-1,3-glucanase in the LE may be related to the different cell sizes between LE and UE of the two types of onion. The data derived from this study provides new insight into the differences in differentiation and developmental processes between onion epidermises. This study may also make a contribution to onion breeding, such as improving resistances and changing colors. PMID:28036352

Proteomic Characterization of Differential Abundant Proteins Accumulated between Lower and Upper Epidermises of Fleshy Scales in Onion (Allium cepa L.) Bulbs.

PubMed

Wu, Si; Ning, Fen; Wu, Xiaolin; Wang, Wei

2016-01-01

The onion (Allium cepa L.) is widely planted worldwide as a valuable vegetable crop. The scales of an onion bulb are a modified type of leaf. The one-layer-cell epidermis of onion scales is commonly used as a model experimental material in botany and molecular biology. The lower epidermis (LE) and upper epidermis (UE) of onion scales display obvious differences in microscopic structure, cell differentiation and pigment synthesis; however, associated proteomic differences are unclear. LE and UE can be easily sampled as single-layer-cell tissues for comparative proteomic analysis. In this study, a proteomic approach based on 2-DE and mass spectrometry (MS) was applied to compare LE and UE of fleshy scales from yellow and red onions. We identified 47 differential abundant protein spots (representing 31 unique proteins) between LE and UE in red and yellow onions. These proteins are mainly involved in pigment synthesis, stress response, and cell division. Particularly, the differentially accumulated chalcone-flavanone isomerase and flavone O-methyltransferase 1-like in LE may result in the differences in the onion scale color between red and yellow onions. Moreover, stress-related proteins abundantly accumulated in both LE and UE. In addition, the differential accumulation of UDP-arabinopyranose mutase 1-like protein and β-1,3-glucanase in the LE may be related to the different cell sizes between LE and UE of the two types of onion. The data derived from this study provides new insight into the differences in differentiation and developmental processes between onion epidermises. This study may also make a contribution to onion breeding, such as improving resistances and changing colors.
Development and application of automated systems for plasmid-based functional proteomics to improve synthetic biology of engineered industrial microbes for high level expression of proteases for biofertilizer production

USDA-ARS's Scientific Manuscript database

In addition to microarray technology, which provides a robust method to study protein function in a rapid, economical, and proteome-wide fashion, plasmid-based functional proteomics is an important technology for rapidly obtaining large quantities of protein and determining protein function across a...
Advanced proteomic liquid chromatography

PubMed Central

Xie, Fang; Smith, Richard D.; Shen, Yufeng

2012-01-01

Liquid chromatography coupled with mass spectrometry is the predominant platform used to analyze proteomics samples consisting of large numbers of proteins and their proteolytic products (e.g., truncated polypeptides) and spanning a wide range of relative concentrations. This review provides an overview of advanced capillary liquid chromatography techniques and methodologies that greatly improve separation resolving power and proteomics analysis coverage, sensitivity, and throughput. PMID:22840822
Why proteomics is not the new genomics and the future of mass spectrometry in cell biology.

PubMed

Sidoli, Simone; Kulej, Katarzyna; Garcia, Benjamin A

2017-01-02

Mass spectrometry (MS) is an essential part of the cell biologist's proteomics toolkit, allowing analyses at molecular and system-wide scales. However, proteomics still lag behind genomics in popularity and ease of use. We discuss key differences between MS-based -omics and other booming -omics technologies and highlight what we view as the future of MS and its role in our increasingly deep understanding of cell biology. © 2017 Sidoli et al.
Unification of small and large time scales for biological evolution: deviations from power law.

PubMed

Chowdhury, Debashish; Stauffer, Dietrich; Kunwar, Ambarish

2003-02-14

We develop a unified model that describes both "micro" and "macro" evolutions within a single theoretical framework. The ecosystem is described as a dynamic network; the population dynamics at each node of this network describes the "microevolution" over ecological time scales (i.e., birth, ageing, and natural death of individual organisms), while the appearance of new nodes, the slow changes of the links, and the disappearance of existing nodes accounts for the "macroevolution" over geological time scales (i.e., the origination, evolution, and extinction of species). In contrast to several earlier claims in the literature, we observe strong deviations from power law in the regime of long lifetimes.
Formation Stellaire Aux Échelles Des Galaxies

NASA Astrophysics Data System (ADS)

Boissier, S.

2012-12-01

Star Formation is at the very core of the evolution of galaxies. From their gas reservoir (filled by infall or fusions), stars form at the "Star Formation Rate" (SFR), with an enormous impact on many aspects of the evolution of galaxies. This HDR presents first the formalism concerning star formation (SFR, IMF), some theoretical suggestions on physical processes that may affect star formation on various galactic scales, and the methods used to determine the SFR from observations. A large part is dedicated to the "Star Formation Laws" (e.g. Schmidt law) on various scales (local, radial, and global law). Finally, the last part concerns the largest scales (evolution of the "cosmic" SFR and effect of the environment).
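For reference, the Schmidt law mentioned above is usually written as a power law between surface densities. The form below is the standard textbook expression, not a formula taken from this abstract; the symbols $A$ and $N$ are notation introduced here.

```latex
\Sigma_{\mathrm{SFR}} = A \, \Sigma_{\mathrm{gas}}^{\,N}
```

Here $\Sigma_{\mathrm{SFR}}$ is the star formation rate per unit area, $\Sigma_{\mathrm{gas}}$ is the gas surface density, and $A$ is a normalization constant. A commonly quoted value for the global law is $N \approx 1.4$, although the index depends on the scale (local, radial, or global) at which the law is measured.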
Cis-regulatory Elements and Human Evolution

PubMed Central

Siepel, Adam

2014-01-01

Modification of gene regulation has long been considered an important force in human evolution, particularly through changes to cis-regulatory elements (CREs) that function in transcriptional regulation. For decades, however, the study of cis-regulatory evolution was severely limited by the available data. New data sets describing the locations of CREs and genetic variation within and between species have now made it possible to study CRE evolution much more directly on a genome-wide scale. Here, we review recent research on the evolution of CREs in humans based on large-scale genomic data sets. We consider inferences based on primate divergence, human polymorphism, and combinations of divergence and polymorphism. We then consider “new frontiers” in this field stemming from recent research on transcriptional regulation. PMID:25218861
The evolution and function of protein tandem repeats in plants.

PubMed

Schaper, Elke; Anisimova, Maria

2015-04-01

Sequence tandem repeats (TRs) are abundant in proteomes across all domains of life. For plants, little is known about their distribution or contribution to protein function. We exhaustively annotated TRs and studied the evolution of TR unit variations for all Ensembl plants. Using phylogenetic patterns of TR units, we detected conserved TRs with unit number and order preserved during evolution, and those TRs that have diverged via recent TR unit gains/losses. We correlated the mode of evolution of TRs to protein function. TR number was strongly correlated with proteome size, with about one-half of all TRs recognized as common protein domains. The majority of TRs have been highly conserved over long evolutionary distances, some since the separation of red algae and green plants c. 1.6 billion yr ago. Conversely, recurrent recent TR unit mutations were rare. Our results suggest that the first TRs by far predate the first plants, and that TR appearance is an ongoing process with similar rates across the plant kingdom. Interestingly, the few detected highly mutable TRs might provide a source of variation for rapid adaptation. In particular, such TRs are enriched in leucine-rich repeats (LRRs) commonly found in R genes, where TR unit gain/loss may facilitate resistance to emerging pathogens. © 2014 The Authors. New Phytologist © 2014 New Phytologist Trust.
Directed Evolution of a Cyclized Peptoid-Peptide Chimera against a Cell-Free Expressed Protein and Proteomic Profiling of the Interacting Proteins to Create a Protein-Protein Interaction Inhibitor.

PubMed

Kawakami, Takashi; Ogawa, Koji; Hatta, Tomohisa; Goshima, Naoki; Natsume, Tohru

2016-06-17

N-alkyl amino acids are useful building blocks for the in vitro display evolution of ribosomally synthesized peptides because they can increase the proteolytic stability and cell permeability of these peptides. However, the translation initiation substrate specificity of nonproteinogenic N-alkyl amino acids has not been investigated. In this study, we screened various N-alkyl amino acids and nonamino carboxylic acids for translation initiation with an Escherichia coli reconstituted cell-free translation system (PURE system) and identified those that efficiently initiated translation. Using seven of these efficiently initiating acids, we next performed in vitro display evolution of cyclized peptidomimetics against an arbitrarily chosen model human protein (β-catenin) cell-free expressed from its cloned cDNA (HUPEX) and identified a novel β-catenin-binding cyclized peptoid-peptide chimera. Furthermore, by a proteomic approach using direct nanoflow liquid chromatography-tandem mass spectrometry (DNLC-MS/MS), we successfully identified which protein-β-catenin interaction is inhibited by the chimera. The combination of in vitro display evolution of cyclized N-alkyl peptidomimetics and in vitro expression of human proteins would be a powerful approach for the high-speed discovery of diverse human protein-targeted cyclized N-alkyl peptidomimetics.
Evolution of egg coats: linking molecular biology and ecology.

PubMed

Shu, Longfei; Suter, Marc J-F; Räsänen, Katja

2015-08-01

One central goal of evolutionary biology is to explain how biological diversity emerges and is maintained in nature. Given the complexity of the phenotype and the multifaceted nature of inheritance, modern evolutionary ecological studies rely heavily on the use of molecular tools. Here, we show how molecular tools help to gain insight into the role of egg coats (i.e. the extracellular structures surrounding eggs and embryos) in evolutionary diversification. Egg coats are maternally derived structures that have many biological functions from mediating fertilization to protecting the embryo from environmental hazards. They show great molecular, structural and functional diversity across species, but intraspecific variability and the role of ecology in egg coat evolution have largely been overlooked. Given that much of the variation that influences egg coat function is ultimately determined by their molecular phenotype, cutting-edge molecular tools (e.g. proteomics, glycomics and transcriptomics), combined with functional assays, are needed for rigorous inferences on their evolutionary ecology. Here, we identify key research areas and highlight emerging molecular techniques that can increase our understanding of the role of egg coats in the evolution of biological diversity, from adaptation to speciation. © 2015 John Wiley & Sons Ltd.
Viscous anisotropy of textured olivine aggregates: 2. Micromechanical model

NASA Astrophysics Data System (ADS)

Hansen, Lars N.; Conrad, Clinton P.; Boneh, Yuval; Skemer, Philip; Warren, Jessica M.; Kohlstedt, David L.

2016-10-01

The significant viscous anisotropy that results from crystallographic alignment (texture) of olivine grains in deformed upper mantle rocks strongly influences a large variety of geodynamic processes. Our ability to explore the effects of anisotropic viscosity in simulations of these processes requires a mechanical model that can predict the magnitude of anisotropy and its evolution. Unfortunately, existing models of olivine textural evolution and viscous anisotropy are calibrated for relatively small deformations and simple strain paths, making them less general than desired for many large-scale geodynamic scenarios. Here we develop a new set of micromechanical models to describe the mechanical behavior and textural evolution of olivine through a large range of strains and complex strain histories. For the mechanical behavior, we explore two extreme scenarios, one in which each grain experiences the same stress tensor (Sachs model) and one in which each grain undergoes a strain rate as close as possible to the macroscopic strain rate (pseudo-Taylor model). For the textural evolution, we develop a new model in which the director method is used to control the rate of grain rotation and the available slip systems in olivine are used to control the axis of rotation. Only recently has enough laboratory data on the deformation of olivine become available to calibrate these models. We use these new data to conduct inversions for the best parameters to characterize both the mechanical and textural evolution models. These inversions demonstrate that the calibrated pseudo-Taylor model best reproduces the mechanical observations. Additionally, the pseudo-Taylor textural evolution model can reasonably reproduce the observed texture strength, shape, and orientation after large and complex deformations. 
A quantitative comparison between our calibrated models and previously published models reveals that our new models excel in predicting the magnitude of viscous anisotropy and the details of the textural evolution. In addition, we demonstrate that the mechanical and textural evolution models can be coupled and used to reproduce mechanical evolution during large-strain torsion tests. This set of models therefore provides a new geodynamic tool for incorporating viscous anisotropy into large-scale numerical simulations.
The evolutionary history of protein fold families and proteomes confirms that the archaeal ancestor is more ancient than the ancestors of other superkingdoms

PubMed Central

2012-01-01

Background The entire evolutionary history of life can be studied using myriad sequences generated by genomic research. This includes the appearance of the first cells and of superkingdoms Archaea, Bacteria, and Eukarya. However, the use of molecular sequence information for deep phylogenetic analyses is limited by mutational saturation, differential evolutionary rates, lack of sequence site independence, and other biological and technical constraints. In contrast, protein structures are evolutionary modules that are highly conserved and diverse enough to enable deep historical exploration. Results Here we build phylogenies that describe the evolution of proteins and proteomes. These phylogenetic trees are derived from a genomic census of protein domains defined at the fold family (FF) level of structural classification. Phylogenomic trees of FF structures were reconstructed from genomic abundance levels of 2,397 FFs in 420 proteomes of free-living organisms. These trees defined timelines of domain appearance, with time spanning from the origin of proteins to the present. Timelines are divided into five different evolutionary phases according to patterns of sharing of FFs among superkingdoms: (1) a primordial protein world, (2) reductive evolution and the rise of Archaea, (3) the rise of Bacteria from the common ancestor of Bacteria and Eukarya and early development of the three superkingdoms, (4) the rise of Eukarya and widespread organismal diversification, and (5) eukaryal diversification. The relative ancestry of the FFs shows that reductive evolution by domain loss is dominant in the first three phases and is responsible for both the diversification of life from a universal cellular ancestor and the appearance of superkingdoms. On the other hand, domain gains are predominant in the last two phases and are responsible for organismal diversification, especially in Bacteria and Eukarya. 
Conclusions The evolution of functions that are associated with corresponding FFs along the timeline reveals that primordial metabolic domains evolved earlier than informational domains involved in translation and transcription, supporting the metabolism-first hypothesis rather than the RNA world scenario. In addition, phylogenomic trees of proteomes reconstructed from FFs appearing in each of the five phases of the protein world show that trees reconstructed from ancient domain structures were consistently rooted in archaeal lineages, supporting the proposal that the archaeal ancestor is more ancient than the ancestors of other superkingdoms. PMID:22284070
Contrasting patterns of evolutionary constraint and novelty revealed by comparative sperm proteomic analysis in Lepidoptera.

PubMed

Whittington, Emma; Forsythe, Desiree; Borziak, Kirill; Karr, Timothy L; Walters, James R; Dorus, Steve

2017-12-02

Rapid evolution is a hallmark of reproductive genetic systems and arises through the combined processes of sequence divergence, gene gain and loss, and changes in gene and protein expression. While studies aiming to disentangle the molecular ramifications of these processes are progressing, we still know little about the genetic basis of evolutionary transitions in reproductive systems. Here we conduct the first comparative analysis of sperm proteomes in Lepidoptera, a group that exhibits dichotomous spermatogenesis, in which males produce a functional fertilization-competent sperm (eupyrene) and an incompetent sperm morph lacking nuclear DNA (apyrene). Through the integrated application of evolutionary proteomics and genomics, we characterize the genomic patterns potentially associated with the origination and evolution of this unique spermatogenic process and assess the importance of genetic novelty in Lepidopteran sperm biology. Comparison of the newly characterized Monarch butterfly (Danaus plexippus) sperm proteome to those of the Carolina sphinx moth (Manduca sexta) and the fruit fly (Drosophila melanogaster) demonstrated conservation at the level of protein abundance and post-translational modification within Lepidoptera. In contrast, comparative genomic analyses across insects reveal significant divergence at two levels that differentiate the genetic architecture of sperm in Lepidoptera from other insects. First, a significant reduction in orthology among Monarch sperm genes relative to the remainder of the genome in non-Lepidopteran insect species was observed. Second, a substantial number of sperm proteins were found to be specific to Lepidoptera, in that they lack detectable homology to the genomes of more distantly related insects. Lastly, the functional importance of Lepidoptera-specific sperm proteins is broadly supported by their increased abundance relative to proteins conserved across insects. 
Our results identify a burst of genetic novelty amongst sperm proteins that may be associated with the origin of heteromorphic spermatogenesis in ancestral Lepidoptera and/or the subsequent evolution of this system. This pattern of genomic diversification is distinct from the remainder of the genome and thus suggests that this transition has had a marked impact on lepidopteran genome evolution. The identification of abundant sperm proteins unique to Lepidoptera, including proteins distinct between specific lineages, will accelerate future functional studies aiming to understand the developmental origin of dichotomous spermatogenesis and the functional diversification of the fertilization incompetent apyrene sperm morph.
Proteomic profile of the Bradysia odoriphaga in response to the microbial secondary metabolite benzothiazole.

PubMed

Zhao, Yunhe; Cui, Kaidi; Xu, Chunmei; Wang, Qiuhong; Wang, Yao; Zhang, Zhengqun; Liu, Feng; Mu, Wei

2016-11-24

Benzothiazole, a microbial secondary metabolite, has been demonstrated to possess fumigant activity against Sclerotinia sclerotiorum, Ditylenchus destructor and Bradysia odoriphaga. However, to facilitate the development of novel microbial pesticides, the mode of action of benzothiazole needs to be elucidated. Here, we employed iTRAQ-based quantitative proteomics analysis to investigate the effects of benzothiazole on the proteomic expression of B. odoriphaga. In response to benzothiazole, 92 of 863 identified proteins in B. odoriphaga exhibited altered levels of expression, among which 14 proteins were related to the action mechanism of benzothiazole, 11 proteins were involved in stress responses, and 67 proteins were associated with the adaptation of B. odoriphaga to benzothiazole. Further bioinformatics analysis indicated that the reduction in energy metabolism, inhibition of the detoxification process and interference with DNA and RNA synthesis were potentially associated with the mode of action of benzothiazole. The myosin heavy chain, succinyl-CoA synthetase and Ca2+-transporting ATPase proteins may be related to the stress response. Increased expression of proteins involved in carbohydrate metabolism, energy production and conversion pathways was responsible for the adaptive response of B. odoriphaga. The results of this study provide novel insight into the molecular mechanisms of benzothiazole at a large-scale translation level and will facilitate the elucidation of the mechanism of action of benzothiazole.
PSEA-Quant: a protein set enrichment analysis on label-free and label-based protein quantification data.

PubMed

Lavallée-Adam, Mathieu; Rauniyar, Navin; McClatchy, Daniel B; Yates, John R

2014-12-05

The majority of large-scale proteomics quantification methods yield long lists of quantified proteins that are often difficult to interpret and poorly reproduced. Computational approaches are required to analyze such intricate quantitative proteomics data sets. We propose a statistical approach to computationally identify protein sets (e.g., Gene Ontology (GO) terms) that are significantly enriched with abundant proteins with reproducible quantification measurements across a set of replicates. To this end, we developed PSEA-Quant, a protein set enrichment analysis algorithm for label-free and label-based protein quantification data sets. It offers an alternative approach to classic GO analyses, models protein annotation biases, and allows the analysis of samples originating from a single condition, unlike analogous approaches such as GSEA and PSEA. We demonstrate that PSEA-Quant produces results complementary to GO analyses. We also show that PSEA-Quant provides valuable information about the biological processes involved in cystic fibrosis using label-free protein quantification of a cell line expressing a CFTR mutant. Finally, PSEA-Quant highlights the differences in the mechanisms taking place in the human, rat, and mouse brain frontal cortices based on tandem mass tag quantification. Our approach, which is available online, will thus improve the analysis of proteomics quantification data sets by providing meaningful biological insights.
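The general idea of testing whether a protein set is enriched with abundant proteins can be sketched with a simple permutation test. This is not the PSEA-Quant algorithm, which also models reproducibility across replicates and annotation bias; the function name, the mean-abundance score, and the toy data below are all assumptions made for illustration.

```python
import random


def set_enrichment_p_value(abundance, protein_set, n_perm=999, seed=0):
    """Permutation p-value for a protein set having high mean abundance.

    abundance: dict mapping protein ID -> abundance value (hypothetical input).
    protein_set: iterable of protein IDs, e.g. the members of one GO term.
    The p-value is the share of random same-size sets whose mean abundance
    is at least as high as the observed mean (with a +1 correction).
    """
    members = [p for p in protein_set if p in abundance]
    if not members:
        raise ValueError("no set members have abundance values")

    observed = sum(abundance[p] for p in members) / len(members)
    all_proteins = list(abundance)
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible

    at_least_as_high = 0
    for _ in range(n_perm):
        sample = rng.sample(all_proteins, len(members))
        mean = sum(abundance[p] for p in sample) / len(members)
        if mean >= observed:
            at_least_as_high += 1
    return (at_least_as_high + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data: 20 proteins with abundances 1..20.
    toy = {f"P{i}": float(i) for i in range(1, 21)}
    print(set_enrichment_p_value(toy, ["P18", "P19", "P20"]))
```

A set made of the most abundant proteins yields a small p-value, while a set made of the least abundant ones yields a p-value of 1.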
PSEA-Quant: A Protein Set Enrichment Analysis on Label-Free and Label-Based Protein Quantification Data

PubMed Central

2015-01-01

The majority of large-scale proteomics quantification methods yield long lists of quantified proteins that are often difficult to interpret and poorly reproduced. Computational approaches are required to analyze such intricate quantitative proteomics data sets. We propose a statistical approach to computationally identify protein sets (e.g., Gene Ontology (GO) terms) that are significantly enriched with abundant proteins with reproducible quantification measurements across a set of replicates. To this end, we developed PSEA-Quant, a protein set enrichment analysis algorithm for label-free and label-based protein quantification data sets. It offers an alternative approach to classic GO analyses, models protein annotation biases, and allows the analysis of samples originating from a single condition, unlike analogous approaches such as GSEA and PSEA. We demonstrate that PSEA-Quant produces results complementary to GO analyses. We also show that PSEA-Quant provides valuable information about the biological processes involved in cystic fibrosis using label-free protein quantification of a cell line expressing a CFTR mutant. Finally, PSEA-Quant highlights the differences in the mechanisms taking place in the human, rat, and mouse brain frontal cortices based on tandem mass tag quantification. Our approach, which is available online, will thus improve the analysis of proteomics quantification data sets by providing meaningful biological insights. PMID:25177766
A novel spectral library workflow to enhance protein identifications.

PubMed

Li, Haomin; Zong, Nobel C; Liang, Xiangbo; Kim, Allen K; Choi, Jeong Ho; Deng, Ning; Zelaya, Ivette; Lam, Maggie; Duan, Huilong; Ping, Peipei

2013-04-09

The innovations in mass spectrometry-based investigations in proteome biology enable systematic characterization of molecular details in pathophysiological phenotypes. However, the process of delineating large-scale raw proteomic datasets into a biological context requires high-throughput data acquisition and processing. A spectral library search engine makes use of previously annotated experimental spectra as references for subsequent spectral analyses. This workflow delivers many advantages, including elevated analytical efficiency and specificity as well as reduced demands on computational capacity. In this study, we created a spectral matching engine to address challenges commonly associated with a library search workflow. In particular, an improved sliding dot product algorithm that is robust to systematic drifts of mass measurement in spectra is introduced. Furthermore, a noise management protocol distinguishes spectral correlation attributable to noise from that attributable to peptide fragments. It enables elevated separation between target spectral matches and false matches, thereby suppressing the possibility of propagating inaccurate peptide annotations from library spectra to query spectra. Moreover, preservation of original spectra also accommodates user contributions to further enhance the quality of the library. Collectively, this search engine supports reproducible data analyses using curated references, thereby broadening the accessibility of proteomics resources to biomedical investigators. This article is part of a Special Issue entitled: From protein structures to clinical applications. Copyright © 2013 Elsevier B.V. All rights reserved.
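The general idea behind a sliding dot product can be sketched as follows: bin both spectra, then compute a normalized dot product at several bin offsets and keep the best one, so that a systematic mass drift does not destroy the match. This is a minimal sketch, not the authors' improved algorithm; the bin width, shift range, and function names are assumptions.

```python
import math


def bin_spectrum(peaks, bin_width=1.0, max_mz=2000.0):
    """Turn (m/z, intensity) pairs into a fixed-length intensity vector."""
    n_bins = int(max_mz / bin_width) + 1
    vec = [0.0] * n_bins
    for mz, intensity in peaks:
        idx = int(mz / bin_width)
        if 0 <= idx < n_bins:
            vec[idx] += intensity
    return vec


def sliding_dot_product(query, library, max_shift=2):
    """Return (best_shift, best_score): the bin offset applied to the query
    that maximizes the cosine-style score against the library vector."""
    norm = math.sqrt(sum(q * q for q in query)) * math.sqrt(
        sum(v * v for v in library)
    )
    if norm == 0.0:
        return 0, 0.0
    best_shift, best_score = 0, -1.0
    n = len(library)
    for shift in range(-max_shift, max_shift + 1):
        dot = 0.0
        for i, q in enumerate(query):
            j = i + shift  # library bin compared with query bin i
            if q and 0 <= j < n:
                dot += q * library[j]
        score = dot / norm
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift, best_score
```

With a query whose peaks are all displaced by one bin relative to the library spectrum, the best score is recovered at a shift of one bin rather than at zero offset.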
Genomic and Proteomic Dissection of the Ubiquitous Plant Pathogen, Armillaria mellea: Toward a New Infection Model System

PubMed Central

2013-01-01

Armillaria mellea is a major plant pathogen. Yet, no large-scale “-omics” data are available to enable new studies, and limited experimental models are available to investigate basidiomycete pathogenicity. Here we reveal that the A. mellea genome comprises 58.35 Mb, contains 14473 gene models, of average length 1575 bp (4.72 introns/gene). Tandem mass spectrometry identified 921 mycelial (n = 629 unique) and secreted (n = 183 unique) proteins. Almost 100 mycelial proteins were either species-specific or previously unidentified at the protein level. A number of proteins (n = 111) was detected in both mycelia and culture supernatant extracts. Signal sequence occurrence was 4-fold greater for secreted (50.2%) compared to mycelial (12%) proteins. Analyses revealed a rich reservoir of carbohydrate degrading enzymes, laccases, and lignin peroxidases in the A. mellea proteome, reminiscent of both basidiomycete and ascomycete glycodegradative arsenals. We discovered that A. mellea exhibits a specific killing effect against Candida albicans during coculture. Proteomic investigation of this interaction revealed the unique expression of defensive and potentially offensive A. mellea proteins (n = 30). Overall, our data reveal new insights into the origin of basidiomycete virulence and we present a new model system for further studies aimed at deciphering fungal pathogenic mechanisms. PMID:23656496
Screening of missing proteins in the human liver proteome by improved MRM-approach-based targeted proteomics.

PubMed

Chen, Chen; Liu, Xiaohui; Zheng, Weimin; Zhang, Lei; Yao, Jun; Yang, Pengyuan

2014-04-04

To completely annotate the human genome, identifying and characterizing proteins that currently lack mass spectrometry (MS) evidence is an unavoidable and urgent task. In this study, as the first effort to screen missing proteins at large scale, we developed an approach based on SDS-PAGE followed by liquid chromatography-multiple reaction monitoring (LC-MRM) to screen for missing proteins that had only a single peptide hit in the previous liver proteome data set. Proteins extracted from normal human liver were separated by SDS-PAGE and digested in split gel slices, and the resulting digests were then subjected to scheduled LC-MRM analysis. The MRM assays were developed using crude synthetic peptides for the target peptides. In total, the expression of 57 target proteins was confirmed from 185 MRM assays in normal human liver tissues. Among the 57 confirmed one-hit wonders, 50 proteins belong to the minimally redundant set in the PeptideAtlas database, and 7 proteins, spanning various biological processes, previously had no MS-based information. We conclude that our SDS-PAGE-MRM workflow can be a powerful approach to screen missing or poorly characterized proteins in different samples and to provide their quantity if detected. The MRM raw data have been uploaded to ISB/SRM Atlas/PASSEL (PXD000648).
Gas-Phase Enrichment of Multiply Charged Peptide Ions by Differential Ion Mobility Extend the Comprehensiveness of SUMO Proteome Analyses

NASA Astrophysics Data System (ADS)

Pfammatter, Sibylle; Bonneil, Eric; McManus, Francis P.; Thibault, Pierre

2018-04-01

The small ubiquitin-like modifier (SUMO) is a member of the family of ubiquitin-like modifiers (UBLs) and is involved in important cellular processes, including DNA damage response, meiosis and cellular trafficking. The large-scale identification of SUMO peptides in a site-specific manner is challenging not only because of the low abundance and dynamic nature of this modification, but also due to the branched structure of the corresponding peptides that further complicate their identification using conventional search engines. Here, we exploited the unusual structure of SUMO peptides to facilitate their separation by high-field asymmetric waveform ion mobility spectrometry (FAIMS) and increase the coverage of SUMO proteome analysis. Upon trypsin digestion, branched peptides contain a SUMO remnant side chain and predominantly form triply protonated ions that facilitate their gas-phase separation using FAIMS. We evaluated the mobility characteristics of synthetic SUMO peptides and further demonstrated the application of FAIMS to profile the changes in protein SUMOylation of HEK293 cells following heat shock, a condition known to affect this modification. FAIMS typically provided a 10-fold improvement of detection limit of SUMO peptides, and enabled a 36% increase in SUMO proteome coverage compared to the same LC-MS/MS analyses performed without FAIMS.

Ursgal, Universal Python Module Combining Common Bottom-Up Proteomics Tools for Large-Scale Analysis.

PubMed

Kremer, Lukas P M; Leufken, Johannes; Oyunchimeg, Purevdulam; Schulze, Stefan; Fufezan, Christian

2016-03-04

Proteomics data integration has become a broad field with a variety of programs offering innovative algorithms to analyze increasing amounts of data. Unfortunately, this software diversity leads to many problems as soon as the data is analyzed using more than one algorithm for the same task. Although it was shown that the combination of multiple peptide identification algorithms yields more robust results, it is only recently that unified approaches are emerging; however, workflows that, for example, aim to optimize search parameters or that employ cascaded style searches can only be made accessible if data analysis becomes not only unified but also, and most importantly, scriptable. Here we introduce Ursgal, a Python interface to many commonly used bottom-up proteomics tools and to additional auxiliary programs. Complex workflows can thus be composed in the Python scripting language with a few lines of code. Ursgal is easily extensible, and we have made several database search engines (X!Tandem, OMSSA, MS-GF+, Myrimatch, MS Amanda), statistical postprocessing algorithms (qvality, Percolator), and one algorithm that combines statistically postprocessed outputs from multiple search engines ("combined FDR") accessible as an interface in Python. Furthermore, we have implemented a new algorithm ("combined PEP") that combines multiple search engines employing elements of "combined FDR", PeptideShaker, and Bayes' theorem.
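To give a feel for how Bayes' theorem can merge posterior error probabilities (PEPs) reported by several search engines for the same peptide-spectrum match, here is a minimal naive-Bayes sketch. It assumes the engines err independently and that correct and incorrect matches have equal prior weight; it is not Ursgal's "combined PEP" implementation, and the function name is hypothetical.

```python
from math import prod


def combine_peps(peps):
    """Combine per-engine PEPs for one PSM under a naive independence
    assumption (illustrative only, not the Ursgal algorithm)."""
    peps = list(peps)
    if not peps:
        raise ValueError("at least one PEP is required")
    for p in peps:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"PEP out of range: {p}")
    p_all_wrong = prod(peps)                   # every engine is wrong
    p_all_right = prod(1.0 - p for p in peps)  # every engine is right
    denom = p_all_wrong + p_all_right
    if denom == 0.0:
        # Happens when one engine reports 0 and another reports 1.
        raise ValueError("engines are in total contradiction")
    return p_all_wrong / denom
```

Under these assumptions, two engines that each report a PEP of 0.1 yield a combined value of about 0.012, which shows how agreement between engines lowers the combined error estimate.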
Covering complete proteomes with X-ray structures: A current snapshot

DOE PAGES

Mizianty, Marcin J.; Fan, Xiao; Yan, Jing; ...

2014-10-23

Structural genomics programs have developed and applied structure-determination pipelines to a wide range of protein targets, facilitating the visualization of macromolecular interactions and the understanding of their molecular and biochemical functions. The fundamental question of whether three-dimensional structures of all proteins and all functional annotations can be determined using X-ray crystallography is investigated. A first-of-its-kind large-scale analysis of crystallization propensity for all proteins encoded in 1953 fully sequenced genomes was performed. It is shown that current X-ray crystallographic knowhow combined with homology modeling can provide structures for 25% of modeling families (protein clusters for which structural models can be obtained through homology modeling), with at least one structural model produced for each Gene Ontology functional annotation. The coverage varies between superkingdoms, with 19% for eukaryotes, 35% for bacteria and 49% for archaea, and with those of viruses following the coverage values of their hosts. It is shown that the crystallization propensities of proteomes from the taxonomic superkingdoms are distinct. The use of knowledge-based target selection is shown to substantially increase the ability to produce X-ray structures. It is demonstrated that the human proteome has one of the highest attainable coverage values among eukaryotes, and GPCR membrane proteins suitable for X-ray structure determination were determined.
Monoclonal antibody proteomics: use of antibody mimotope displaying phages and the relevant synthetic peptides for mAb scouting.

PubMed

Hajdú, István; Flachner, Beáta; Bognár, Melinda; Végh, Barbara M; Dobi, Krisztina; Lőrincz, Zsolt; Lázár, József; Cseh, Sándor; Takács, László; Kurucz, István

2014-08-01

Monoclonal antibody proteomics uses nascent libraries or cloned (Plasmascan™, QuantiPlasma™) libraries of mAbs that react with individual epitopes of proteins in human plasma. At the initial phase of library creation, the cognate protein antigen and the epitope interacting with the antibodies are not known. Scouting for monoclonal antibodies (mAbs) with the best binding characteristics is of high importance for mAb-based biomarker assay development. However, when the identity of the cognate antigen is unknown, the task is a challenge. We combined phage display and surface plasmon resonance (Biacore) experiments to test whether specific phages and the respective mimotope peptides obtained from large-scale studies can be used to determine key features of antibodies for scouting. We show here that mAb-captured phage-mimotope heterogeneity, that is, the diversity of the selected peptide sequences, is inversely correlated with an important binding descriptor, the off-rate of the antibodies, and that this provides clues for driving the selection of useful mAbs for biomarker assay development. Carefully chosen synthetic mimotope peptides are suitable for specificity testing in competitive assays using the target proteome, in our case the human plasma. Copyright © 2014 Elsevier B.V. All rights reserved.
Transcriptomic and proteomic responses of Serratia marcescens to spaceflight conditions involve large-scale changes in metabolic pathways

NASA Astrophysics Data System (ADS)

Wang, Yajuan; Yuan, Yanting; Liu, Jinwen; Su, Longxiang; Chang, De; Guo, Yinghua; Chen, Zhenhong; Fang, Xiangqun; Wang, Junfeng; Li, Tianzhi; Zhou, Lisha; Fang, Chengxiang; Yang, Ruifu; Liu, Changting

2014-04-01

The microgravity environment of spaceflight expeditions has been associated with altered microbial responses. This study explores the characterization of Serratia marcescens grown in a spaceflight environment at the phenotypic, transcriptomic and proteomic levels. From November 1, 2011 to November 17, 2011, a strain of S. marcescens was sent into space for 398 h on the Shenzhou VIII spacecraft, and ground simulation was performed as a control (LCT-SM213). After the flight, two mutant strains (LCT-SM166 and LCT-SM262) were selected for further analysis. Although no changes in the morphology, post-culture growth kinetics, hemolysis or antibiotic sensitivity were observed, the two mutant strains exhibited significant changes in their metabolic profiles after exposure to spaceflight. Enrichment analysis of the transcriptome showed that the differentially expressed genes of the two spaceflight strains and the ground control strain mainly included those involved in metabolism and degradation. The proteome revealed that changes at the protein level were also associated with metabolic functions, such as glycolysis/gluconeogenesis, pyruvate metabolism, arginine and proline metabolism and the degradation of valine, leucine and isoleucine. In summary, S. marcescens showed alterations primarily in genes and proteins that were associated with metabolism under spaceflight conditions, which gave us valuable clues for future research.
A Comprehensive Guide for Performing Sample Preparation and Top-Down Protein Analysis

PubMed Central

Padula, Matthew P.; Berry, Iain J.; O’Rourke, Matthew B.; Raymond, Benjamin B.A.; Santos, Jerran; Djordjevic, Steven P.

2017-01-01

Methodologies for the global analysis of proteins in a sample, or proteome analysis, have been available since 1975 when Patrick O’Farrell published the first paper describing two-dimensional gel electrophoresis (2D-PAGE). This technique allowed the resolution of single protein isoforms, or proteoforms, into single ‘spots’ in a polyacrylamide gel, allowing the quantitation of changes in a proteoform’s abundance to ascertain changes in an organism’s phenotype when conditions change. In pursuit of the comprehensive profiling of the proteome, significant advances in technology have made the identification and quantitation of intact proteoforms from complex mixtures of proteins more routine, allowing analysis of the proteome from the ‘Top-Down’. However, the number of proteoforms detected by Top-Down methodologies such as 2D-PAGE or mass spectrometry has not significantly increased since O’Farrell’s paper when compared to Bottom-Up, peptide-centric techniques. This article explores and explains the numerous methodologies and technologies available to analyse the proteome from the Top-Down with a strong emphasis on the necessity to analyse intact proteoforms as a better indicator of changes in biology and phenotype. We arrive at the conclusion that the complete and comprehensive profiling of an organism’s proteome is still, at present, beyond our reach but the continuing evolution of protein fractionation techniques and mass spectrometry brings comprehensive Top-Down proteome profiling closer. PMID:28387712
A Comprehensive Guide for Performing Sample Preparation and Top-Down Protein Analysis.

PubMed

Padula, Matthew P; Berry, Iain J; O'Rourke, Matthew B; Raymond, Benjamin B A; Santos, Jerran; Djordjevic, Steven P

2017-04-07

Methodologies for the global analysis of proteins in a sample, or proteome analysis, have been available since 1975 when Patrick O'Farrell published the first paper describing two-dimensional gel electrophoresis (2D-PAGE). This technique allowed the resolution of single protein isoforms, or proteoforms, into single 'spots' in a polyacrylamide gel, allowing the quantitation of changes in a proteoform's abundance to ascertain changes in an organism's phenotype when conditions change. In pursuit of the comprehensive profiling of the proteome, significant advances in technology have made the identification and quantitation of intact proteoforms from complex mixtures of proteins more routine, allowing analysis of the proteome from the 'Top-Down'. However, the number of proteoforms detected by Top-Down methodologies such as 2D-PAGE or mass spectrometry has not significantly increased since O'Farrell's paper when compared to Bottom-Up, peptide-centric techniques. This article explores and explains the numerous methodologies and technologies available to analyse the proteome from the Top-Down with a strong emphasis on the necessity to analyse intact proteoforms as a better indicator of changes in biology and phenotype. We arrive at the conclusion that the complete and comprehensive profiling of an organism's proteome is still, at present, beyond our reach but the continuing evolution of protein fractionation techniques and mass spectrometry brings comprehensive Top-Down proteome profiling closer.
Peroxisome Biogenesis and Function

PubMed Central

Kaur, Navneet; Reumann, Sigrun; Hu, Jianping

2009-01-01

Peroxisomes are small and single membrane-delimited organelles that execute numerous metabolic reactions and have pivotal roles in plant growth and development. In recent years, forward and reverse genetic studies along with biochemical and cell biological analyses in Arabidopsis have enabled researchers to identify many peroxisome proteins and elucidate their functions. This review focuses on the advances in our understanding of peroxisome biogenesis and metabolism, and further explores the contribution of large-scale analysis, such as in silico predictions and proteomics, in augmenting our knowledge of peroxisome function in Arabidopsis. PMID:22303249
Simulation of coherent nonlinear neutrino flavor transformation in the supernova environment: Correlated neutrino trajectories

NASA Astrophysics Data System (ADS)

Duan, Huaiyu; Fuller, George M.; Carlson, J.; Qian, Yong-Zhong

2006-11-01

We present results of large-scale numerical simulations of the evolution of neutrino and antineutrino flavors in the region above the late-time post-supernova-explosion proto-neutron star. Our calculations are the first to allow explicit flavor evolution histories on different neutrino trajectories and to self-consistently couple flavor development on these trajectories through forward scattering-induced quantum coupling. Employing the atmospheric-scale neutrino mass-squared difference ($|\delta m^2| \simeq 3\times10^{-3}\,\mathrm{eV}^2$) and values of $\theta_{13}$ allowed by current bounds, we find transformation of neutrino and antineutrino flavors over broad ranges of energy and luminosity in roughly the “bi-polar” collective mode. We find that this large-scale flavor conversion, largely driven by the flavor off-diagonal neutrino-neutrino forward scattering potential, sets in much closer to the proto-neutron star than simple estimates based on flavor-diagonal potentials and Mikheyev-Smirnov-Wolfenstein evolution would indicate. In turn, this suggests that models of r-process nucleosynthesis sited in the neutrino-driven wind could be affected substantially by active-active neutrino flavor mixing, even with the small measured neutrino mass-squared differences.
Parasites, proteomes and systems: has Descartes' clock run out of time?

PubMed

Wastling, J M; Armstrong, S D; Krishna, R; Xia, D

2012-08-01

Systems biology aims to integrate multiple biological data types such as genomics, transcriptomics and proteomics across different levels of structure and scale; it represents an emerging paradigm in the scientific process which challenges the reductionism that has dominated biomedical research for hundreds of years. Systems biology will nevertheless only be successful if the technologies on which it is based are able to deliver the required type and quality of data. In this review we discuss how well positioned is proteomics to deliver the data necessary to support meaningful systems modelling in parasite biology. We summarise the current state of identification proteomics in parasites, but argue that a new generation of quantitative proteomics data is now needed to underpin effective systems modelling. We discuss the challenges faced to acquire more complete knowledge of protein post-translational modifications, protein turnover and protein-protein interactions in parasites. Finally we highlight the central role of proteome-informatics in ensuring that proteomics data is readily accessible to the user-community and can be translated and integrated with other relevant data types.
Parasites, proteomes and systems: has Descartes’ clock run out of time?

PubMed Central

WASTLING, J. M.; ARMSTRONG, S. D.; KRISHNA, R.; XIA, D.

2012-01-01

SUMMARY Systems biology aims to integrate multiple biological data types such as genomics, transcriptomics and proteomics across different levels of structure and scale; it represents an emerging paradigm in the scientific process which challenges the reductionism that has dominated biomedical research for hundreds of years. Systems biology will nevertheless only be successful if the technologies on which it is based are able to deliver the required type and quality of data. In this review we discuss how well positioned is proteomics to deliver the data necessary to support meaningful systems modelling in parasite biology. We summarise the current state of identification proteomics in parasites, but argue that a new generation of quantitative proteomics data is now needed to underpin effective systems modelling. We discuss the challenges faced to acquire more complete knowledge of protein post-translational modifications, protein turnover and protein-protein interactions in parasites. Finally we highlight the central role of proteome-informatics in ensuring that proteomics data is readily accessible to the user-community and can be translated and integrated with other relevant data types. PMID:22828391
Large-scale and Long-duration Simulation of a Multi-stage Eruptive Solar Event

NASA Astrophysics Data System (ADS)

Jiang, Chaowei; Hu, Qiang; Wu, S. T.

2015-04-01

We employ a data-driven 3D MHD active region evolution model based on the Conservation Element and Solution Element (CESE) numerical method. This newly developed model retains the full MHD effects, allowing time-dependent boundary conditions and time evolution studies. The time-dependent simulation is driven by measured vector magnetograms and the method of MHD characteristics on the bottom boundary. We have applied the model to investigate the coronal magnetic field evolution of AR11283, which was characterized by a pre-existing sigmoid structure in the core region and multiple eruptions at both relatively small and large scales. We have succeeded in producing the core magnetic field structure and the subsequent eruptions of flux-rope structures (see https://dl.dropboxusercontent.com/u/96898685/large.mp4 for an animation) as the measured vector magnetograms on the bottom boundary evolve in time with constant flux emergence. The whole process, lasting for about an hour in real time, compares well with the corresponding SDO/AIA and coronagraph imaging observations. From these results, we show that the model, which is largely data-driven, is able to simulate complex, topological, and highly dynamic active region evolution. (We acknowledge partial support of NSF grants AGS 1153323 and AGS 1062050, and data support from the SDO/HMI and AIA teams).
A Library of Phosphoproteomic and Chromatin Signatures for Characterizing Cellular Responses to Drug Perturbations.

PubMed

Litichevskiy, Lev; Peckner, Ryan; Abelin, Jennifer G; Asiedu, Jacob K; Creech, Amanda L; Davis, John F; Davison, Desiree; Dunning, Caitlin M; Egertson, Jarrett D; Egri, Shawn; Gould, Joshua; Ko, Tak; Johnson, Sarah A; Lahr, David L; Lam, Daniel; Liu, Zihan; Lyons, Nicholas J; Lu, Xiaodong; MacLean, Brendan X; Mungenast, Alison E; Officer, Adam; Natoli, Ted E; Papanastasiou, Malvina; Patel, Jinal; Sharma, Vagisha; Toder, Courtney; Tubelli, Andrew A; Young, Jennie Z; Carr, Steven A; Golub, Todd R; Subramanian, Aravind; MacCoss, Michael J; Tsai, Li-Huei; Jaffe, Jacob D

2018-04-25

Although the value of proteomics has been demonstrated, cost and scale are typically prohibitive, and gene expression profiling remains dominant for characterizing cellular responses to perturbations. However, high-throughput sentinel assays provide an opportunity for proteomics to contribute at a meaningful scale. We present a systematic library resource (90 drugs × 6 cell lines) of proteomic signatures that measure changes in the reduced-representation phosphoproteome (P100) and changes in epigenetic marks on histones (GCP). A majority of these drugs elicited reproducible signatures, but notable cell line- and assay-specific differences were observed. Using the "connectivity" framework, we compared signatures across cell types and integrated data across assays, including a transcriptional assay (L1000). Consistent connectivity among cell types revealed cellular responses that transcended lineage, and consistent connectivity among assays revealed unexpected associations between drugs. We further leveraged the resource against public data to formulate hypotheses for treatment of multiple myeloma and acute lymphocytic leukemia. This resource is publicly available at https://clue.io/proteomics. Copyright © 2018 The Author(s). Published by Elsevier Inc. All rights reserved.
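One simple way to picture comparing signatures across cell types or assays is a rank correlation between two perturbation profiles measured over the same analytes. The sketch below computes a Spearman correlation with the standard library only. It is an illustrative stand-in, not the resource's actual "connectivity" score, and the function names are assumptions.

```python
from statistics import mean


def average_ranks(values):
    """1-based ranks, with ties receiving the average of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(sig_a, sig_b):
    """Spearman correlation between two equally long signatures."""
    if len(sig_a) != len(sig_b) or len(sig_a) < 2:
        raise ValueError("signatures must have equal length >= 2")
    ra, rb = average_ranks(sig_a), average_ranks(sig_b)
    ma, mb = mean(ra), mean(rb)
    cov = sum((a - ma) * (b - mb) for a, b in zip(ra, rb))
    var_a = sum((a - ma) ** 2 for a in ra)
    var_b = sum((b - mb) ** 2 for b in rb)
    if var_a == 0 or var_b == 0:
        raise ValueError("a constant signature has no defined correlation")
    return cov / (var_a * var_b) ** 0.5
```

Signatures that move the same analytes in the same direction score close to 1, while opposing signatures score close to -1.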
Comparative proteomic analysis of differentially expressed proteins in β-aminobutyric acid enhanced Arabidopsis thaliana tolerance to simulated acid rain.

PubMed

Liu, Tingwu; Jiang, Xinwu; Shi, Wuliang; Chen, Juan; Pei, Zhenming; Zheng, Hailei

2011-05-01

Acid rain is a worldwide environmental issue that has seriously damaged forest ecosystems. As a highly effective and broad-spectrum plant resistance-inducing agent, β-aminobutyric acid could elevate the tolerance of Arabidopsis when subjected to simulated acid rain. Using comparative proteomic strategies, we analyzed 203 significantly varied proteins, of which 175 were identified as responding to β-aminobutyric acid in the absence and presence of simulated acid rain. They could be divided into ten groups according to their biological functions. Among them, the majority were cell rescue, development and defense-related proteins, followed by transcription, protein synthesis, folding, modification and destination-associated proteins. We conclude that β-aminobutyric acid can lead to a large-scale change in primary metabolism and simultaneously activate the antioxidant system and the salicylic acid, jasmonic acid, and abscisic acid signaling pathways. In addition, β-aminobutyric acid can reinforce physical barriers to defend against simulated acid rain stress. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Proteomic analyses bring new insights into the effect of a dark stress on lipid biosynthesis in Phaeodactylum tricornutum

PubMed Central

Bai, Xiaocui; Song, Hao; Lavoie, Michel; Zhu, Kun; Su, Yiyuan; Ye, Hanqi; Chen, Si; Fu, Zhengwei; Qian, Haifeng

2016-01-01

Microalgae biosynthesize high amounts of lipids and show high potential for renewable biodiesel production. However, the production cost of microalgae-derived biodiesel hampers large-scale biodiesel commercialization, and new strategies for increasing lipid production efficiency from algae are urgently needed. Here we subjected the marine alga Phaeodactylum tricornutum to a 4-day dark stress, a condition that increases the total lipid cell quotas by 2.3-fold, and studied the cellular mechanisms leading to lipid accumulation using a combination of physiological, proteomic (iTRAQ) and genomic (qRT-PCR) approaches. Our results show that the expression of proteins in the biochemical pathways of glycolysis and the synthesis of fatty acids was induced in the dark, potentially using excess carbon and nitrogen produced from protein breakdown. Treatment of algae in the dark, which increased algal lipid cell quotas at low cost, combined with optimal growth treatment could help optimize biodiesel production. PMID:27147218
Capillary nano-immunoassays: advancing quantitative proteomics analysis, biomarker assessment, and molecular diagnostics.

PubMed

Chen, Jin-Qiu; Wakefield, Lalage M; Goldstein, David J

2015-06-06

There is an emerging demand for the use of molecular profiling to facilitate biomarker identification and development, and to stratify patients for more efficient treatment decisions with reduced adverse effects. In the past decade, great strides have been made to advance genomic, transcriptomic and proteomic approaches to address these demands. While there has been much progress with these large scale approaches, profiling at the protein level still faces challenges due to limitations in clinical sample size, poor reproducibility, unreliable quantitation, and lack of assay robustness. A novel automated capillary nano-immunoassay (CNIA) technology has been developed. This technology offers precise and accurate measurement of proteins and their post-translational modifications using either charge-based or size-based separation formats. The system not only uses ultralow nanogram levels of protein but also allows multi-analyte analysis using a parallel single-analyte format for increased sensitivity and specificity. The high sensitivity and excellent reproducibility of this technology make it particularly powerful for analysis of clinical samples. Furthermore, the system can distinguish and detect specific protein post-translational modifications that conventional Western blot and other immunoassays cannot easily capture. This review will summarize and evaluate the latest progress to optimize the CNIA system for comprehensive, quantitative protein and signaling event characterization. It will also discuss how the technology has been successfully applied in both discovery research and clinical studies, for signaling pathway dissection, proteomic biomarker assessment, targeted treatment evaluation and quantitative proteomic analysis. Lastly, a comparison of this novel system with other conventional immuno-assay platforms is performed.
A GLOBAL GALACTIC DYNAMO WITH A CORONA CONSTRAINED BY RELATIVE HELICITY

DOE Office of Scientific and Technical Information (OSTI.GOV)

Prasad, A.; Mangalam, A., E-mail: avijeet@iiap.res.in, E-mail: mangalam@iiap.res.in

We present a model for a global axisymmetric turbulent dynamo operating in a galaxy with a corona that treats the parameters of turbulence driven by supernovae and by magneto-rotational instability under a common formalism. The nonlinear quenching of the dynamo is alleviated by the inclusion of small-scale advective and diffusive magnetic helicity fluxes, which allow the gauge-invariant magnetic helicity to be transferred outside the disk and consequently to build up a corona during the course of dynamo action. The time-dependent dynamo equations are expressed in a separable form and solved through an eigenvector expansion constructed using the steady-state solutions of the dynamo equation. The parametric evolution of the dynamo solution allows us to estimate the final structure of the global magnetic field and the saturated value of the turbulence parameter $\alpha_m$, even before solving the dynamical equations for evolution of magnetic fields in the disk and the corona, along with α-quenching. We then solve these equations simultaneously to study the saturation of the large-scale magnetic field, its dependence on the small-scale magnetic helicity fluxes, and the corresponding evolution of the force-free field in the corona. The quadrupolar large-scale magnetic field in the disk is found to reach equipartition strength within a timescale of 1 Gyr. The large-scale magnetic field in the corona obtained is much weaker than the field inside the disk and has only a weak impact on the dynamo operation.
Energetic Consistency and Coupling of the Mean and Covariance Dynamics

NASA Technical Reports Server (NTRS)

Cohn, Stephen E.

2008-01-01

The dynamical state of the ocean and atmosphere is taken to be a large dimensional random vector in a range of large-scale computational applications, including data assimilation, ensemble prediction, sensitivity analysis, and predictability studies. In each of these applications, numerical evolution of the covariance matrix of the random state plays a central role, because this matrix is used to quantify uncertainty in the state of the dynamical system. Since atmospheric and ocean dynamics are nonlinear, there is no closed evolution equation for the covariance matrix, nor for the mean state. Therefore approximate evolution equations must be used. This article studies theoretical properties of the evolution equations for the mean state and covariance matrix that arise in the second-moment closure approximation (third- and higher-order moment discard). This approximation was introduced by EPSTEIN [1969] in an early effort to introduce a stochastic element into deterministic weather forecasting, and was studied further by FLEMING [1971a,b], EPSTEIN and PITCHER [1972], and PITCHER [1977], also in the context of atmospheric predictability. It has since fallen into disuse, with a simpler one being used in current large-scale applications. The theoretical results of this article make a case that this approximation should be reconsidered for use in large-scale applications, however, because the second moment closure equations possess a property of energetic consistency that the approximate equations now in common use do not possess. A number of properties of solutions of the second-moment closure equations that result from this energetic consistency will be established.
Activity-based protein profiling: from enzyme chemistry to proteomic chemistry.

PubMed

Cravatt, Benjamin F; Wright, Aaron T; Kozarich, John W

2008-01-01

Genome sequencing projects have provided researchers with a complete inventory of the predicted proteins produced by eukaryotic and prokaryotic organisms. Assignment of functions to these proteins represents one of the principal challenges for the field of proteomics. Activity-based protein profiling (ABPP) has emerged as a powerful chemical proteomic strategy to characterize enzyme function directly in native biological systems on a global scale. Here, we review the basic technology of ABPP, the enzyme classes addressable by this method, and the biological discoveries attributable to its application.
Relaxation in two dimensions and the 'sinh-Poisson' equation

NASA Technical Reports Server (NTRS)

Montgomery, D.; Matthaeus, W. H.; Stribling, W. T.; Martinez, D.; Oughton, S.

1992-01-01

Long-time states of a turbulent, decaying, two-dimensional, Navier-Stokes flow are shown numerically to relax toward maximum-entropy configurations, as defined by the "sinh-Poisson" equation. The large-scale Reynolds number is about 14,000, the spatial resolution is 512 × 512, the boundary conditions are spatially periodic, and the evolution takes place over nearly 400 large-scale eddy-turnover times.
Testing Convergence Versus History: Convergence Dominates Phenotypic Evolution for over 150 Million Years in Frogs.

PubMed

Moen, Daniel S; Morlon, Hélène; Wiens, John J

2016-01-01

Striking evolutionary convergence can lead to similar sets of species in different locations, such as in cichlid fishes and Anolis lizards, and suggests that evolution can be repeatable and predictable across clades. Yet, most examples of convergence involve relatively small temporal and/or spatial scales. Some authors have speculated that at larger scales (e.g., across continents), differing evolutionary histories will prevent convergence. However, few studies have compared the contrasting roles of convergence and history, and none have done so at large scales. Here we develop a two-part approach to test the scale over which convergence can occur, comparing the relative importance of convergence and history in macroevolution using phylogenetic models of adaptive evolution. We apply this approach to data from morphology, ecology, and phylogeny from 167 species of anuran amphibians (frogs) from 10 local sites across the world, spanning ~160 myr of evolution. Mapping ecology on the phylogeny revealed that similar microhabitat specialists (e.g., aquatic, arboreal) have evolved repeatedly across clades and regions, producing many evolutionary replicates for testing for morphological convergence. By comparing morphological optima for clades and microhabitat types (our first test), we find that convergence associated with microhabitat use dominates frog morphological evolution, producing recurrent ecomorphs that together encompass all sampled species in each community in each region. However, our second test, which examines whether and how much species differ from their inferred optima, shows that convergence is incomplete: that is, phenotypes of most species are still somewhat distant from the estimated optimum for each microhabitat, seemingly because of insufficient time for more complete adaptation (an effect of history). Yet, these effects of history are related to past ecologies, and not clade membership. 
Overall, our study elucidates the dominant drivers of morphological evolution across a major vertebrate clade and shows that evolution can be repeatable at much greater temporal and spatial scales than commonly thought. It also provides an analytical framework for testing other potential examples of large-scale convergence. © The Author(s) 2015. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

Venomous snakes of Costa Rica: biological and medical implications of their venom proteomic profiles analyzed through the strategy of snake venomics.

PubMed

Lomonte, Bruno; Fernández, Julián; Sanz, Libia; Angulo, Yamileth; Sasa, Mahmood; Gutiérrez, José María; Calvete, Juan J

2014-06-13

In spite of its small territory of ~50,000 km², Costa Rica harbors a remarkably rich biodiversity. Its herpetofauna includes 138 species of snakes, of which sixteen pit vipers (family Viperidae, subfamily Crotalinae), five coral snakes (family Elapidae, subfamily Elapinae), and one sea snake (family Elapidae, subfamily Hydrophiinae) pose potential hazards to human and animal health. In recent years, knowledge on the composition of snake venoms has expanded dramatically thanks to the development of increasingly fast and sensitive analytical techniques in mass spectrometry and separation science applied to protein characterization. Among several analytical strategies to determine the overall protein/peptide composition of snake venoms, the methodology known as 'snake venomics' has proven particularly well suited and informative, by providing not only a catalog of protein types/families present in a venom, but also a semi-quantitative estimation of their relative abundances. Through a collaborative research initiative between Instituto de Biomedicina de Valencia (IBV) and Instituto Clodomiro Picado (ICP), this strategy has been applied to the study of venoms of Costa Rican snakes, aiming to obtain a deeper knowledge on their composition, geographic and ontogenic variations, relationships to taxonomy, correlation with toxic activities, and discovery of novel components. The proteomic profiles of venoms from sixteen out of the 22 species within the Viperidae and Elapidae families found in Costa Rica have been reported so far, and an integrative view of these studies is hereby presented. In line with other venomic projects by research groups focusing on a wide variety of snakes around the world, these studies contribute to a deeper understanding of the biochemical basis for the diverse toxic profiles evolved by venomous snakes. In addition, these studies provide opportunities to identify novel molecules of potential pharmacological interest. 
Furthermore, the establishment of venom proteomic profiles offers a fundamental platform to assess the detailed immunorecognition of individual proteins/peptides by therapeutic or experimental antivenoms, an evolving methodology for which the term 'antivenomics' was coined (as described in an accompanying paper in this special issue). Venoms represent an adaptive trait and an example of both divergent and convergent evolution. A deep understanding of the composition of venoms and of the principles governing the evolution of venomous systems is of applied importance for exploring the enormous potential of venoms as sources of chemical and pharmacological novelty but also to fight the consequences of snakebite envenomings. Key to this is the identification of evolutionary and ecological trends at different taxonomical levels. However, the evolution of venomous species and that of their venoms do not always follow the same course, and the identification of structural and functional convergences and divergences among venoms is often unpredictable by a phylogenetic hypothesis. Snake venomics is a proteomic-centered strategy to deconstruct the complex molecular phenotypes of the venom proteomes. The proteomic profiles of venoms from sixteen out of the 22 venomous species within the Viperidae and Elapidae families found in Costa Rica have been completed so far. An integrative view of their venom composition, including the identification of geographic and ontogenic variations, is hereby presented. Venom proteomic profiles offer a fundamental platform to assess the detailed immunorecognition of individual venom components by therapeutic or experimental antivenoms. This aspect is reviewed in the companion paper. This article is part of a Special Issue entitled: Proteomics of non-model organisms. Copyright © 2014 Elsevier B.V. All rights reserved.
In-Culture Cross-Linking of Bacterial Cells Reveals Large-Scale Dynamic Protein-Protein Interactions at the Peptide Level.

PubMed

de Jong, Luitzen; de Koning, Edward A; Roseboom, Winfried; Buncherd, Hansuk; Wanner, Martin J; Dapic, Irena; Jansen, Petra J; van Maarseveen, Jan H; Corthals, Garry L; Lewis, Peter J; Hamoen, Leendert W; de Koster, Chris G

2017-07-07

Identification of dynamic protein-protein interactions at the peptide level on a proteomic scale is a challenging approach that is still in its infancy. We have developed a system to cross-link cells directly in culture with the special lysine cross-linker bis(succinimidyl)-3-azidomethyl-glutarate (BAMG). We used the Gram-positive model bacterium Bacillus subtilis as an exemplar system. Within 5 min extensive intracellular cross-linking was detected, while intracellular cross-linking in a Gram-negative species, Escherichia coli, was still undetectable after 30 min, in agreement with the low permeability in this organism for lipophilic compounds like BAMG. We were able to identify 82 unique interprotein cross-linked peptides with <1% false discovery rate by mass spectrometry and genome-wide database searching. Nearly 60% of the interprotein cross-links occur in assemblies involved in transcription and translation. Several of these interactions are new, and we identified a binding site between the δ and β' subunit of RNA polymerase close to the downstream DNA channel, providing a clue into how δ might regulate promoter selectivity and promote RNA polymerase recycling. Our methodology opens new avenues to investigate the functional dynamic organization of complex protein assemblies involved in bacterial growth. Data are available via ProteomeXchange with identifier PXD006287.
The Landscape Evolution Observatory: a large-scale controllable infrastructure to study coupled Earth-surface processes

USGS Publications Warehouse

Pangle, Luke A.; DeLong, Stephen B.; Abramson, Nate; Adams, John; Barron-Gafford, Greg A.; Breshears, David D.; Brooks, Paul D.; Chorover, Jon; Dietrich, William E.; Dontsova, Katerina; Durcik, Matej; Espeleta, Javier; Ferré, T.P.A.; Ferriere, Regis; Henderson, Whitney; Hunt, Edward A.; Huxman, Travis E.; Millar, David; Murphy, Brendan; Niu, Guo-Yue; Pavao-Zuckerman, Mitch; Pelletier, Jon D.; Rasmussen, Craig; Ruiz, Joaquin; Saleska, Scott; Schaap, Marcel; Sibayan, Michael; Troch, Peter A.; Tuller, Markus; van Haren, Joost; Zeng, Xubin

2015-01-01

Zero-order drainage basins, and their constituent hillslopes, are the fundamental geomorphic unit comprising much of Earth's uplands. The convergent topography of these landscapes generates spatially variable substrate and moisture content, facilitating biological diversity and influencing how the landscape filters precipitation and sequesters atmospheric carbon dioxide. In light of these significant ecosystem services, refining our understanding of how these functions are affected by landscape evolution, weather variability, and long-term climate change is imperative. In this paper we introduce the Landscape Evolution Observatory (LEO): a large-scale controllable infrastructure consisting of three replicated artificial landscapes (each 330 m² surface area) within the climate-controlled Biosphere 2 facility in Arizona, USA. At LEO, experimental manipulation of rainfall, air temperature, relative humidity, and wind speed is possible at unprecedented scale. The Landscape Evolution Observatory was designed as a community resource to advance understanding of how topography, physical and chemical properties of soil, and biological communities coevolve, and how this coevolution affects water, carbon, and energy cycles at multiple spatial scales. With well-defined boundary conditions and an extensive network of sensors and samplers, LEO enables an iterative scientific approach that includes numerical model development and virtual experimentation, physical experimentation, data analysis, and model refinement. We plan to engage the broader scientific community through public dissemination of data from LEO, collaborative experimental design, and community-based model development.
Proteomic analysis in type 2 diabetes patients before and after a very low calorie diet reveals potential disease state and intervention specific biomarkers.

PubMed

Sleddering, Maria A; Markvoort, Albert J; Dharuri, Harish K; Jeyakar, Skhandhan; Snel, Marieke; Juhasz, Peter; Lynch, Moira; Hines, Wade; Li, Xiaohong; Jazet, Ingrid M; Adourian, Aram; Hilbers, Peter A J; Smit, Johannes W A; Van Dijk, Ko Willems

2014-01-01

Very low calorie diets (VLCD) with and without exercise programs lead to major metabolic improvements in obese type 2 diabetes patients. The mechanisms underlying these improvements have so far not been elucidated fully. To further investigate the mechanisms of a VLCD with or without exercise and to uncover possible biomarkers associated with these interventions, blood samples were collected from 27 obese type 2 diabetes patients before and after a 16-week VLCD (Modifast ∼450 kcal/day). Thirteen of these patients followed an exercise program in addition to the VLCD. Plasma was obtained from 27 lean and 27 obese controls as well. Proteomic analysis was performed using mass spectrometry (MS) and targeted multiple reaction monitoring (MRM) and a large scale isobaric tags for relative and absolute quantitation (iTRAQ) approach. After the 16-week VLCD, there was a significant decrease in body weight and HbA1c in all patients, without differences between the two intervention groups. Targeted MRM analysis revealed differences in several proteins, which could be divided into diabetes-associated (fibrinogen, transthyretin), obesity-associated (complement C3), and diet-associated markers (apolipoproteins, especially apolipoprotein A-IV). To further investigate the effects of exercise, large scale iTRAQ analysis was performed. However, no proteins were found showing an exercise effect. Thus, in this study, specific proteins were found to be differentially expressed in type 2 diabetes patients versus controls and before and after a VLCD. These proteins are potential disease state and intervention specific biomarkers. Controlled-Trials.com ISRCTN76920690.
Proteomic Analysis in Type 2 Diabetes Patients before and after a Very Low Calorie Diet Reveals Potential Disease State and Intervention Specific Biomarkers

PubMed Central

Dharuri, Harish K.; Jeyakar, Skhandhan; Snel, Marieke; Juhasz, Peter; Lynch, Moira; Hines, Wade; Li, Xiaohong; Jazet, Ingrid M.; Adourian, Aram; Hilbers, Peter A. J.; Smit, Johannes W. A.; Van Dijk, Ko Willems

2014-01-01

Very low calorie diets (VLCD) with and without exercise programs lead to major metabolic improvements in obese type 2 diabetes patients. The mechanisms underlying these improvements have so far not been elucidated fully. To further investigate the mechanisms of a VLCD with or without exercise and to uncover possible biomarkers associated with these interventions, blood samples were collected from 27 obese type 2 diabetes patients before and after a 16-week VLCD (Modifast ∼450 kcal/day). Thirteen of these patients followed an exercise program in addition to the VLCD. Plasma was obtained from 27 lean and 27 obese controls as well. Proteomic analysis was performed using mass spectrometry (MS) and targeted multiple reaction monitoring (MRM) and a large scale isobaric tags for relative and absolute quantitation (iTRAQ) approach. After the 16-week VLCD, there was a significant decrease in body weight and HbA1c in all patients, without differences between the two intervention groups. Targeted MRM analysis revealed differences in several proteins, which could be divided into diabetes-associated (fibrinogen, transthyretin), obesity-associated (complement C3), and diet-associated markers (apolipoproteins, especially apolipoprotein A-IV). To further investigate the effects of exercise, large scale iTRAQ analysis was performed. However, no proteins were found showing an exercise effect. Thus, in this study, specific proteins were found to be differentially expressed in type 2 diabetes patients versus controls and before and after a VLCD. These proteins are potential disease state and intervention specific biomarkers. Trial Registration Controlled-Trials.com ISRCTN76920690 PMID:25415563
New Genes and Functional Innovation in Mammals

PubMed Central

Luis Villanueva-Cañas, José; Ruiz-Orera, Jorge; Agea, M. Isabel; Gallo, Maria; Andreu, David

2017-01-01

Abstract The birth of genes that encode new protein sequences is a major source of evolutionary innovation. However, we still understand relatively little about how these genes come into being and which functions they are selected for. To address these questions, we have obtained a large collection of mammalian-specific gene families that lack homologues in other eukaryotic groups. We have combined gene annotations and de novo transcript assemblies from 30 different mammalian species, obtaining ∼6,000 gene families. In general, the proteins in mammalian-specific gene families tend to be short and depleted in aromatic and negatively charged residues. Proteins which arose early in mammalian evolution include milk and skin polypeptides, immune response components, and proteins involved in reproduction. In contrast, the functions of proteins which have a more recent origin remain largely unknown, despite the fact that these proteins also have extensive proteomics support. We identify several previously described cases of genes originated de novo from noncoding genomic regions, supporting the idea that this mechanism frequently underlies the evolution of new protein-coding genes in mammals. Finally, we show that most young mammalian genes are preferentially expressed in testis, suggesting that sexual selection plays an important role in the emergence of new functional genes. PMID:28854603
Mapping HLA-A2, -A3 and -B7 supertype-restricted T-cell epitopes in the ebolavirus proteome.

PubMed

Lim, Wan Ching; Khan, Asif M

2018-01-19

Ebolavirus (EBOV) is responsible for one of the most fatal diseases encountered by mankind. Cellular T-cell responses have been implicated to be important in providing protection against the virus. Antigenic variation can result in viral escape from immune recognition. Mapping targets of immune responses among the sequence of viral proteins is, thus, an important first step towards understanding the immune responses to viral variants and can aid in the identification of vaccine targets. Herein, we performed a large-scale, proteome-wide mapping and diversity analyses of putative HLA supertype-restricted T-cell epitopes of Zaire ebolavirus (ZEBOV), the most pathogenic species among the EBOV family. All publicly available ZEBOV sequences (14,098) for each of the nine viral proteins were retrieved, removed of irrelevant and duplicate sequences, and aligned. The overall proteome diversity of the non-redundant sequences was studied by use of Shannon's entropy. The sequences were predicted, by use of the NetCTLpan server, for HLA-A2, -A3, and -B7 supertype-restricted epitopes, which are relevant to African and other ethnicities and provide for large (~86%) population coverage. The predicted epitopes were mapped to the alignment of each protein for analyses of antigenic sequence diversity and relevance to structure and function. The putative epitopes were validated by comparison with experimentally confirmed epitopes. ZEBOV proteome was generally conserved, with an average entropy of 0.16. The 185 HLA supertype-restricted T-cell epitopes predicted (82 (A2), 37 (A3) and 66 (B7)) mapped to 125 alignment positions and covered ~24% of the proteome length. Many of the epitopes showed a propensity to co-localize at select positions of the alignment. Thirty (30) of the mapped positions were completely conserved and may be attractive for vaccine design. The remaining (95) positions had one or more epitopes, with or without non-epitope variants. 
A significant number (24) of the putative epitopes matched reported experimentally validated HLA ligands/T-cell epitopes of A2, A3 and/or B7 supertype representative allele restrictions. The epitopes generally corresponded to functional motifs/domains and there was no correlation to localization on the protein 3D structure. These data and the epitope map provide important insights into the interaction between EBOV and the host immune system.
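The diversity measure named in the abstract above, Shannon's entropy over aligned sequences, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the per-column formulation, the gap handling, the function names `column_entropy` and `alignment_entropy`, and the toy alignment are all assumptions made for illustration. A position with zero entropy corresponds to what the abstract calls completely conserved.

```python
import math
from collections import Counter


def column_entropy(column, gap="-"):
    """Shannon entropy (in bits) of one alignment column, ignoring gaps."""
    residues = [c for c in column if c != gap]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # H = sum over residues of p * log2(1/p), with p = count / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def alignment_entropy(sequences):
    """Per-position entropy for a list of equal-length aligned sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [column_entropy(col) for col in zip(*sequences)]


# Hypothetical four-sequence alignment, not real ebolavirus data.
toy_alignment = ["MKTA", "MKSA", "MRTA", "MKTA"]
entropies = alignment_entropy(toy_alignment)

# Positions with zero entropy are fully conserved in this toy alignment.
conserved = [i for i, h in enumerate(entropies) if h == 0.0]
print(entropies, conserved)
```

Averaging the per-position values gives a single summary number for the alignment; whether that matches how the reported average entropy of 0.16 was derived is not stated in the abstract.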
Proteomics data repositories

PubMed Central

Riffle, Michael; Eng, Jimmy K.

2010-01-01

The field of proteomics, particularly the application of mass spectrometry analysis to protein samples, is well-established and growing rapidly. Proteomics studies generate large volumes of raw experimental data and inferred biological results. To facilitate the dissemination of these data, centralized data repositories have been developed that make the data and results accessible to proteomics researchers and biologists alike. This review of proteomics data repositories focuses exclusively on freely-available, centralized data resources that disseminate or store experimental mass spectrometry data and results. The resources chosen reflect a current “snapshot” of the state of resources available with an emphasis placed on resources that may be of particular interest to yeast researchers. Resources are described in terms of their intended purpose and the features and functionality provided to users. PMID:19795424
Learning, climate and the evolution of cultural capacity.

PubMed

Whitehead, Hal

2007-03-21

Patterns of environmental variation influence the utility, and thus evolution, of different learning strategies. I use stochastic, individual-based evolutionary models to assess the relative advantages of 15 different learning strategies (genetic determination, individual learning, vertical social learning, horizontal/oblique social learning, and contingent combinations of these) when competing in variable environments described by 1/f noise. When environmental variation has little effect on fitness, then genetic determinism persists. When environmental variation is large and equal over all time-scales ("white noise") then individual learning is adaptive. Social learning is advantageous in "red noise" environments when variation over long time-scales is large. Climatic variability increases with time-scale, so that short-lived organisms should be able to rely largely on genetic determination. Thermal climates usually are insufficiently red for social learning to be advantageous for species whose fitness is very determined by temperature. In contrast, population trajectories of many species, especially large mammals and aquatic carnivores, are sufficiently red to promote social learning in their predators. The ocean environment is generally redder than that on land. Thus, while individual learning should be adaptive for many longer-lived organisms, social learning will often be found in those dependent on the populations of other species, especially if they are marine. This provides a potential explanation for the evolution of a prevalence of social learning, and culture, in humans and cetaceans.
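The distinction the abstract above draws between "white" and "red" environments can be made concrete with a small Python sketch. It is only an illustration under stated assumptions, not the individual-based model used in the study: the synthesis method (summing cosines with random phases and amplitudes proportional to f^(-beta/2)), the function names, and the choice of lag-1 autocorrelation as a summary are all hypothetical. With beta = 0 the spectrum is flat (white); larger beta concentrates variation at long time-scales (redder).

```python
import math
import random


def power_law_noise(n, beta, seed=0):
    """Return n samples whose power spectrum falls off roughly as 1/f**beta.

    beta = 0 gives a flat ("white") spectrum; larger beta gives "redder"
    noise in which slow fluctuations dominate.
    """
    rng = random.Random(seed)
    freqs = range(1, n // 2 + 1)
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in freqs]
    series = []
    for t in range(n):
        # Amplitude k**(-beta/2) means power k**(-beta) at frequency k.
        value = sum(
            k ** (-beta / 2.0) * math.cos(2.0 * math.pi * k * t / n + phase)
            for k, phase in zip(freqs, phases)
        )
        series.append(value)
    return series


def lag1_autocorrelation(series):
    """Correlation between each value and the next one in the series."""
    mean = sum(series) / len(series)
    centred = [x - mean for x in series]
    num = sum(a * b for a, b in zip(centred, centred[1:]))
    den = sum(a * a for a in centred)
    return num / den


white = lag1_autocorrelation(power_law_noise(256, beta=0.0))
red = lag1_autocorrelation(power_law_noise(256, beta=2.0))
print(f"white: {white:.3f}  red: {red:.3f}")
```

In the red series consecutive values are strongly correlated, so the recent past remains informative about the near future; in the white series it is not. That is the property the abstract ties to the relative value of different learning strategies.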
Large-scale transportation network congestion evolution prediction using deep learning theory.

PubMed

Ma, Xiaolei; Yu, Haiyang; Wang, Yunpeng; Wang, Yinhai

2015-01-01

Understanding how congestion at one location can cause ripples throughout a large-scale transportation network is vital for transportation researchers and practitioners to pinpoint traffic bottlenecks for congestion mitigation. Traditional studies rely on either mathematical equations or simulation techniques to model traffic congestion dynamics. However, most of the approaches have limitations, largely due to unrealistic assumptions and a cumbersome parameter calibration process. With the development of Intelligent Transportation Systems (ITS) and Internet of Things (IoT), transportation data become more and more ubiquitous. This triggers a series of data-driven research to investigate transportation phenomena. Among them, deep learning theory is considered one of the most promising techniques to tackle tremendous high-dimensional data. This study attempts to extend deep learning theory into large-scale transportation network analysis. A deep Restricted Boltzmann Machine and Recurrent Neural Network architecture is utilized to model and predict traffic congestion evolution based on Global Positioning System (GPS) data from taxis. A numerical study in Ningbo, China is conducted to validate the effectiveness and efficiency of the proposed method. Results show that the prediction accuracy can reach as high as 88% within less than 6 minutes when the model is implemented in a Graphic Processing Unit (GPU)-based parallel computing environment. The predicted congestion evolution patterns can be visualized temporally and spatially through a map-based platform to identify the vulnerable links for proactive congestion mitigation.
Large-Scale Transportation Network Congestion Evolution Prediction Using Deep Learning Theory

PubMed Central

Ma, Xiaolei; Yu, Haiyang; Wang, Yunpeng; Wang, Yinhai

2015-01-01

Understanding how congestion at one location can cause ripples throughout large-scale transportation network is vital for transportation researchers and practitioners to pinpoint traffic bottlenecks for congestion mitigation. Traditional studies rely on either mathematical equations or simulation techniques to model traffic congestion dynamics. However, most of the approaches have limitations, largely due to unrealistic assumptions and cumbersome parameter calibration process. With the development of Intelligent Transportation Systems (ITS) and Internet of Things (IoT), transportation data become more and more ubiquitous. This triggers a series of data-driven research to investigate transportation phenomena. Among them, deep learning theory is considered one of the most promising techniques to tackle tremendous high-dimensional data. This study attempts to extend deep learning theory into large-scale transportation network analysis. A deep Restricted Boltzmann Machine and Recurrent Neural Network architecture is utilized to model and predict traffic congestion evolution based on Global Positioning System (GPS) data from taxi. A numerical study in Ningbo, China is conducted to validate the effectiveness and efficiency of the proposed method. Results show that the prediction accuracy can achieve as high as 88% within less than 6 minutes when the model is implemented in a Graphic Processing Unit (GPU)-based parallel computing environment. The predicted congestion evolution patterns can be visualized temporally and spatially through a map-based platform to identify the vulnerable links for proactive congestion mitigation. PMID:25780910
Structure and evolution of the large scale solar and heliospheric magnetic fields. Ph.D. Thesis

NASA Technical Reports Server (NTRS)

Hoeksema, J. T.

1984-01-01

Structure and evolution of large scale photospheric and coronal magnetic fields in the interval 1976-1983 were studied using observations from the Stanford Solar Observatory and a potential field model. The solar wind in the heliosphere is organized into large regions in which the magnetic field has a component either toward or away from the sun. The model predicts the location of the current sheet separating these regions. Near solar minimum, in 1976, the current sheet lay within a few degrees of the solar equator having two extensions north and south of the equator. Soon after minimum the latitudinal extent began to increase. The sheet reached to at least 50 deg from 1978 through 1983. The complex structure near maximum occasionally included multiple current sheets. Large scale structures persist for up to two years during the entire interval. To minimize errors in determining the structure of the heliospheric field particular attention was paid to decreasing the distorting effects of rapid field evolution, finding the optimum source surface radius, determining the correction to the sun's polar field, and handling missing data. The predicted structure agrees with direct interplanetary field measurements taken near the ecliptic and with coronameter and interplanetary scintillation measurements which infer the three dimensional interplanetary magnetic structure. During most of the solar cycle the heliospheric field cannot be adequately described as a dipole.
Variation in Orthologous Shell-Forming Proteins Contribute to Molluscan Shell Diversity.

PubMed

Jackson, Daniel J; Reim, Laurin; Randow, Clemens; Cerveau, Nicolas; Degnan, Bernard M; Fleck, Claudia

2017-11-01

Despite the evolutionary success and ancient heritage of the molluscan shell, little is known about the molecular details of its formation, evolutionary origins, or the interactions between the material properties of the shell and its organic constituents. In contrast to this dearth of information, a growing collection of molluscan shell-forming proteomes and transcriptomes suggest they are comprised of both deeply conserved, and lineage specific elements. Analyses of these sequence data sets have suggested that mechanisms such as exon shuffling, gene co-option, and gene family expansion facilitated the rapid evolution of shell-forming proteomes and supported the diversification of this phylum specific structure. In order to further investigate and test these ideas we have examined the molecular features and spatial expression patterns of two shell-forming genes (Lustrin and ML1A2) and coupled these observations with materials properties measurements of shells from a group of closely related gastropods (abalone). We find that the prominent "GS" domain of Lustrin, a domain believed to confer elastomeric properties to the shell, varies significantly in length between the species we investigated. Furthermore, the spatial expression patterns of Lustrin and ML1A2 also vary significantly between species, suggesting that both protein architecture, and the regulation of spatial gene expression patterns, are important drivers of molluscan shell evolution. Variation in these molecular features might relate to certain materials properties of the shells of these species. These insights reveal an important and underappreciated source of variation within shell-forming proteomes that must contribute to the diversity of molluscan shell phenotypes. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Dynamical evolution of domain walls in an expanding universe

NASA Technical Reports Server (NTRS)

Press, William H.; Ryden, Barbara S.; Spergel, David N.

1989-01-01

Whenever the potential of a scalar field has two or more separated, degenerate minima, domain walls form as the universe cools. The evolution of the resulting network of domain walls is calculated for the case of two potential minima in two and three dimensions, including wall annihilation, crossing, and reconnection effects. The nature of the evolution is found to be largely independent of the rate at which the universe expands. Wall annihilation and reconnection occur almost as fast as causality allows, so that the horizon volume is 'swept clean' and contains, at any time, only about one, fairly smooth, wall. Quantitative statistics are given. The total area of wall per volume decreases as the first power of time. The relative slowness of the decrease and the smoothness of the wall on the horizon scale make it impossible for walls to both generate large-scale structure and be consistent with quadrupole microwave background anisotropy limits.
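The quoted scaling can be recovered from the statement that each horizon volume holds roughly one smooth wall. A minimal sketch, assuming the horizon size grows in proportion to the time t (units with c = 1) and ignoring numerical prefactors:

```latex
% One wall of area ~ t^2 inside a horizon volume ~ t^3 (assumed horizon size ~ t):
\frac{A}{V} \sim \frac{t^{2}}{t^{3}} = t^{-1}
```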
Large scale, spatially-explicit test of the refuge strategy for delaying insecticide resistance

USDA-ARS's Scientific Manuscript database

The refuge strategy used worldwide to delay the evolution of arthropod resistance to pesticides consists of leaving areas where a pesticide is not used near fields where the pesticide is used. Yet, empirical approaches are lacking to characterize effects of putative refuges on resistance evolution. ...
Single-cell proteomics: potential implications for cancer diagnostics.

PubMed

Gavasso, Sonia; Gullaksen, Stein-Erik; Skavland, Jørn; Gjertsen, Bjørn T

2016-01-01

Single-cell proteomics in cancer is evolving and promises to provide more accurate diagnoses based on detailed molecular features of cells within tumors. This review focuses on technologies that allow for collection of complex data from single cells, but also highlights methods that are adaptable to routine cancer diagnostics. Current diagnostics rely on histopathological analysis, complemented by mutational detection and clinical imaging. Though crucial, the information gained is often not directly transferable to defined therapeutic strategies, and predicting therapy response in a patient is difficult. In cancer, cellular states revealed through perturbed intracellular signaling pathways can identify functional mutations recurrent in cancer subsets. Single-cell proteomics remains to be validated in clinical trials where serial samples before and during treatment can reveal excessive clonal evolution and therapy failure; its use in clinical trials is anticipated to ignite a diagnostic revolution that will better align diagnostics with the current biological understanding of cancer.
Proteomics Improves the New Understanding of Honeybee Biology.

PubMed

Hora, Zewdu Ararso; Altaye, Solomon Zewdu; Wubie, Abebe Jemberie; Li, Jianke

2018-04-11

The honeybee is one of the most valuable insect pollinators, playing a key role in pollinating wild vegetation and agricultural crops, with significant contribution to the world's food production. Although honeybees have long been studied as a model for social evolution, honeybee biology at the molecular level remained poorly understood until the year 2006. With the availability of the honeybee genome sequence and technological advancements in protein separation, mass spectrometry, and bioinformatics, aspects of honeybee biology such as developmental biology, physiology, behavior, neurobiology, and immunology have been explored to new depths at molecular and biochemical levels. This Review comprehensively summarizes the recent progress in honeybee biology using proteomics to study developmental physiology, task transition, and physiological changes in some of the organs, tissues, and cells based on achievements from the authors' laboratory in this field. The research advances of honeybee proteomics provide new insights for understanding of honeybee biology and future research directions.
Fundamental tests of galaxy formation theory

NASA Technical Reports Server (NTRS)

Silk, J.

1982-01-01

The structure of the universe as an environment where traces exist of the seed fluctuations from which galaxies formed is studied. The evolution of the density fluctuation modes that led to the eventual formation of matter inhomogeneities is reviewed. How the resulting clumps developed into galaxies and galaxy clusters, acquiring characteristic masses, velocity dispersions, and metallicities, is discussed. Tests are described that utilize the large scale structure of the universe, including the dynamics of the local supercluster, the large scale matter distribution, and the anisotropy of the cosmic background radiation, to probe the earliest accessible stages of evolution. Finally, the role of particle physics is described with regard to its observable implications for galaxy formation.
When is bigger better? The effects of group size on the evolution of helping behaviours.

PubMed

Powers, Simon T; Lehmann, Laurent

2017-05-01

Understanding the evolution of sociality in humans and other species requires understanding how selection on social behaviour varies with group size. However, the effects of group size are frequently obscured in the theoretical literature, which often makes assumptions that are at odds with empirical findings. In particular, mechanisms are suggested as supporting large-scale cooperation when they would in fact rapidly become ineffective with increasing group size. Here we review the literature on the evolution of helping behaviours (cooperation and altruism), and frame it using a simple synthetic model that allows us to delineate how the three main components of the selection pressure on helping must vary with increasing group size. The first component is the marginal benefit of helping to group members, which determines both direct fitness benefits to the actor and indirect fitness benefits to recipients. While this is often assumed to be independent of group size, marginal benefits are in practice likely to be maximal at intermediate group sizes for many types of collective action problems, and will eventually become very small in large groups due to the law of decreasing marginal returns. The second component is the response of social partners on the past play of an actor, which underlies conditional behaviour under repeated social interactions. We argue that under realistic conditions on the transmission of information in a population, this response on past play decreases rapidly with increasing group size so that reciprocity alone (whether direct, indirect, or generalised) cannot sustain cooperation in very large groups. The final component is the relatedness between actor and recipient, which, according to the rules of inheritance, again decreases rapidly with increasing group size. 
These results explain why helping behaviours in very large social groups are limited to cases where the number of reproducing individuals is small, as in social insects, or where there are social institutions that can promote (possibly through sanctioning) large-scale cooperation, as in human societies. Finally, we discuss how individually devised institutions can foster the transition from small-scale to large-scale cooperative groups in human evolution. © 2016 Cambridge Philosophical Society.
Modulation of Small-scale Turbulence Structure by Large-scale Motions in the Absence of Direct Energy Transfer.

NASA Astrophysics Data System (ADS)

Brasseur, James G.; Juneja, Anurag

1996-11-01

Previous DNS studies indicate that small-scale structure can be directly altered through "distant" dynamical interactions by energetic forcing of the large scales. To remove the possibility of stimulating energy transfer between the large- and small-scale motions in these long-range interactions, we here perturb the large-scale structure without altering its energy content by suddenly altering only the phases of large-scale Fourier modes. Scale-dependent changes in turbulence structure appear as a nonzero difference field between two simulations from identical initial conditions of isotropic decaying turbulence, one perturbed and one unperturbed. We find that the large-scale phase perturbations leave the evolution of the energy spectrum virtually unchanged relative to the unperturbed turbulence. The difference field, on the other hand, is strongly affected by the perturbation. Most importantly, the time scale τ characterizing the change in turbulence structure at spatial scale r shortly after initiating a change in large-scale structure decreases with decreasing turbulence scale r. Thus, structural information is transferred directly from the large- to the smallest-scale motions in the absence of direct energy transfer, a long-range effect which cannot be explained by a linear mechanism such as rapid distortion theory. Supported by ARO grant DAAL03-92-G-0117.
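The perturbation described in this abstract, changing only the phases of large-scale Fourier modes while leaving their energy untouched, can be illustrated on a toy one-dimensional periodic signal. The sketch below is not the authors' DNS code; the function names, the cutoff `k_cut`, and the random test field are all assumptions made for illustration.

```python
import cmath
import math
import random


def dft(signal):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
        for k in range(n)
    ]


def inverse_dft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * j / n) for k in range(n)) / n
        for j in range(n)
    ]


def perturb_large_scale_phases(signal, k_cut, rng):
    """Rotate the phases of modes 1..k_cut (requires k_cut < n/2); keep amplitudes."""
    n = len(signal)
    spectrum = dft(signal)
    for k in range(1, k_cut + 1):
        rotation = cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
        spectrum[k] *= rotation
        # The mirror mode gets the conjugate rotation so the field stays real.
        spectrum[n - k] *= rotation.conjugate()
    return [value.real for value in inverse_dft(spectrum)]


# Hypothetical test field: 32 samples of Gaussian noise.
rng = random.Random(0)
field = [rng.gauss(0.0, 1.0) for _ in range(32)]
perturbed = perturb_large_scale_phases(field, k_cut=3, rng=rng)

# Per-mode energy is unchanged, although the field itself differs.
energy_before = [abs(c) ** 2 for c in dft(field)]
energy_after = [abs(c) ** 2 for c in dft(perturbed)]
```

Comparing `energy_before` with `energy_after` shows the same per-mode energy, while `field` and `perturbed` differ point by point. That difference is the toy analogue of the difference field between the perturbed and unperturbed simulations.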

The co-evolution of social institutions, demography, and large-scale human cooperation.

PubMed

Powers, Simon T; Lehmann, Laurent

2013-11-01

Human cooperation is typically coordinated by institutions, which determine the outcome structure of the social interactions individuals engage in. Explaining the Neolithic transition from small- to large-scale societies involves understanding how these institutions co-evolve with demography. We study this using a demographically explicit model of institution formation in a patch-structured population. Each patch supports both social and asocial niches. Social individuals create an institution, at a cost to themselves, by negotiating how much of the costly public good provided by cooperators is invested into sanctioning defectors. The remainder of their public good is invested in technology that increases carrying capacity, such as irrigation systems. We show that social individuals can invade a population of asocials, and form institutions that support high levels of cooperation. We then demonstrate conditions where the co-evolution of cooperation, institutions, and demographic carrying capacity creates a transition from small- to large-scale social groups. © 2013 John Wiley & Sons Ltd/CNRS.
Large-scale gas dynamical processes affecting the origin and evolution of gaseous galactic halos

NASA Technical Reports Server (NTRS)

Shapiro, Paul R.

1991-01-01

Observations of galactic halo gas are consistent with an interpretation in terms of the galactic fountain model in which supernova heated gas in the galactic disk escapes into the halo, radiatively cools and forms clouds which fall back to the disk. The results of a new study of several large-scale gas dynamical effects which are expected to occur in such a model for the origin and evolution of galactic halo gas will be summarized, including the following: (1) nonequilibrium absorption line and emission spectrum diagnostics for radiatively cooling halo gas in our own galaxy, as well as the implications of such absorption line diagnostics for the origin of quasar absorption lines in galactic halo clouds of high redshift galaxies; (2) numerical MHD simulations and analytical analysis of large-scale explosions and superbubbles in the galactic disk and halo; (3) numerical MHD simulations of halo cloud formation by thermal instability, with and without magnetic field; and (4) the effect of the galactic fountain on the galactic dynamo.
The Explorer of Diffuse Galactic Emission (EDGE): Determining the Large-Scale Structure Evolution in the Universe

NASA Technical Reports Server (NTRS)

Silverberg, R. F.; Cheng, E. S.; Cottingham, D. A.; Fixsen, D. J.; Meyer, S. S.; Knox, L.; Timbie, P.; Wilson, G.

2003-01-01

Measurements of the large-scale anisotropy of the Cosmic Infrared Background (CIB) can be used to determine the characteristics of the distribution of galaxies at the largest spatial scales. With this information, important tests of galaxy evolution models and primordial structure growth are possible. In this paper, we describe the scientific goals, instrumentation, and operation of EDGE, a mission using an Antarctic Long Duration Balloon (LDB) platform. EDGE will observe the anisotropy in the CIB in 8 spectral bands from 270 GHz to 1.5 THz with 6 arcminute angular resolution over a region of ~400 square degrees. EDGE uses a one-meter class off-axis telescope and an array of Frequency Selective Bolometers (FSB) to provide the compact and efficient multi-color, high sensitivity radiometer required to achieve its scientific objectives.
The dynamics of magnetic flux rings

NASA Technical Reports Server (NTRS)

Deluca, E. E.; Fisher, G. H.; Patten, B. M.

1993-01-01

The evolution of magnetic fields in the presence of turbulent convection is examined using results of numerical simulations of closed magnetic flux tubes embedded in a steady 'ABC' flow field, which approximate some of the important characteristics of a turbulent convecting flow field. Three different evolutionary scenarios were found: expansion to a steady deformed ring; collapse to a compact fat flux ring, separated from the expansion type of behavior by a critical length scale; and, occasionally, evolution toward an advecting, oscillatory state. The work suggests that small-scale flows will not have a strong effect on large-scale, strong fields.
The membrane proteome of Medicago truncatula roots displays qualitative and quantitative changes in response to arbuscular mycorrhizal symbiosis.

PubMed

Abdallah, Cosette; Valot, Benoit; Guillier, Christelle; Mounier, Arnaud; Balliau, Thierry; Zivy, Michel; van Tuinen, Diederik; Renaut, Jenny; Wipf, Daniel; Dumas-Gaudot, Eliane; Recorbet, Ghislaine

2014-08-28

Arbuscular mycorrhizal (AM) symbiosis that associates roots of most land plants with soil-borne fungi (Glomeromycota), is characterized by reciprocal nutritional benefits. Fungal colonization of plant roots induces massive changes in cortical cells where the fungus differentiates an arbuscule, which drives proliferation of the plasma membrane. Despite the recognized importance of membrane proteins in sustaining AM symbiosis, the root microsomal proteome elicited upon mycorrhiza still remains to be explored. In this study, we first examined the qualitative composition of the root membrane proteome of Medicago truncatula after microsome enrichment and subsequent in depth analysis by GeLC-MS/MS. The results obtained highlighted the identification of 1226 root membrane protein candidates whose cellular and functional classifications predispose plastids and protein synthesis as prevalent organelle and function, respectively. Changes at the protein abundance level between the membrane proteomes of mycorrhizal and nonmycorrhizal roots were further monitored by spectral counting, which retrieved a total of 96 proteins that displayed a differential accumulation upon AM symbiosis. Besides the canonical markers of the periarbuscular membrane, new candidates supporting the importance of membrane trafficking events during mycorrhiza establishment/functioning were identified, including flotillin-like proteins. The data have been deposited to the ProteomeXchange with identifier PXD000875. During arbuscular mycorrhizal symbiosis, one of the most widespread mutualistic associations in nature, the endomembrane system of plant roots is believed to undergo qualitative and quantitative changes in order to sustain both the accommodation process of the AM fungus within cortical cells and the exchange of nutrients between symbionts. Large-scale GeLC-MS/MS proteomic analysis of the membrane fractions from mycorrhizal and nonmycorrhizal roots of M. truncatula coupled to spectral counting retrieved around one hundred proteins that displayed changes in abundance upon mycorrhizal establishment. The symbiosis-related membrane proteins that were identified mostly function in signaling/membrane trafficking and nutrient uptake regulation. Besides extending the coverage of the root membrane proteome of M. truncatula, new candidates involved in the symbiotic program emerged from the current study, which pointed out a dynamic reorganization of microsomal proteins during the accommodation of AM fungi within cortical cells. Copyright © 2014 Elsevier B.V. All rights reserved.
Spider Transcriptomes Identify Ancient Large-Scale Gene Duplication Event Potentially Important in Silk Gland Evolution.

PubMed

Clarke, Thomas H; Garb, Jessica E; Hayashi, Cheryl Y; Arensburger, Peter; Ayoub, Nadia A

2015-06-08

The evolution of specialized tissues with novel functions, such as the silk synthesizing glands in spiders, is likely an influential driver of adaptive success. Large-scale gene duplication events and subsequent paralog divergence are thought to be required for generating evolutionary novelty. Such an event has been proposed for spiders, but not tested. We de novo assembled transcriptomes from three cobweb weaving spider species. Based on phylogenetic analyses of gene families with representatives from each of the three species, we found numerous duplication events indicative of a whole genome or segmental duplication. We estimated the age of the gene duplications relative to several speciation events within spiders and arachnids and found that the duplications likely occurred after the divergence of scorpions (order Scorpionida) and spiders (order Araneae), but before the divergence of the spider suborders Mygalomorphae and Araneomorphae, near the evolutionary origin of spider silk glands. Transcripts that are expressed exclusively or primarily within black widow silk glands are more likely to have a paralog descended from the ancient duplication event and have elevated amino acid replacement rates compared with other transcripts. Thus, an ancient large-scale gene duplication event within the spider lineage was likely an important source of molecular novelty during the evolution of silk gland-specific expression. This duplication event may have provided genetic material for subsequent silk gland diversification in the true spiders (Araneomorphae). © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Yang, Laurence; Yurkovich, James T.; Lloyd, Colton J.

Integrating omics data to refine or make context-specific models is an active field of constraint-based modeling. Proteomics now cover over 95% of the Escherichia coli proteome by mass. Genome-scale models of Metabolism and macromolecular Expression (ME) compute proteome allocation linked to metabolism and fitness. Using proteomics data, we formulated allocation constraints for key proteome sectors in the ME model. The resulting calibrated model effectively computed the “generalist” (wild-type) E. coli proteome and phenotype across diverse growth environments. Across 15 growth conditions, prediction errors for growth rate and metabolic fluxes were 69% and 14% lower, respectively. The sector-constrained ME model thus represents a generalist ME model reflecting both growth rate maximization and “hedging” against uncertain environments and stresses, as indicated by significant enrichment of these sectors for the general stress response sigma factor σS. Finally, the sector constraints represent a general formalism for integrating omics data from any experimental condition into constraint-based ME models. The constraints can be fine-grained (individual proteins) or coarse-grained (functionally-related protein groups) as demonstrated here. Furthermore, this flexible formalism provides an accessible approach for narrowing the gap between the complexity captured by omics data and governing principles of proteome allocation described by systems-level models.
Primordial large-scale electromagnetic fields from gravitoelectromagnetic inflation

NASA Astrophysics Data System (ADS)

Membiela, Federico Agustín; Bellini, Mauricio

2009-04-01

We investigate the origin and evolution of primordial electric and magnetic fields in the early universe, when the expansion is governed by a cosmological constant Λ0. Using the gravitoelectromagnetic inflationary formalism with A0 = 0, we obtain the power spectra for large-scale magnetic fields and the inflaton field fluctuations during inflation. A very important fact is that our formalism is naturally non-conformally invariant.
Large scale shell model study of the evolution of mixed-symmetry states in chains of nuclei around 132Sn

NASA Astrophysics Data System (ADS)

Lo Iudice, N.; Bianco, D.; Andreozzi, F.; Porrino, A.; Knapp, F.

2012-10-01

Large scale shell model calculations based on a new diagonalization algorithm are performed in order to investigate the mixed symmetry states in chains of nuclei in the proximity of N=82. The resulting spectra and transitions are in agreement with the experiments and consistent with the scheme provided by the interacting boson model.
Large-scale interaction profiling of PDZ domains through proteomic peptide-phage display using human and viral phage peptidomes.

PubMed

Ivarsson, Ylva; Arnold, Roland; McLaughlin, Megan; Nim, Satra; Joshi, Rakesh; Ray, Debashish; Liu, Bernard; Teyra, Joan; Pawson, Tony; Moffat, Jason; Li, Shawn Shun-Cheng; Sidhu, Sachdev S; Kim, Philip M

2014-02-18

The human proteome contains a plethora of short linear motifs (SLiMs) that serve as binding interfaces for modular protein domains. Such interactions are crucial for signaling and other cellular processes, but are difficult to detect because of their low to moderate affinities. Here we developed a dedicated approach, proteomic peptide-phage display (ProP-PD), to identify domain-SLiM interactions. Specifically, we generated phage libraries containing all human and viral C-terminal peptides using custom oligonucleotide microarrays. With these libraries we screened the nine PSD-95/Dlg/ZO-1 (PDZ) domains of human Densin-180, Erbin, Scribble, and Disks large homolog 1 for peptide ligands. We identified several known and putative interactions potentially relevant to cellular signaling pathways and confirmed interactions between full-length Scribble and the target proteins β-PIX, plakophilin-4, and guanylate cyclase soluble subunit α-2 using colocalization and coimmunoprecipitation experiments. The affinities of recombinant Scribble PDZ domains and the synthetic peptides representing the C termini of these proteins were in the 1- to 40-μM range. Furthermore, we identified several well-established host-virus protein-protein interactions, and confirmed that PDZ domains of Scribble interact with the C terminus of Tax-1 of human T-cell leukemia virus with micromolar affinity. Previously unknown putative viral protein ligands for the PDZ domains of Scribble and Erbin were also identified. Thus, we demonstrate that our ProP-PD libraries are useful tools for probing PDZ domain interactions. The method can be extended to interrogate all potential eukaryotic, bacterial, and viral SLiMs and we suggest it will be a highly valuable approach for studying cellular and pathogen-host protein-protein interactions.
Serum quantitative proteomic analysis reveals potential zinc-associated biomarkers for nonbacterial prostatitis.

PubMed

Yang, Xiaoli; Li, Hongtao; Zhang, Chengdong; Lin, Zhidi; Zhang, Xinhua; Zhang, Youjie; Yu, Yanbao; Liu, Kun; Li, Muyan; Zhang, Yuening; Lv, Wenxin; Xie, Yuanliang; Lu, Zheng; Wu, Chunlei; Teng, Ruobing; Lu, Shaoming; He, Min; Mo, Zengnan

2015-10-01

Prostatitis is one of the most common urological problems afflicting adult men. The etiology and pathogenesis of nonbacterial prostatitis, which accounts for 90-95% of cases, is largely unknown. As serum proteins often indicate the overall pathologic status of patients, we hypothesized that protein biomarkers of prostatitis might be identified by comparing the serum proteomes of patients with and without nonbacterial prostatitis. All untreated samples were collected from subjects attending the Fangchenggang Area Male Health and Examination Survey (FAMHES). We profiled pooled serum samples from four carefully selected groups of patients (n = 10/group) representing the various categories of nonbacterial prostatitis (IIIa, IIIb, and IV) and matched healthy controls using a mass spectrometry-based 4-plex iTRAQ proteomic approach. More than 160 samples were validated by ELISA. Overall, 69 proteins were identified. Among them, 42, 52, and 37 proteins were identified with differential expression in Category IIIa, IIIb, and IV prostatitis, respectively. The 19 common proteins were related to immunity and defense, ion binding, transport, and proteolysis. Two zinc-binding proteins, superoxide dismutase 3 (SOD3), and carbonic anhydrase I (CA1), were significantly higher in all types of prostatitis than in the control. A receiver operating characteristic curve estimated sensitivities of 50.4 and 68.1% and specificities of 92.1 and 83.8% for CA1 and SOD3, respectively, in detecting nonbacterial prostatitis. The serum CA1 concentration was inversely correlated to the zinc concentration in expressed-prostatic secretions. Our findings suggest that SOD3 and CA1 are potential diagnostic markers of nonbacterial prostatitis, although further large-scale studies are required. The molecular profiles of nonbacterial prostatitis pathogenesis may lay a foundation for discovery of new therapies. © 2015 Wiley Periodicals, Inc.
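The sensitivity and specificity figures quoted above are the two quantities read off a receiver operating characteristic curve at a chosen cutoff. A minimal sketch of that calculation follows; the marker values, labels, and threshold are hypothetical and are not data from the study.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity), calling score >= threshold positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if not y and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical serum marker concentrations (arbitrary units).
scores = [2.1, 3.4, 1.2, 4.0, 0.9, 1.1, 2.5, 0.7]
# True = patient, False = healthy control (hypothetical labels).
labels = [True, True, True, True, False, False, False, False]

sens, spec = sensitivity_specificity(scores, labels, threshold=2.0)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Sweeping `threshold` over the observed scores and plotting sensitivity against 1 - specificity traces out the ROC curve. A marker with high specificity but modest sensitivity, as reported for CA1, corresponds to a cutoff that rarely flags controls but misses a share of patients.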
Expert system for computer-assisted annotation of MS/MS spectra.

PubMed

Neuhauser, Nadin; Michalski, Annette; Cox, Jürgen; Mann, Matthias

2012-11-01

An important step in mass spectrometry (MS)-based proteomics is the identification of peptides by their fragment spectra. Regardless of the identification score achieved, almost all tandem-MS (MS/MS) spectra contain remaining peaks that are not assigned by the search engine. These peaks may be explainable by human experts but the scale of modern proteomics experiments makes this impractical. In computer science, Expert Systems are a mature technology to implement a list of rules generated by interviews with practitioners. We here develop such an Expert System, making use of literature knowledge as well as a large body of high mass accuracy and pure fragmentation spectra. Interestingly, we find that even with high mass accuracy data, rule sets can quickly become too complex, leading to over-annotation. Therefore we establish a rigorous false discovery rate, calculated by random insertion of peaks from a large collection of other MS/MS spectra, and use it to develop an optimized knowledge base. This rule set correctly annotates almost all peaks of medium or high abundance. For high resolution HCD data, median intensity coverage of fragment peaks in MS/MS spectra increases from 58% by search engine annotation alone to 86%. The resulting annotation performance surpasses a human expert, especially on complex spectra such as those of larger phosphorylated peptides. Our system is also applicable to high resolution collision-induced dissociation data. It is available both as a part of MaxQuant and via a webserver that only requires an MS/MS spectrum and the corresponding peptides sequence, and which outputs publication quality, annotated MS/MS spectra (www.biochem.mpg.de/mann/tools/). It provides expert knowledge to beginners in the field of MS-based proteomics and helps advanced users to focus on unusual and possibly novel types of fragment ions.
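The false discovery rate idea in this abstract, inserting random peaks and checking how often the rule set annotates them, can be sketched as follows. This is an illustrative simplification and not the MaxQuant implementation: the rule set is reduced to a single mass-tolerance match, and every name, m/z value, and the scaling used to turn the chance-match rate into an FDR is an assumption.

```python
import random


def annotates(mz, theoretical_mzs, tol):
    """Stand-in for a rule set: a peak is annotated if it matches any ion m/z."""
    return any(abs(mz - t) <= tol for t in theoretical_mzs)


def estimate_annotation_fdr(observed, theoretical, decoy_pool, n_decoys, tol, rng):
    """Estimate the annotation FDR from randomly inserted decoy peaks."""
    # Decoy peaks are drawn from a pool standing in for unrelated spectra.
    decoys = rng.sample(decoy_pool, n_decoys)
    chance_rate = sum(annotates(mz, theoretical, tol) for mz in decoys) / n_decoys
    annotated = sum(annotates(mz, theoretical, tol) for mz in observed)
    if annotated == 0:
        return 0.0
    # Assumption: real peaks match by chance at the same rate as decoy peaks.
    expected_false = chance_rate * len(observed)
    return min(1.0, expected_false / annotated)


# Hypothetical m/z values for illustration only.
theoretical = [100.0, 200.0, 300.0]
observed = [100.001, 200.002, 250.0, 300.0]
decoy_pool = [100.0005, 150.0, 175.0, 225.0]

fdr = estimate_annotation_fdr(
    observed, theoretical, decoy_pool, n_decoys=4, tol=0.01, rng=random.Random(1)
)
```

A richer rule set explains more real peaks but also matches more random ones, which raises `chance_rate`. This is the over-annotation trade-off the abstract describes.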
Expert System for Computer-assisted Annotation of MS/MS Spectra*

PubMed Central

Neuhauser, Nadin; Michalski, Annette; Cox, Jürgen; Mann, Matthias

2012-01-01

An important step in mass spectrometry (MS)-based proteomics is the identification of peptides by their fragment spectra. Regardless of the identification score achieved, almost all tandem-MS (MS/MS) spectra contain remaining peaks that are not assigned by the search engine. These peaks may be explainable by human experts but the scale of modern proteomics experiments makes this impractical. In computer science, Expert Systems are a mature technology to implement a list of rules generated by interviews with practitioners. We here develop such an Expert System, making use of literature knowledge as well as a large body of high mass accuracy and pure fragmentation spectra. Interestingly, we find that even with high mass accuracy data, rule sets can quickly become too complex, leading to over-annotation. Therefore we establish a rigorous false discovery rate, calculated by random insertion of peaks from a large collection of other MS/MS spectra, and use it to develop an optimized knowledge base. This rule set correctly annotates almost all peaks of medium or high abundance. For high resolution HCD data, median intensity coverage of fragment peaks in MS/MS spectra increases from 58% by search engine annotation alone to 86%. The resulting annotation performance surpasses a human expert, especially on complex spectra such as those of larger phosphorylated peptides. Our system is also applicable to high resolution collision-induced dissociation data. It is available both as a part of MaxQuant and via a webserver that only requires an MS/MS spectrum and the corresponding peptides sequence, and which outputs publication quality, annotated MS/MS spectra (www.biochem.mpg.de/mann/tools/). It provides expert knowledge to beginners in the field of MS-based proteomics and helps advanced users to focus on unusual and possibly novel types of fragment ions. PMID:22888147
Proteogenomic strategies for identification of aberrant cancer peptides using large-scale Next Generation Sequencing data

DOE PAGES

Woo, Sunghee; Cha, Seong Won; Na, Seungjin; ...

2014-11-17

Cancer is driven by the acquisition of somatic DNA lesions. Distinguishing the early driver mutations from subsequent passenger mutations is key to molecular sub-typing of cancers, and the discovery of novel biomarkers. The availability of genomics technologies (mainly whole-genome and exome sequencing, and transcript sampling via RNA-seq, collectively referred to as NGS) has fueled recent studies on somatic mutation discovery. However, the vision is challenged by the complexity, redundancy, and errors in genomic data, and the difficulty of investigating the proteome using only genomic approaches. Recently, combinations of proteomic and genomic technologies have been increasingly employed. However, the complexity and redundancy of NGS data remain a challenge for proteogenomics, and various trade-offs must be made to allow for the searches to take place. This paper provides a discussion of two such trade-offs, relating to large database search, and FDR calculations, and their implication to cancer proteogenomics. Moreover, it extends and develops the idea of a unified genomic variant database that can be searched by any mass spectrometry sample. A total of 879 BAM files downloaded from TCGA repository were used to create a 4.34 GB unified FASTA database which contained 2,787,062 novel splice junctions, 38,464 deletions, 1105 insertions, and 182,302 substitutions. Proteomic data from a single ovarian carcinoma sample (439,858 spectra) was searched against the database. By applying the most conservative FDR measure, we have identified 524 novel peptides and 65,578 known peptides at 1% FDR threshold. The novel peptides include interesting examples of doubly mutated peptides, frame-shifts, and non-sample-recruited mutations, which emphasize the strength of our approach.
Analysis of the pumpkin phloem proteome provides insights into angiosperm sieve tube function.

PubMed

Lin, Ming-Kuem; Lee, Young-Jin; Lough, Tony J; Phinney, Brett S; Lucas, William J

2009-02-01

Increasing evidence suggests that proteins present in the angiosperm sieve tube system play an important role in the long distance signaling system of plants. To identify the nature of these putatively non-cell-autonomous proteins, we adopted a large scale proteomics approach to analyze pumpkin phloem exudates. Phloem proteins were fractionated by fast protein liquid chromatography using both anion and cation exchange columns and then either in-solution or in-gel digested following further separation by SDS-PAGE. A total of 345 LC-MS/MS data sets were analyzed using a combination of Mascot and X!Tandem against the NCBI non-redundant green plant database and an extensive Cucurbita maxima expressed sequence tag database. In this analysis, 1,209 different consensi were obtained of which 1,121 could be annotated from GenBank and BLAST search analyses against three plant species, Arabidopsis thaliana, rice (Oryza sativa), and poplar (Populus trichocarpa). Gene ontology (GO) enrichment analyses identified sets of phloem proteins that function in RNA binding, mRNA translation, ubiquitin-mediated proteolysis, and macromolecular and vesicle trafficking. Our findings indicate that protein synthesis and turnover, processes that were thought to be absent in enucleate sieve elements, likely occur within the angiosperm phloem translocation stream. In addition, our GO analysis identified a set of phloem proteins that are associated with the GO term "embryonic development ending in seed dormancy"; this finding raises the intriguing question as to whether the phloem may exert some level of control over seed development. The universal significance of the phloem proteome was highlighted by conservation of the phloem proteome in species as diverse as monocots (rice), eudicots (Arabidopsis and pumpkin), and trees (poplar). These results are discussed from the perspective of the role played by the phloem proteome as an integral component of the whole plant communication system.
Topology and evolution of technology innovation networks

NASA Astrophysics Data System (ADS)

Valverde, Sergi; Solé, Ricard V.; Bedau, Mark A.; Packard, Norman

2007-11-01

The web of relations linking technological innovation can be fairly described in terms of patent citations. The resulting patent citation network provides a picture of the large-scale organization of innovations and its time evolution. Here we study the patterns of change of patents registered by the U.S. Patent and Trademark Office. We show that the scaling behavior exhibited by this network is consistent with a preferential attachment mechanism together with a Weibull-shaped aging term. Such an attachment kernel is shared by scientific citation networks, thus indicating a universal type of mechanism linking ideas and designs and their evolution. The implications for evolutionary theory of innovation are discussed.
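The attachment rule described in this abstract can be illustrated with a minimal sketch. The abstract does not give the functional form, so everything below is an assumption: the kernel is taken as (citations received + 1) multiplied by a Weibull density in patent age, and the function name `attachment_probabilities` and the parameters `shape` and `scale` are hypothetical.

```python
import numpy as np

def attachment_probabilities(in_degree, age, shape, scale):
    """Probability that a new patent cites each existing patent.

    Assumed kernel (not taken from the paper): (in_degree + 1) times a
    Weibull density in age. Ages must be strictly positive.
    """
    in_degree = np.asarray(in_degree, dtype=float)
    age = np.asarray(age, dtype=float)
    # Weibull probability density evaluated at each patent's age
    t = age / scale
    aging = (shape / scale) * t ** (shape - 1.0) * np.exp(-(t ** shape))
    # Preferential attachment: the +1 offset (an assumption) lets uncited patents be cited
    weight = (in_degree + 1.0) * aging
    return weight / weight.sum()

# Two patents of equal age: the more cited one is more likely to be cited again
print(attachment_probabilities([0, 9], [5.0, 5.0], shape=2.0, scale=5.0))
```

With equal citation counts, the aging term alone decides the outcome, so a patent far past the Weibull scale becomes very unlikely to attract new citations.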
Critical roles for a genetic code alteration in the evolution of the genus Candida.

PubMed

Silva, Raquel M; Paredes, João A; Moura, Gabriela R; Manadas, Bruno; Lima-Costa, Tatiana; Rocha, Rita; Miranda, Isabel; Gomes, Ana C; Koerkamp, Marian J G; Perrot, Michel; Holstege, Frank C P; Boucherie, Hélian; Santos, Manuel A S

2007-10-31

During the last 30 years, several alterations to the standard genetic code have been discovered in various bacterial and eukaryotic species. Sense and nonsense codons have been reassigned or reprogrammed to expand the genetic code to selenocysteine and pyrrolysine. These discoveries highlight unexpected flexibility in the genetic code, but do not elucidate how the organisms survived the proteome chaos generated by codon identity redefinition. In order to shed new light on this question, we have reconstructed a Candida genetic code alteration in Saccharomyces cerevisiae and used a combination of DNA microarrays, proteomics and genetics approaches to evaluate its impact on gene expression, adaptation and sexual reproduction. This genetic manipulation blocked mating, locked yeast in a diploid state, remodelled gene expression and created stress cross-protection that generated adaptive advantages under environmentally challenging conditions. This study highlights unanticipated roles for codon identity redefinition during the evolution of the genus Candida, and strongly suggests that genetic code alterations create genetic barriers that speed up speciation.
A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human.

PubMed

Vo, Tommy V; Das, Jishnu; Meyer, Michael J; Cordero, Nicolas A; Akturk, Nurten; Wei, Xiaomu; Fair, Benjamin J; Degatano, Andrew G; Fragoza, Robert; Liu, Lisa G; Matsuyama, Akihisa; Trickey, Michelle; Horibata, Sachi; Grimson, Andrew; Yamano, Hiroyuki; Yoshida, Minoru; Roth, Frederick P; Pleiss, Jeffrey A; Xia, Yu; Yu, Haiyuan

2016-01-14

Here, we present FissionNet, a proteome-wide binary protein interactome for S. pombe, comprising 2,278 high-quality interactions, of which ∼50% were previously not reported in any species. FissionNet unravels previously unreported interactions implicated in processes such as gene silencing and pre-mRNA splicing. We developed a rigorous network comparison framework that accounts for assay sensitivity and specificity, revealing extensive species-specific network rewiring between fission yeast, budding yeast, and human. Surprisingly, although genes are better conserved between the yeasts, S. pombe interactions are significantly better conserved in human than in S. cerevisiae. Our framework also reveals that different modes of gene duplication influence the extent to which paralogous proteins are functionally repurposed. Finally, cross-species interactome mapping demonstrates that coevolution of interacting proteins is remarkably prevalent, a result with important implications for studying human disease in model organisms. Overall, FissionNet is a valuable resource for understanding protein functions and their evolution. Copyright © 2016 Elsevier Inc. All rights reserved.
Introducing the CPL/MUW proteome database: interpretation of human liver and liver cancer proteome profiles by referring to isolated primary cells.

PubMed

Wimmer, Helge; Gundacker, Nina C; Griss, Johannes; Haudek, Verena J; Stättner, Stefan; Mohr, Thomas; Zwickl, Hannes; Paulitschke, Verena; Baron, David M; Trittner, Wolfgang; Kubicek, Markus; Bayer, Editha; Slany, Astrid; Gerner, Christopher

2009-06-01

Interpretation of proteome data with a focus on biomarker discovery largely relies on comparative proteome analyses. Here, we introduce a database-assisted interpretation strategy based on proteome profiles of primary cells. Both 2-D-PAGE and shotgun proteomics are applied. We obtain high data concordance with these two different techniques. When applying mass analysis of tryptic spot digests from 2-D gels of cytoplasmic fractions, we typically identify several hundred proteins. Using the same protein fractions, we usually identify more than a thousand proteins by shotgun proteomics. The data consistency obtained when comparing these independent data sets exceeds 99% of the proteins identified in the 2-D gels. Many characteristic differences in protein expression of different cells can thus be independently confirmed. Our self-designed SQL database (CPL/MUW - database of the Clinical Proteomics Laboratories at the Medical University of Vienna accessible via www.meduniwien.ac.at/proteomics/database) facilitates (i) quality management of protein identification data, which are based on MS, (ii) the detection of cell type-specific proteins and (iii) the detection of molecular signatures of specific functional cell states. Here, we demonstrate how the interpretation of proteome profiles obtained from human liver tissue and hepatocellular carcinoma tissue is assisted by the Clinical Proteomics Laboratories at the Medical University of Vienna-database. Therefore, we suggest that the use of reference experiments supported by a tailored database may substantially facilitate data interpretation of proteome profiling experiments.
A scalable strategy for high-throughput GFP tagging of endogenous human proteins.

PubMed

Leonetti, Manuel D; Sekine, Sayaka; Kamiyama, Daichi; Weissman, Jonathan S; Huang, Bo

2016-06-21

A central challenge of the postgenomic era is to comprehensively characterize the cellular role of the ∼20,000 proteins encoded in the human genome. To systematically study protein function in a native cellular background, libraries of human cell lines expressing proteins tagged with a functional sequence at their endogenous loci would be very valuable. Here, using electroporation of Cas9 nuclease/single-guide RNA ribonucleoproteins and taking advantage of a split-GFP system, we describe a scalable method for the robust, scarless, and specific tagging of endogenous human genes with GFP. Our approach requires no molecular cloning and allows a large number of cell lines to be processed in parallel. We demonstrate the scalability of our method by targeting 48 human genes and show that the resulting GFP fluorescence correlates with protein expression levels. We next present how our protocols can be easily adapted for the tagging of a given target with GFP repeats, critically enabling the study of low-abundance proteins. Finally, we show that our GFP tagging approach allows the biochemical isolation of native protein complexes for proteomic studies. Taken together, our results pave the way for the large-scale generation of endogenously tagged human cell lines for the proteome-wide analysis of protein localization and interaction networks in a native cellular context.

New Markers for Predicting Fertility of the Male Gametes in the Post Genomic Age.

PubMed

Dipresa, Savina; De Toni, Luca; Foresta, Carlo; Garolla, Andrea

2018-04-18

A number of tests have been proposed to assess male fertility potential, ranging from routine light-microscopic evaluation of semen samples to screening tests for DNA integrity that look for sperm chromatin abnormalities. Spermatozoa are extremely differentiated cells; they have critical functions for embryo development and heredity, in addition to delivering a haploid paternal genome to the oocyte. Towards this goal certain requirements must always be met. The ability of spermatozoa to perform their reproductive function is established during spermatogenesis, a highly specialized process that depends on multiple factors affecting male fertility. In the past 30 years, large-scale analyses of transcriptomic and genome expression in mammals have generated a large amount of information on the numerous biomolecules involved in spermatogenesis and male germ cell reproductive function. The sperm proteome represents the protein content that spermatozoa need to survive and work correctly, and modifications of the sperm proteome play a role in determining functional changes that lead to decreased reproductive competence in affected spermatozoa. The post-genomic approach combines different methodologies: testicular transcriptome studies, protein compositional analysis and metabolomic findings in human spermatozoa. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Transverse momentum dependent parton distributions at small-x

DOE PAGES

Xiao, Bo-Wen; Yuan, Feng; Zhou, Jian

2017-05-23

We study the transverse momentum dependent (TMD) parton distributions at small-x in a consistent framework that takes into account the TMD evolution and small-x evolution simultaneously. The small-x evolution effects are included by computing the TMDs at appropriate scales in terms of the dipole scattering amplitudes, which obey the relevant Balitsky–Kovchegov equation. Meanwhile, the TMD evolution is obtained by resumming the Collins–Soper type large logarithms emerged from the calculations in small-x formalism into Sudakov factors.
Transverse momentum dependent parton distributions at small-x

NASA Astrophysics Data System (ADS)

Xiao, Bo-Wen; Yuan, Feng; Zhou, Jian

2017-08-01

We study the transverse momentum dependent (TMD) parton distributions at small-x in a consistent framework that takes into account the TMD evolution and small-x evolution simultaneously. The small-x evolution effects are included by computing the TMDs at appropriate scales in terms of the dipole scattering amplitudes, which obey the relevant Balitsky-Kovchegov equation. Meanwhile, the TMD evolution is obtained by resumming the Collins-Soper type large logarithms emerged from the calculations in small-x formalism into Sudakov factors.
Transverse momentum dependent parton distributions at small-x

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xiao, Bo-Wen; Yuan, Feng; Zhou, Jian

We study the transverse momentum dependent (TMD) parton distributions at small-x in a consistent framework that takes into account the TMD evolution and small-x evolution simultaneously. The small-x evolution effects are included by computing the TMDs at appropriate scales in terms of the dipole scattering amplitudes, which obey the relevant Balitsky–Kovchegov equation. Meanwhile, the TMD evolution is obtained by resumming the Collins–Soper type large logarithms emerged from the calculations in small-x formalism into Sudakov factors.
Punctuated equilibrium in the large-scale evolution of programming languages

PubMed Central

Valverde, Sergi; Solé, Ricard V.

2015-01-01

The analogies and differences between biological and cultural evolution have been explored by evolutionary biologists, historians, engineers and linguists alike. Two well-known domains of cultural change are language and technology. Both share some traits relating the evolution of species, but technological change is very difficult to study. A major challenge in our way towards a scientific theory of technological evolution is how to properly define evolutionary trees or clades and how to weight the role played by horizontal transfer of information. Here, we study the large-scale historical development of programming languages, which have deeply marked social and technological advances in the last half century. We analyse their historical connections using network theory and reconstructed phylogenetic networks. Using both data analysis and network modelling, it is shown that their evolution is highly uneven, marked by innovation events where new languages are created out of improved combinations of different structural components belonging to previous languages. These radiation events occur in a bursty pattern and are tied to novel technological and social niches. The method can be extrapolated to other systems and consistently captures the major classes of languages and the widespread horizontal design exchanges, revealing a punctuated evolutionary path. PMID:25994298
A Unique Model Platform for C4 Plant Systems and Synthetic Biology

DTIC Science & Technology

2015-12-10

International Conference in Bioinformatics , Sydney, Australia, July 31 - August 2, 2014.  Nielsen LK (2015) Genome scale metabolic and regulatory...the comparison of transcriptome proteome and central metabolome in mature and immature tissue. Preliminary data were obtained suggesting successful...guide the comparison of transcriptome, proteome and central metabolome in mature and immature tissue. Preliminary data were obtained suggesting
Electrodeposition of hierarchically structured three-dimensional nickel–iron electrodes for efficient oxygen evolution at high current densities

PubMed Central

Lu, Xunyu; Zhao, Chuan

2015-01-01

Large-scale industrial application of electrolytic splitting of water has called for the development of oxygen evolution electrodes that are inexpensive, robust and can deliver large current density (>500 mA cm⁻²) at low applied potentials. Here we show that an efficient oxygen electrode can be developed by electrodepositing amorphous mesoporous nickel–iron composite nanosheets directly onto macroporous nickel foam substrates. The as-prepared oxygen electrode exhibits high catalytic activity towards water oxidation in alkaline solutions, which only requires an overpotential of 200 mV to initiate the reaction, and is capable of delivering current densities of 500 and 1,000 mA cm⁻² at overpotentials of 240 and 270 mV, respectively. The electrode also shows prolonged stability against bulk water electrolysis at large current. Collectively, the as-prepared three-dimensional structured electrode is the most efficient oxygen evolution electrode in alkaline electrolytes reported to the best of our knowledge, and can potentially be applied for industrial scale water electrolysis. PMID:25776015
General relativistic screening in cosmological simulations

NASA Astrophysics Data System (ADS)

Hahn, Oliver; Paranjape, Aseem

2016-10-01

We revisit the issue of interpreting the results of large volume cosmological simulations in the context of large-scale general relativistic effects. We look for simple modifications to the nonlinear evolution of the gravitational potential ψ that lead on large scales to the correct, fully relativistic description of density perturbations in the Newtonian gauge. We note that the relativistic constraint equation for ψ can be cast as a diffusion equation, with a diffusion length scale determined by the expansion of the Universe. Exploiting the weak time evolution of ψ in all regimes of interest, this equation can be further accurately approximated as a Helmholtz equation, with an effective relativistic "screening" scale ℓ related to the Hubble radius. We demonstrate that it is thus possible to carry out N-body simulations in the Newtonian gauge by replacing Poisson's equation with this Helmholtz equation, involving a trivial change in the Green's function kernel. Our results also motivate a simple, approximate (but very accurate) gauge transformation, $\delta_N(k) \approx \delta_{\rm sim}(k) \times (k^2 + \ell^{-2})/k^2$, to convert the density field $\delta_{\rm sim}$ of standard collisionless N-body simulations (initialized in the comoving synchronous gauge) into the Newtonian gauge density $\delta_N$ at arbitrary times. A similar conversion can also be written in terms of particle positions. Our results can be interpreted in terms of a Jeans stability criterion induced by the expansion of the Universe. The appearance of the screening scale ℓ in the evolution of ψ, in particular, leads to a natural resolution of the "Jeans swindle" in the presence of superhorizon modes.
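The approximate gauge transformation quoted in this abstract multiplies each Fourier mode of the simulated density contrast by (k² + ℓ⁻²)/k². A minimal NumPy sketch of that step follows, under stated assumptions: the field lives on a periodic cubic grid, the function name `to_newtonian_gauge` and its arguments are hypothetical, and the k = 0 mode (where the factor diverges) is left unchanged on the assumption that the density contrast has zero mean. This is an illustration, not the authors' code.

```python
import numpy as np

def to_newtonian_gauge(delta_sim, box_size, ell):
    """Apply delta_N(k) ~= delta_sim(k) * (k^2 + ell^-2) / k^2 on a periodic cubic grid.

    delta_sim : 3-D array (n, n, n) of density contrast from the simulation
    box_size  : side length of the box, in the same length units as ell
    ell       : screening scale
    """
    n = delta_sim.shape[0]
    # Angular wavenumbers 2*pi*m/box_size along one axis
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2
    # Multiplicative factor; the k = 0 mode is left untouched (assumption)
    factor = np.ones_like(k2)
    nonzero = k2 > 0
    factor[nonzero] = (k2[nonzero] + 1.0 / ell**2) / k2[nonzero]
    delta_k = np.fft.fftn(delta_sim)
    return np.fft.ifftn(delta_k * factor).real
```

For modes with k much larger than 1/ℓ the factor tends to 1, so small scales are left essentially unchanged and only modes near or beyond the screening scale are boosted.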
Workflow based framework for life science informatics.

PubMed

Tiwari, Abhishek; Sekhar, Arvind K T

2007-10-01

Workflow technology is a generic mechanism to integrate diverse types of available resources (databases, servers, software applications and different services) which facilitate knowledge exchange within traditionally divergent fields such as molecular biology, clinical research, computational science, physics, chemistry and statistics. Researchers can easily incorporate and access diverse, distributed tools and data to develop their own research protocols for scientific analysis. Application of workflow technology has been reported in areas like drug discovery, genomics, large-scale gene expression analysis, proteomics, and system biology. In this article, we have discussed the existing workflow systems and the trends in applications of workflow based systems.
FAST MAGNETIC FIELD AMPLIFICATION IN THE EARLY UNIVERSE: GROWTH OF COLLISIONLESS PLASMA INSTABILITIES IN TURBULENT MEDIA

DOE Office of Scientific and Technical Information (OSTI.GOV)

Falceta-Gonçalves, D.; Kowal, G.

2015-07-20

In this work we report on a numerical study of the cosmic magnetic field amplification due to collisionless plasma instabilities. The collisionless magnetohydrodynamic equations derived account for the pressure anisotropy that leads, in specific conditions, to the firehose and mirror instabilities. We study the time evolution of seed fields in turbulence under the influence of such instabilities. An approximate analytical time evolution of the magnetic field is provided. The numerical simulations and the analytical predictions are compared. We found that (i) amplification of the magnetic field was efficient in firehose-unstable turbulent regimes, but not in the mirror-unstable models; (ii) the growth rate of the magnetic energy density is much faster than the turbulent dynamo; and (iii) the efficient amplification occurs at small scales. The analytical prediction for the correlation between the growth timescales and pressure anisotropy is confirmed by the numerical simulations. These results reinforce the idea that pressure anisotropies, driven naturally in a turbulent collisionless medium (e.g., the intergalactic medium), could efficiently amplify the magnetic field in the early universe (post-recombination era), previous to the collapse of the first large-scale gravitational structures. This mechanism, though fast for the small-scale fields (∼kpc scales), is unable to provide relatively strong magnetic fields at large scales. Other mechanisms that were not accounted for here (e.g., collisional turbulence once instabilities are quenched, velocity shear, or gravitationally induced inflows of gas into galaxies and clusters) could operate afterward to build up large-scale coherent field structures in the long time evolution.
Laser assisted microdissection, an efficient technique to understand tissue specific gene expression patterns and functional genomics in plants.

PubMed

Gautam, Vibhav; Sarkar, Ananda K

2015-04-01

Laser assisted microdissection (LAM) is an advanced technology used to perform tissue or cell-specific expression profiling of genes and proteins, owing to its ability to isolate the desired tissue or cell type from a heterogeneous population. Due to the specificity and high efficiency acquired during its pioneering use in medical science, the LAM technique has quickly been adopted in many areas of biological research. Today, it has become a potent tool to address a wide range of questions in diverse fields of plant biology. Beginning with comparative transcriptome analysis of different tissues such as reproductive parts, meristems, lateral organs, roots etc., LAM has also been extensively used in plant-pathogen interaction studies, proteomics, and metabolomics. In combination with next generation sequencing and proteomics analysis, LAM has opened up promising opportunities in the area of large scale functional studies in plants. Ever since the advent of this technique, significant improvements have been achieved in terms of its instrumentation and method, which have made LAM a more efficient tool applicable in wider research areas. Here, we discuss the advancement of the LAM technique with special emphasis on its methodology and highlight its scope in modern research areas of plant biology. Although we put emphasis on the use of LAM in transcriptome studies, its most common application, we also discuss its recent application and scope in proteome and metabolome studies.
Pathogens and Disease Play Havoc on the Host Epiproteome-The "First Line of Response" Role for Proteomic Changes Influenced by Disorder.

PubMed

Rikkerink, Erik H A

2018-03-08

Organisms face stress from multiple sources simultaneously and require mechanisms to respond to these scenarios if they are to survive in the long term. This overview focuses on a series of key points that illustrate how disorder and post-translational changes can combine to play a critical role in orchestrating the response of organisms to the stress of a changing environment. Increasingly, protein complexes are thought of as dynamic multi-component molecular machines able to adapt through compositional, conformational and/or post-translational modifications to control their largely metabolic outputs. These metabolites then feed into cellular physiological homeostasis or the production of secondary metabolites with novel anti-microbial properties. The control of adaptations to stress operates at multiple levels including the proteome and the dynamic nature of proteomic changes suggests a parallel with the equally dynamic epigenetic changes at the level of nucleic acids. Given their properties, I propose that some disordered protein platforms specifically enable organisms to sense and react rapidly as the first line of response to change. Using examples from the highly dynamic host-pathogen and host-stress response, I illustrate by example how disordered proteins are key to fulfilling the need for multiple levels of integration of response at different time scales to create robust control points.
Insights into temperature modulation of the Eucalyptus globulus and Eucalyptus grandis antioxidant and lignification subproteomes.

PubMed

de Santana Costa, Marília Gabriela; Mazzafera, Paulo; Balbuena, Tiago Santana

2017-05-01

Eucalyptus grandis and Eucalyptus globulus are among the most widely cultivated trees, differing in lignin composition and plantation areas, as E. grandis is mostly cultivated in tropical regions while E. globulus is preferred in temperate areas. As temperature is a key modulator in plant metabolism, a large-scale proteome analysis was carried out to investigate changes in the antioxidant system and the lignification metabolism in plantlets grown at different temperatures. Our strategy allowed the identification of 3111 stem proteins. A total of 103 antioxidant proteins were detected in the stems of both species. Hierarchical clustering revealed that alterations in the antioxidant proteins are more prominent when Eucalyptus seedlings were exposed to high temperature and that the superoxide isoforms coded by the gene Eucgr.B03930 are the most abundant antioxidant enzymes induced by thermal stimulus. Regarding lignin biosynthesis, our proteomics approach resulted in the identification of 13 of the 17 core proteins involved in this metabolism, corroborating gene predictions and the proposed lignin toolbox. Quantitative analyses revealed significant differences in 8 protein isoforms, including the ferulate 5-hydroxylase isoform F5H1, a key enzyme in catalyzing the synthesis of sinapyl alcohol, and the cinnamyl alcohol dehydrogenase isoform CAD2, the last enzyme in monolignol biosynthesis. Data are available via ProteomeXchange with identifier PXD005743. Copyright © 2017 Elsevier Ltd. All rights reserved.
Proteome-wide association studies identify biochemical modules associated with a wing-size phenotype in Drosophila melanogaster.

PubMed

Okada, Hirokazu; Ebhardt, H Alexander; Vonesch, Sibylle Chantal; Aebersold, Ruedi; Hafen, Ernst

2016-09-01

The manner by which genetic diversity within a population generates individual phenotypes is a fundamental question of biology. To advance the understanding of the genotype-phenotype relationships towards the level of biochemical processes, we perform a proteome-wide association study (PWAS) of a complex quantitative phenotype. We quantify the variation of wing imaginal disc proteomes in Drosophila genetic reference panel (DGRP) lines using SWATH mass spectrometry. In spite of the very large genetic variation (1/36 bp) between the lines, proteome variability is surprisingly small, indicating strong molecular resilience of protein expression patterns. Proteins associated with adult wing size form tight co-variation clusters that are enriched in fundamental biochemical processes. Wing size correlates with some basic metabolic functions, positively with glucose metabolism but negatively with mitochondrial respiration and not with ribosome biogenesis. Our study highlights the power of PWAS to filter functional variants from the large genetic variability in natural populations.
An integrated native mass spectrometry and top-down proteomics method that connects sequence to structure and function of macromolecular complexes

NASA Astrophysics Data System (ADS)

Li, Huilin; Nguyen, Hong Hanh; Ogorzalek Loo, Rachel R.; Campuzano, Iain D. G.; Loo, Joseph A.

2018-02-01

Mass spectrometry (MS) has become a crucial technique for the analysis of protein complexes. Native MS has traditionally examined protein subunit arrangements, while proteomics MS has focused on sequence identification. These two techniques are usually performed separately without taking advantage of the synergies between them. Here we describe the development of an integrated native MS and top-down proteomics method using Fourier-transform ion cyclotron resonance (FTICR) to analyse macromolecular protein complexes in a single experiment. We address previous concerns of employing FTICR MS to measure large macromolecular complexes by demonstrating the detection of complexes up to 1.8 MDa, and we demonstrate the efficacy of this technique for direct acquisition of sequence to higher-order structural information with several large complexes. We then summarize the unique functionalities of different activation/dissociation techniques. The platform expands the ability of MS to integrate proteomics and structural biology to provide insights into protein structure, function and regulation.
Evidence for network evolution in an arabidopsis interactome map

USDA-ARS's Scientific Manuscript database

Plants have unique features that evolved in response to their environments and ecosystems. A full account of the complex cellular networks that underlie plant-specific functions is still missing. We describe a proteome-wide binary protein-protein interaction map for the interactome network of the pl...
Evolution of Rotor Wake in Swirling Flow

NASA Technical Reports Server (NTRS)

El-Haldidi, Basman; Atassi, Hafiz; Envia, Edmane; Podboy, Gary

2000-01-01

A theory is presented for modeling the evolution of rotor wakes as a function of axial distance in swirling mean flows. The theory, which extends an earlier work to include arbitrary radial distributions of mean swirl, indicates that swirl can significantly alter the wake structure of the rotor especially at large downstream distances (i.e., for moderate to large rotor-stator spacings). Using measured wakes of a representative scale model fan stage to define the mean swirl and initial wake perturbations, the theory is used to predict the subsequent evolution of the wakes. The results indicate the sensitivity of the wake evolution to the initial profile and the need to have complete and consistent initial definition of both velocity and pressure perturbations.
Infrared Multiphoton Dissociation for Quantitative Shotgun Proteomics

PubMed Central

Ledvina, Aaron R.; Lee, M. Violet; McAlister, Graeme C.; Westphall, Michael S.; Coon, Joshua J.

2012-01-01

We modified a dual-cell linear ion trap mass spectrometer to perform infrared multiphoton dissociation (IRMPD) in the low pressure trap of a dual-cell quadrupole linear ion trap (dual cell QLT) and perform large-scale IRMPD analyses of complex peptide mixtures. Upon optimization of activation parameters (precursor q-value, irradiation time, and photon flux), IRMPD subtly, but significantly outperforms resonant excitation CAD for peptides identified at a 1% false-discovery rate (FDR) from a yeast tryptic digest (95% confidence, p = 0.019). We further demonstrate that IRMPD is compatible with the analysis of isobaric-tagged peptides. Using fixed QLT RF amplitude allows for the consistent retention of reporter ions, but necessitates the use of variable IRMPD irradiation times, dependent upon precursor mass-to-charge (m/z). We show that IRMPD activation parameters can be tuned to allow for effective peptide identification and quantitation simultaneously. We thus conclude that IRMPD performed in a dual-cell ion trap is an effective option for the large-scale analysis of both unmodified and isobaric-tagged peptides. PMID:22480380
Large-Scale Analysis Exploring Evolution of Catalytic Machineries and Mechanisms in Enzyme Superfamilies.

PubMed

Furnham, Nicholas; Dawson, Natalie L; Rahman, Syed A; Thornton, Janet M; Orengo, Christine A

2016-01-29

Enzymes, as biological catalysts, form the basis of all forms of life. How these proteins have evolved their functions remains a fundamental question in biology. Over 100 years of detailed biochemistry studies, combined with the large volumes of sequence and protein structural data now available, means that we are able to perform large-scale analyses to address this question. Using a range of computational tools and resources, we have compiled information on all experimentally annotated changes in enzyme function within 379 structurally defined protein domain superfamilies, linking the changes observed in functions during evolution to changes in reaction chemistry. Many superfamilies show changes in function at some level, although one function often dominates one superfamily. We use quantitative measures of changes in reaction chemistry to reveal the various types of chemical changes occurring during evolution and to exemplify these by detailed examples. Additionally, we use structural information on the enzymes' active sites to examine how different superfamilies have changed their catalytic machinery during evolution. Some superfamilies have changed the reactions they perform without changing catalytic machinery. In others, large changes of enzyme function, in terms of both overall chemistry and substrate specificity, have been brought about by significant changes in catalytic machinery. Interestingly, in some superfamilies, relatives perform similar functions but with different catalytic machineries. This analysis highlights characteristics of functional evolution across a wide range of superfamilies, providing insights that will be useful in predicting the function of uncharacterised sequences and the design of new synthetic enzymes. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
Genome-, Transcriptome- and Proteome-Wide Analyses of the Gliadin Gene Families in Triticum urartu

PubMed Central

Wang, Dongzhi; Yang, Wenlong; Sun, Jiazhu; Zhang, Aimin; Zhan, Kehui

2015-01-01

Gliadins are the major components of storage proteins in wheat grains, and they play an essential role in the dough extensibility and nutritional quality of flour. Because of the large number of the gliadin family members, the high level of sequence identity, and the lack of abundant genomic data for Triticum species, identifying the full complement of gliadin family genes in hexaploid wheat remains challenging. Triticum urartu is a wild diploid wheat species and considered the A-genome donor of polyploid wheat species. The accession PI428198 (G1812) was chosen to determine the complete composition of the gliadin gene families in the wheat A-genome using the available draft genome. Using a PCR-based cloning strategy for genomic DNA and mRNA as well as a bioinformatics analysis of genomic sequence data, 28 gliadin genes were characterized. Of these genes, 23 were α-gliadin genes, three were γ-gliadin genes and two were ω-gliadin genes. An RNA sequencing (RNA-Seq) survey of the dynamic expression patterns of gliadin genes revealed that their synthesis in immature grains began prior to 10 days post-anthesis (DPA), peaked at 15 DPA and gradually decreased at 20 DPA. The accumulation of proteins encoded by 16 of the expressed gliadin genes was further verified and quantified using proteomic methods. The phylogenetic analysis demonstrated that the homologs of these α-gliadin genes were present in tetraploid and hexaploid wheat, which was consistent with T. urartu being the A-genome progenitor species. This study presents a systematic investigation of the gliadin gene families in T. urartu that spans the genome, transcriptome and proteome, and it provides new information to better understand the molecular structure, expression profiles and evolution of the gliadin genes in T. urartu and common wheat. PMID:26132381

Physiological, biomass elemental composition and proteomic analyses of Escherichia coli ammonium-limited chemostat growth, and comparison with iron- and glucose-limited chemostat growth

PubMed Central

Folsom, James Patrick

2015-01-01

Escherichia coli physiological, biomass elemental composition and proteome acclimations to ammonium-limited chemostat growth were measured at four levels of nutrient scarcity controlled via chemostat dilution rate. These data were compared with published iron- and glucose-limited growth data collected from the same strain and at the same dilution rates to quantify general and nutrient-specific responses. Severe nutrient scarcity resulted in an overflow metabolism with differing organic byproduct profiles based on limiting nutrient and dilution rate. Ammonium-limited cultures secreted up to 35 % of the metabolized glucose carbon as organic byproducts with acetate representing the largest fraction; in comparison, iron-limited cultures secreted up to 70 % of the metabolized glucose carbon as lactate, and glucose-limited cultures secreted up to 4 % of the metabolized glucose carbon as formate. Biomass elemental composition differed with nutrient limitation; biomass from ammonium-limited cultures had a lower nitrogen content than biomass from either iron- or glucose-limited cultures. Proteomic analysis of central metabolism enzymes revealed that ammonium- and iron-limited cultures had a lower abundance of key tricarboxylic acid (TCA) cycle enzymes and higher abundance of key glycolysis enzymes compared with glucose-limited cultures. The overall results are largely consistent with cellular economics concepts, including metabolic tradeoff theory where the limiting nutrient is invested into essential pathways such as glycolysis instead of higher ATP-yielding, but non-essential, pathways such as the TCA cycle. The data provide a detailed insight into ecologically competitive metabolic strategies selected by evolution, templates for controlling metabolism for bioprocesses and a comprehensive dataset for validating in silico representations of metabolism. PMID:26018546
Genome-, Transcriptome- and Proteome-Wide Analyses of the Gliadin Gene Families in Triticum urartu.

PubMed

Zhang, Yanlin; Luo, Guangbin; Liu, Dongcheng; Wang, Dongzhi; Yang, Wenlong; Sun, Jiazhu; Zhang, Aimin; Zhan, Kehui

2015-01-01

Gliadins are the major components of storage proteins in wheat grains, and they play an essential role in the dough extensibility and nutritional quality of flour. Because of the large number of the gliadin family members, the high level of sequence identity, and the lack of abundant genomic data for Triticum species, identifying the full complement of gliadin family genes in hexaploid wheat remains challenging. Triticum urartu is a wild diploid wheat species and considered the A-genome donor of polyploid wheat species. The accession PI428198 (G1812) was chosen to determine the complete composition of the gliadin gene families in the wheat A-genome using the available draft genome. Using a PCR-based cloning strategy for genomic DNA and mRNA as well as a bioinformatics analysis of genomic sequence data, 28 gliadin genes were characterized. Of these genes, 23 were α-gliadin genes, three were γ-gliadin genes and two were ω-gliadin genes. An RNA sequencing (RNA-Seq) survey of the dynamic expression patterns of gliadin genes revealed that their synthesis in immature grains began prior to 10 days post-anthesis (DPA), peaked at 15 DPA and gradually decreased at 20 DPA. The accumulation of proteins encoded by 16 of the expressed gliadin genes was further verified and quantified using proteomic methods. The phylogenetic analysis demonstrated that the homologs of these α-gliadin genes were present in tetraploid and hexaploid wheat, which was consistent with T. urartu being the A-genome progenitor species. This study presents a systematic investigation of the gliadin gene families in T. urartu that spans the genome, transcriptome and proteome, and it provides new information to better understand the molecular structure, expression profiles and evolution of the gliadin genes in T. urartu and common wheat.
Selection on plant male function genes identifies candidates for reproductive isolation of yellow monkeyflowers.

PubMed

Aagaard, Jan E; George, Renee D; Fishman, Lila; Maccoss, Michael J; Swanson, Willie J

2013-01-01

Understanding the genetic basis of reproductive isolation promises insight into speciation and the origins of biological diversity. While progress has been made in identifying genes underlying barriers to reproduction that function after fertilization (post-zygotic isolation), we know much less about earlier acting pre-zygotic barriers. Of particular interest are barriers involved in mating and fertilization that can evolve extremely rapidly under sexual selection, suggesting they may play a prominent role in the initial stages of reproductive isolation. A significant challenge to the field of speciation genetics is developing new approaches for identification of candidate genes underlying these barriers, particularly among non-traditional model systems. We employ powerful proteomic and genomic strategies to study the genetic basis of conspecific pollen precedence, an important component of pre-zygotic reproductive isolation among yellow monkeyflowers (Mimulus spp.) resulting from male pollen competition. We use isotopic labeling in combination with shotgun proteomics to identify more than 2,000 male function (pollen tube) proteins within maternal reproductive structures (styles) of M. guttatus flowers where pollen competition occurs. We then sequence array-captured pollen tube exomes from a large outcrossing population of M. guttatus, and identify those genes with evidence of selective sweeps or balancing selection consistent with their role in pollen competition. We also test for evidence of positive selection on these genes more broadly across yellow monkeyflowers, because a signal of adaptive divergence is a common feature of genes causing reproductive isolation. Together the molecular evolution studies identify 159 pollen tube proteins that are candidate genes for conspecific pollen precedence. Our work demonstrates how powerful proteomic and genomic tools can be readily adapted to non-traditional model systems, allowing for genome-wide screens towards the goal of identifying the molecular basis of genetically complex traits.
Novel "omics" approach for study of low-abundance, low-molecular-weight components of a complex biological tissue: regional differences between chorionic and basal plates of the human placenta.

PubMed

Kedia, Komal; Nichols, Caitlin A; Thulin, Craig D; Graves, Steven W

2015-11-01

Tissue proteomics has relied heavily on two-dimensional gel electrophoresis for protein separation and quantification, followed by single protein isolation, trypsin digestion, and mass spectrometric protein identification. Such methods are predominantly used for study of high-abundance, full-length proteins. Tissue peptidomics has recently been developed but is still used to study the most highly abundant species, often resulting in observation and identification of dozens of peptides only. Tissue lipidomics is likewise new, and reported studies are limited. We have developed an "omics" approach that enables over 7,000 low-molecular-weight, low-abundance species to be surveyed and have applied this to human placental tissue. Because the placenta is believed to be involved in complications of pregnancy, its proteomic evaluation is of substantial interest. In previous research on the placental proteome, abundant, high-molecular-weight proteins have been studied. Application of large-scale, global proteomics or peptidomics to the placenta has been limited, and would be challenging owing to the anatomic complexity and broad concentration range of proteins in this tissue. In our approach, involving protein depletion, capillary liquid chromatography, and tandem mass spectrometry, we attempted to identify molecular differences between two regions of the same placenta with only slightly different cellular composition. Our analysis revealed 16 species with statistically significant differences between the two regions. Tandem mass spectrometry enabled successful sequencing, or otherwise enabled chemical characterization, of twelve of these. The successful discovery and identification of regional differences between the expression of low-abundance, low-molecular-weight biomolecules reveals the potential of our approach.
Long-Gradient Separations Coupled with Selected Reaction Monitoring for Highly Sensitive, Large Scale Targeted Protein Quantification in a Single Analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shi, Tujin; Fillmore, Thomas L.; Gao, Yuqian

2013-10-01

Long-gradient separations coupled to tandem MS were recently demonstrated to provide a deep proteome coverage for global proteomics; however, such long-gradient separations have not been explored for targeted proteomics. Herein, we investigate the potential performance of the long-gradient separations coupled with selected reaction monitoring (LG-SRM) for targeted protein quantification. Direct comparison of LG-SRM (5 h gradient) and conventional LC-SRM (45 min gradient) showed that the long-gradient separations significantly reduced background interference levels and provided an 8- to 100-fold improvement in LOQ for target proteins in human female serum. Based on at least one surrogate peptide per protein, an LOQ of 10 ng/mL was achieved for the two spiked proteins in non-depleted human serum. The LG-SRM detection of seven out of eight endogenous plasma proteins expressed at ng/mL or sub-ng/mL levels in clinical patient sera was also demonstrated. A correlation coefficient of >0.99 was observed for the results of LG-SRM and ELISA measurements for prostate-specific antigen (PSA) in selected patient sera. Further enhancement of LG-SRM sensitivity was achieved by applying front-end IgY14 immunoaffinity depletion. Besides improved sensitivity, LG-SRM offers at least 3 times higher multiplexing capacity than conventional LC-SRM due to ~3-fold increase in average peak widths for a 300-min gradient compared to a 45-min gradient. Therefore, LG-SRM holds great potential for bridging the gap between global and targeted proteomics due to its advantages in both sensitivity and multiplexing capacity.
Fast Evolution and Lineage-Specific Gene Family Expansions of Aphid Salivary Effectors Driven by Interactions with Host-Plants.

PubMed

Boulain, Hélène; Legeai, Fabrice; Guy, Endrick; Morlière, Stéphanie; Douglas, Nadine E; Oh, Jonghee; Murugan, Marimuthu; Smith, Michael; Jaquiéry, Julie; Peccoud, Jean; White, Frank F; Carolan, James C; Simon, Jean-Christophe; Sugio, Akiko

2018-05-18

Effector proteins play crucial roles in plant-parasite interactions by suppressing plant defenses and hijacking plant physiological responses to facilitate parasite invasion and propagation. Although effector proteins have been characterized in many microbial plant pathogens, their nature and role in adaptation to host plants are largely unknown in insect herbivores. Aphids rely on salivary effector proteins injected into the host plants to promote phloem sap uptake. Therefore, gaining insight into the repertoire and evolution of aphid effectors is key to unveiling the mechanisms responsible for aphid virulence and host plant specialization. With this aim in mind, we assembled catalogues of putative effectors in the legume specialist aphid, Acyrthosiphon pisum, using transcriptomics and proteomics approaches. We identified 3603 candidate effector genes predicted to be expressed in A. pisum salivary glands (SGs), of which 740 displayed up-regulated expression in SGs in comparison to the alimentary tract. A search for orthologs in 17 arthropod genomes revealed that SG-up-regulated effector candidates of A. pisum are enriched in aphid-specific genes and tend to evolve faster compared to the whole gene set. We also found that a large fraction of proteins detected in the A. pisum saliva belonged to three gene families, of which certain members show evidence consistent with positive selection. Overall, this comprehensive analysis suggests that the large repertoire of effector candidates in A. pisum constitutes a source of novelties promoting plant adaptation to legumes.
NASA: Assessments of Selected Large-Scale Projects

DTIC Science & Technology

2011-03-01

Report date: March 2011. Title and subtitle: Assessments of Selected Large-Scale Projects ... Volatile EvolutioN; MEP: Mars Exploration Program; MIB: Mishap Investigation Board; MMRTG: Multi Mission Radioisotope Thermoelectric Generator; MMS: Magnetospheric ... probes designed to explore the Martian surface, to satellites equipped with advanced sensors to study the earth, to telescopes intended to explore the
Top-down Proteomics in Health and Disease: Challenges and Opportunities

PubMed Central

Gregorich, Zachery R.; Ge, Ying

2014-01-01

Proteomics is essential for deciphering how molecules interact as a system and for understanding the functions of cellular systems in human disease; however, the unique characteristics of the human proteome, which include a high dynamic range of protein expression and extreme complexity due to a plethora of post-translational modifications (PTMs) and sequence variations, make such analyses challenging. An emerging “top-down” mass spectrometry (MS)-based proteomics approach, which provides a “bird’s eye” view of all proteoforms, has unique advantages for the assessment of PTMs and sequence variations. Recently, a number of studies have showcased the potential of top-down proteomics for unraveling of disease mechanisms and discovery of new biomarkers. Nevertheless, the top-down approach still faces significant challenges in terms of protein solubility, separation, and the detection of large intact proteins, as well as the under-developed data analysis tools. Consequently, new technological developments are urgently needed to advance the field of top-down proteomics. Herein, we intend to provide an overview of the recent applications of top-down proteomics in biomedical research. Moreover, we will outline the challenges and opportunities facing top-down proteomics strategies aimed at understanding and diagnosing human diseases. PMID:24723472
Internal constitution and evolution of the moon.

NASA Technical Reports Server (NTRS)

Solomon, S. C.; Toksoz, M. N.

1973-01-01

The composition, structure and evolution of the moon's interior are narrowly constrained by a large assortment of physical and chemical data. Models of the thermal evolution of the moon that fit the chronology of igneous activity on the lunar surface, the stress history of the lunar lithosphere implied by the presence of mascons, and the surface concentrations of radioactive elements, involve extensive differentiation early in lunar history. This differentiation may be the result of rapid accretion and large-scale melting or of primary chemical layering during accretion; differences in present-day temperatures for these two possibilities are significant only in the inner 1000 km of the moon and may not be resolvable.
Alignment between Satellite and Central Galaxies in the SDSS DR7: Dependence on Large-scale Environment

NASA Astrophysics Data System (ADS)

Wang, Peng; Luo, Yu; Kang, Xi; Libeskind, Noam I.; Wang, Lei; Zhang, Youcai; Tempel, Elmo; Guo, Quan

2018-06-01

The alignment between satellites and central galaxies has been studied in detail both in observational and theoretical works. The widely accepted fact is that satellites preferentially reside along the major axis of their central galaxy. However, the origin and large-scale environmental dependence of this alignment are still unknown. In an attempt to determine these variables, we use data constructed from Sloan Digital Sky Survey DR7 to investigate the large-scale environmental dependence of this alignment with emphasis on examining the alignment’s dependence on the color of the central galaxy. We find a very strong large-scale environmental dependence of the satellite–central alignment (SCA) in groups with blue centrals. Satellites of blue centrals in knots are preferentially located perpendicular to the major axes of the centrals, and the alignment angle decreases with environment, namely, when going from knots to voids. The alignment angle strongly depends on the $^{0.1}(g-r)$ color of centrals. We suggest that the SCA is the result of a competition between satellite accretion within large-scale structure (LSS) and galaxy evolution inside host halos. For groups containing red central galaxies, the SCA is mainly determined by the evolution effect, while for blue central dominated groups, the effect of the LSS plays a more important role, especially in knots. Our results provide an explanation for how the SCA forms within different large-scale environments. The perpendicular case in groups and knots with blue centrals may also provide insight into understanding similar polar arrangements, such as the formation of the Milky Way and Centaurus A’s satellite system.
Large-Scale Coherent Vortex Formation in Two-Dimensional Turbulence

NASA Astrophysics Data System (ADS)

Orlov, A. V.; Brazhnikov, M. Yu.; Levchenko, A. A.

2018-04-01

The evolution of a vortex flow excited by an electromagnetic technique in a thin layer of a conducting liquid was studied experimentally. Small-scale vortices, excited at the pumping scale, merge with time due to the nonlinear interaction and produce large-scale structures, so that an inverse energy cascade is formed. The dependence of the energy spectrum in the developed inverse cascade is well described by the Kraichnan law $k^{-5/3}$. At large scales, the inverse cascade is limited by cell sizes, and a large-scale coherent vortex flow is formed, which occupies almost the entire area of the experimental cell. The radial profile of the azimuthal velocity of the coherent vortex immediately after the pumping was switched off has been established for the first time. Inside the vortex core, the azimuthal velocity grows linearly along a radius and reaches a constant value outside the core, which agrees well with the theoretical prediction.
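As an illustration of how a scaling exponent such as the -5/3 quoted in this abstract might be checked against data, the sketch below fits a straight line to log E versus log k by ordinary least squares. It is not taken from the paper: the function name, the synthetic spectrum, and the prefactor 2.5 are assumptions made purely for this example.

```python
import math


def fit_power_law_exponent(k_values, e_values):
    """Return the least-squares slope of log(E) against log(k).

    For a spectrum of the form E(k) = C * k**a the slope equals a.
    """
    xs = [math.log(k) for k in k_values]
    ys = [math.log(e) for e in e_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic, noise-free spectrum following E(k) = 2.5 * k**(-5/3).
# The wavenumbers and the prefactor are illustrative, not measured values.
wavenumbers = [1.0, 2.0, 4.0, 8.0, 16.0]
spectrum = [2.5 * k ** (-5.0 / 3.0) for k in wavenumbers]

slope = fit_power_law_exponent(wavenumbers, spectrum)
print(f"fitted exponent: {slope:.4f}")
```

With measured spectra the fit would normally be restricted to the wavenumber range of the developed cascade, since forcing and cell-size effects distort both ends of the spectrum.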
ms_lims, a simple yet powerful open source laboratory information management system for MS-driven proteomics.

PubMed

Helsens, Kenny; Colaert, Niklaas; Barsnes, Harald; Muth, Thilo; Flikka, Kristian; Staes, An; Timmerman, Evy; Wortelkamp, Steffi; Sickmann, Albert; Vandekerckhove, Joël; Gevaert, Kris; Martens, Lennart

2010-03-01

MS-based proteomics produces large amounts of mass spectra that require processing, identification and possibly quantification before interpretation can be undertaken. High-throughput studies require automation of these various steps, and management of the data in association with the results obtained. We here present ms_lims (http://genesis.UGent.be/ms_lims), a freely available, open-source system based on a central database to automate data management and processing in MS-driven proteomics analyses.
Application of oncoproteomics to aberrant signalling networks in changing the treatment paradigm in acute lymphoblastic leukaemia.

PubMed

López Villar, Elena; Wang, Xiangdong; Madero, Luis; Cho, William C

2015-01-01

Oncoproteomics is an important innovation in the early diagnosis, management and development of personalized treatment of acute lymphoblastic leukaemia (ALL). Because the inherent risk factors are not completely known (e.g. age or family history, radiation exposure, benzene chemical exposure, certain viral exposures such as infection with the human T-cell lymphoma/leukaemia virus-1, as well as some inherited syndromes may raise the risk of ALL), susceptibility to therapy may differ for each ALL patient. Indeed, we consider that these unknown inherent factors could be explained by coupling cytogenetics with proteomics, especially since proteins are the molecules that carry out functions within cells. Applying innovative proteomics to ALL therapy may help to understand the mechanisms of drug resistance and toxicity, which in turn will provide some leads to improve ALL management. Most important of these are shotgun proteomic strategies to unravel ALL aberrant signalling networks. Some shotgun proteomic innovations and bioinformatic tools for ALL therapies will be discussed. As network proteins are distinctive characteristics for ALL patients, unrevealed by cytogenetics, those network proteins are currently an important source of novel therapeutic targets that emerge from shotgun proteomics. Indeed, ALL evolution can be studied for each individual patient via oncoproteomics. © 2014 The Authors. Journal of Cellular and Molecular Medicine published by John Wiley & Sons Ltd and Foundation for Cellular and Molecular Medicine.
Proteomics of the Human Placenta: Promises and Realities

PubMed Central

Robinson, J.M.; Ackerman, W.E.; Kniss, D.A.; Takizawa, T.; Vandré, D.D.

2015-01-01

Proteomics is an area of study that sets as its ultimate goal the global analysis of all of the proteins expressed in a biological system of interest. However, technical limitations currently hamper proteome-wide analyses of complex systems. In a more practical sense, a desired outcome of proteomics research is the translation of large protein data sets into formats that provide meaningful information regarding clinical conditions (e.g., biomarkers to serve as diagnostic and/or prognostic indicators of disease). Herein, we discuss placental proteomics by describing existing studies, pointing out their strengths and weaknesses. In so doing, we strive to inform investigators interested in this area of research about the current gap between hyperbolic promises and realities. Additionally, we discuss the utility of proteomics in discovery-based research, particularly as regards the capacity to unearth novel insights into placental biology. Importantly, when considering understudied systems such as the human placenta and diseases associated with abnormalities in placental function, proteomics can serve as a robust ‘shortcut’ to obtaining information unlikely to be garnered using traditional approaches. PMID:18222537
Alternatively Spliced Homologous Exons Have Ancient Origins and Are Highly Expressed at the Protein Level

PubMed Central

Abascal, Federico; Ezkurdia, Iakes; Rodriguez-Rivas, Juan; Rodriguez, Jose Manuel; del Pozo, Angela; Vázquez, Jesús; Valencia, Alfonso; Tress, Michael L.

2015-01-01

Alternative splicing of messenger RNA can generate a wide variety of mature RNA transcripts, and these transcripts may produce protein isoforms with diverse cellular functions. While there is much supporting evidence for the expression of alternative transcripts, the same is not true for the alternatively spliced protein products. Large-scale mass spectroscopy experiments have identified evidence of alternative splicing at the protein level, but with conflicting results. Here we carried out a rigorous analysis of the peptide evidence from eight large-scale proteomics experiments to assess the scale of alternative splicing that is detectable by high-resolution mass spectroscopy. We find fewer splice events than would be expected: we identified peptides for almost 64% of human protein coding genes, but detected just 282 splice events. This data suggests that most genes have a single dominant isoform at the protein level. Many of the alternative isoforms that we could identify were only subtly different from the main splice isoform. Very few of the splice events identified at the protein level disrupted functional domains, in stark contrast to the two thirds of splice events annotated in the human genome that would lead to the loss or damage of functional domains. The most striking result was that more than 20% of the splice isoforms we identified were generated by substituting one homologous exon for another. This is significantly more than would be expected from the frequency of these events in the genome. These homologous exon substitution events were remarkably conserved—all the homologous exons we identified evolved over 460 million years ago—and eight of the fourteen tissue-specific splice isoforms we identified were generated from homologous exons. The combination of proteomics evidence, ancient origin and tissue-specific splicing indicates that isoforms generated from homologous exons may have important cellular roles. PMID:26061177
A sense of life: computational and experimental investigations with models of biochemical and evolutionary processes.

PubMed

Mishra, Bud; Daruwala, Raoul-Sam; Zhou, Yi; Ugel, Nadia; Policriti, Alberto; Antoniotti, Marco; Paxia, Salvatore; Rejali, Marc; Rudra, Archisman; Cherepinsky, Vera; Silver, Naomi; Casey, William; Piazza, Carla; Simeoni, Marta; Barbano, Paolo; Spivak, Marina; Feng, Jiawu; Gill, Ofer; Venkatesh, Mysore; Cheng, Fang; Sun, Bing; Ioniata, Iuliana; Anantharaman, Thomas; Hubbard, E Jane Albert; Pnueli, Amir; Harel, David; Chandru, Vijay; Hariharan, Ramesh; Wigler, Michael; Park, Frank; Lin, Shih-Chieh; Lazebnik, Yuri; Winkler, Franz; Cantor, Charles R; Carbone, Alessandra; Gromov, Mikhael

2003-01-01

We collaborate in a research program aimed at creating a rigorous framework, experimental infrastructure, and computational environment for understanding, experimenting with, manipulating, and modifying a diverse set of fundamental biological processes at multiple scales and spatio-temporal modes. The novelty of our research is based on an approach that (i) requires coevolution of experimental science and theoretical techniques and (ii) exploits a certain universality in biology guided by a parsimonious model of evolutionary mechanisms operating at the genomic level and manifesting at the proteomic, transcriptomic, phylogenic, and other higher levels. Our current program in "systems biology" endeavors to marry large-scale biological experiments with the tools to ponder and reason about large, complex, and subtle natural systems. To achieve this ambitious goal, ideas and concepts are combined from many different fields: biological experimentation, applied mathematical modeling, computational reasoning schemes, and large-scale numerical and symbolic simulations. From a biological viewpoint, the basic issues are many: (i) understanding common and shared structural motifs among biological processes; (ii) modeling biological noise due to interactions among a small number of key molecules or loss of synchrony; (iii) explaining the robustness of these systems in spite of such noise; and (iv) cataloging multistatic behavior and adaptation exhibited by many biological processes.
Large-scale gene function analysis with the PANTHER classification system.

PubMed

Mi, Huaiyu; Muruganujan, Anushya; Casagrande, John T; Thomas, Paul D

2013-08-01

The PANTHER (protein annotation through evolutionary relationship) classification system (http://www.pantherdb.org/) is a comprehensive system that combines gene function, ontology, pathways and statistical analysis tools that enable biologists to analyze large-scale, genome-wide data from sequencing, proteomics or gene expression experiments. The system is built with 82 complete genomes organized into gene families and subfamilies, and their evolutionary relationships are captured in phylogenetic trees, multiple sequence alignments and statistical models (hidden Markov models or HMMs). Genes are classified according to their function in several different ways: families and subfamilies are annotated with ontology terms (Gene Ontology (GO) and PANTHER protein class), and sequences are assigned to PANTHER pathways. The PANTHER website includes a suite of tools that enable users to browse and query gene functions, and to analyze large-scale experimental data with a number of statistical tests. It is widely used by bench scientists, bioinformaticians, computer scientists and systems biologists. In the 2013 release of PANTHER (v.8.0), in addition to an update of the data content, we redesigned the website interface to improve both user experience and the system's analytical capability. This protocol provides a detailed description of how to analyze genome-wide experimental data with the PANTHER classification system.
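To make the idea of a statistical test on a gene list concrete, the following sketch computes an over-representation p-value from the upper tail of the hypergeometric distribution. This is a generic illustration of one common kind of test and is not claimed to be PANTHER's implementation; the function name and the toy counts are assumptions for this example only.

```python
from math import comb


def overrepresentation_p_value(population, annotated, sample, hits):
    """Upper-tail hypergeometric probability P(X >= hits).

    population: number of genes in the reference set
    annotated:  reference genes carrying the annotation of interest
    sample:     number of genes in the uploaded list
    hits:       genes in the list carrying the annotation
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, x) * comb(population - annotated, sample - x)
        for x in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers (hypothetical): 20 reference genes, 5 annotated,
# a list of 4 genes of which 2 carry the annotation.
p_value = overrepresentation_p_value(population=20, annotated=5, sample=4, hits=2)
print(f"p = {p_value:.4f}")
```

In practice such a test is run once per annotation term, so a multiple-testing correction is usually applied to the resulting p-values.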
IslandFAST: A Semi-numerical Tool for Simulating the Late Epoch of Reionization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xu, Yidong; Chen, Xuelei; Yue, Bin

2017-08-01

We present the algorithm and main results of our semi-numerical simulation, islandFAST, which was developed from 21cmFAST and designed for the late stage of reionization. The islandFAST simulation predicts the evolution and size distribution of the large-scale underdense neutral regions (neutral islands), and we find that the late Epoch of Reionization proceeds very fast, showing a characteristic scale of the neutral islands at each redshift. Using islandFAST, we compare the impact of two types of absorption systems, i.e., the large-scale underdense neutral islands versus small-scale overdense absorbers, in regulating the reionization process. The neutral islands dominate the morphology of the ionization field, while the small-scale absorbers dominate the mean-free path of ionizing photons, and also delay and prolong the reionization process. With our semi-numerical simulation, the evolution of the ionizing background can be derived self-consistently given a model for the small absorbers. The hydrogen ionization rate of the ionizing background is reduced by an order of magnitude in the presence of dense absorbers.
The bovine lactation genome: Insights into the evolution of mammalian milk

USDA-ARS's Scientific Manuscript database

The newly assembled Bos taurus genome sequence enables the linkage of bovine milk and lactation data with other mammalian genomes. Using publicly available milk proteome data and mammary expressed sequence tags, 197 milk protein genes and over 6,000 mammary genes were identified in the bovine genome...
An Extremely Halophilic Proteobacterium Combines a Highly Acidic Proteome with a Low Cytoplasmic Potassium Content

PubMed Central

Deole, Ratnakar; Challacombe, Jean; Raiford, Douglas W.; Hoff, Wouter D.

2013-01-01

Halophilic archaea accumulate molar concentrations of KCl in their cytoplasm as an osmoprotectant and have evolved highly acidic proteomes that function only at high salinity. We examined osmoprotection in the photosynthetic Proteobacteria Halorhodospira halophila and Halorhodospira halochloris. Genome sequencing and isoelectric focusing gel electrophoresis showed that the proteome of H. halophila is acidic. In line with this finding, H. halophila accumulated molar concentrations of KCl when grown in high salt medium as detected by x-ray microanalysis and plasma emission spectrometry. This result extends the taxonomic range of organisms using KCl as a main osmoprotectant to the Proteobacteria. The closely related organism H. halochloris does not exhibit an acidic proteome, matching its inability to accumulate K+. This observation indicates recent evolutionary changes in the osmoprotection strategy of these organisms. Upon growth of H. halophila in low salt medium, its cytoplasmic K+ content matches that of Escherichia coli, revealing an acidic proteome that can function in the absence of high cytoplasmic salt concentrations. These findings necessitate a reassessment of two central aspects of theories for understanding extreme halophiles. First, we conclude that proteome acidity is not driven by stabilizing interactions between K+ ions and acidic side chains but by the need for maintaining sufficient solvation and hydration of the protein surface at high salinity through strongly hydrated carboxylates. Second, we propose that obligate protein halophilicity is a non-adaptive property resulting from genetic drift in which constructive neutral evolution progressively incorporates weakly stabilizing K+-binding sites on an increasingly acidic protein surface. PMID:23144460

The evolution of airplanes

NASA Astrophysics Data System (ADS)

Bejan, A.; Charles, J. D.; Lorente, S.

2014-07-01

The prevailing view is that we cannot witness biological evolution because it occurred on a time scale immensely greater than our lifetime. Here, we show that we can witness evolution in our lifetime by watching the evolution of the flying human-and-machine species: the airplane. We document this evolution, and we also predict it based on a physics principle: the constructal law. We show that the airplanes must obey theoretical allometric rules that unite them with the birds and other animals. For example, the larger airplanes are faster, more efficient as vehicles, and have greater range. The engine mass is proportional to the body size: this scaling is analogous to animal design, where the mass of the motive organs (muscle, heart, lung) is proportional to the body size. Large or small, airplanes exhibit a proportionality between wing span and fuselage length, and between fuel load and body size. The animal-design counterparts of these features are evident. The view that emerges is that the evolution phenomenon is broader than biological evolution. The evolution of technology, river basins, and animal design is one phenomenon, and it belongs in physics.
Large-scale structure non-Gaussianities with modal methods

NASA Astrophysics Data System (ADS)

Schmittfull, Marcel

2016-10-01

Relying on a separable modal expansion of the bispectrum, the implementation of a fast estimator for the full bispectrum of a 3d particle distribution is presented. The computational cost of accurate bispectrum estimation is negligible relative to simulation evolution, so the bispectrum can be used as a standard diagnostic whenever the power spectrum is evaluated. As an application, the time evolution of gravitational and primordial dark matter bispectra was measured in a large suite of N-body simulations. The bispectrum shape changes characteristically when the cosmic web becomes dominated by filaments and halos, therefore providing a quantitative probe of 3d structure formation. Our measured bispectra are determined by ~ 50 coefficients, which can be used as fitting formulae in the nonlinear regime and for non-Gaussian initial conditions. We also compare the measured bispectra with predictions from the Effective Field Theory of Large Scale Structures (EFTofLSS).
A proteomic approach to obesity and type 2 diabetes

PubMed Central

López-Villar, Elena; Martos-Moreno, Gabriel Á; Chowen, Julie A; Okada, Shigeru; Kopchick, John J; Argente, Jesús

2015-01-01

The incidence of obesity and type 2 diabetes has increased dramatically, resulting in an increased interest in their biomedical relevance. However, the mechanisms that trigger the development of type 2 diabetes in obese patients remain largely unknown. Scientific, clinical and pharmaceutical communities are dedicating vast resources to unravel this issue by applying different omics tools. During the last decade, the advances in proteomic approaches and the Human Proteome Organization have opened and are opening a new door that may be helpful in the identification of patients at risk and to improve current therapies. Here, we briefly review some of the advances in our understanding of type 2 diabetes that have occurred through the application of proteomics. We also review, in detail, the current improvements in proteomic methodologies and new strategies that could be employed to further advance our understanding of this pathology. By applying these new proteomic advances, novel therapeutic and/or diagnostic protein targets will be discovered in the obesity/type 2 diabetes area. PMID:25960181
Methodologies and Perspectives of Proteomics Applied to Filamentous Fungi: From Sample Preparation to Secretome Analysis

PubMed Central

Bianco, Linda; Perrotta, Gaetano

2015-01-01

Filamentous fungi possess the extraordinary ability to digest complex biomasses and mineralize numerous xenobiotics, as a consequence of their aptitude for sensing the environment and regulating their intra- and extracellular proteins, producing drastic changes in proteome and secretome composition. Recent advancement in proteomic technologies offers an exciting opportunity to reveal the fluctuations of fungal proteins and enzymes responsible for their metabolic adaptation to a large variety of environmental conditions. Here, an overview of the most commonly used proteomic strategies will be provided; this paper will range from sample preparation to gel-free and gel-based proteomics, discussing pros and cons of each mentioned state-of-the-art technique. The main focus will be kept on filamentous fungi. Due to the biotechnological relevance of lignocellulose-degrading fungi, special attention will finally be given to their extracellular proteome, or secretome. Secreted proteins and enzymes will be discussed in relation to their involvement in bio-based processes, such as biomass deconstruction and mycoremediation. PMID:25775160
Methodologies and perspectives of proteomics applied to filamentous fungi: from sample preparation to secretome analysis.

PubMed

Bianco, Linda; Perrotta, Gaetano

2015-03-12

Filamentous fungi possess the extraordinary ability to digest complex biomasses and mineralize numerous xenobiotics, as a consequence of their aptitude for sensing the environment and regulating their intra- and extracellular proteins, producing drastic changes in proteome and secretome composition. Recent advancement in proteomic technologies offers an exciting opportunity to reveal the fluctuations of fungal proteins and enzymes responsible for their metabolic adaptation to a large variety of environmental conditions. Here, an overview of the most commonly used proteomic strategies will be provided; this paper will range from sample preparation to gel-free and gel-based proteomics, discussing pros and cons of each mentioned state-of-the-art technique. The main focus will be kept on filamentous fungi. Due to the biotechnological relevance of lignocellulose-degrading fungi, special attention will finally be given to their extracellular proteome, or secretome. Secreted proteins and enzymes will be discussed in relation to their involvement in bio-based processes, such as biomass deconstruction and mycoremediation.
Shaping biological knowledge: applications in proteomics.

PubMed

Lisacek, F; Chichester, C; Gonnet, P; Jaillet, O; Kappus, S; Nikitin, F; Roland, P; Rossier, G; Truong, L; Appel, R

2004-01-01

The central dogma of molecular biology has provided a meaningful principle for data integration in the field of genomics. In this context, integration reflects the known transitions from a chromosome to a protein sequence: transcription, intron splicing, exon assembly and translation. There is no such clear principle for integrating proteomics data, since the laws governing protein folding and interactivity are not quite understood. In our effort to bring together independent pieces of information relative to proteins in a biologically meaningful way, we assess the bias of bioinformatics resources and consequent approximations in the framework of small-scale studies. We analyse proteomics data while following both a data-driven (focus on proteins smaller than 10 kDa) and a hypothesis-driven (focus on whole bacterial proteomes) approach. These applications are potentially the source of specialized complements to classical biological ontologies.
Investigation of shear damage considering the evolution of anisotropy

NASA Astrophysics Data System (ADS)

Kweon, S.

2013-12-01

The damage that occurs in shear deformations in view of anisotropy evolution is investigated. It is widely believed in the mechanics research community that damage (or porosity) does not evolve (increase) in shear deformations since the hydrostatic stress in shear is zero. This paper proves that the above statement can be false in large deformations of simple shear. The simulation using the proposed anisotropic ductile fracture model (macro-scale) in this study indicates that hydrostatic stress becomes nonzero and (thus) porosity evolves (increases or decreases) in the simple shear deformation of anisotropic (orthotropic) materials. The simple shear simulation using a crystal plasticity based damage model (meso-scale) shows the same physics as manifested in the above macro-scale model: porosity evolves due to the grain-to-grain interaction, i.e., due to the evolution of anisotropy. Through a series of simple shear simulations, this study investigates the effect of the evolution of anisotropy, i.e., the rotation of the orthotropic axes, on the damage (porosity) evolution. The effect of the evolution of void orientation and void shape on the damage (porosity) evolution is investigated as well. It is found that the interaction among porosity, the matrix anisotropy and void orientation/shape plays a crucial role in the ductile damage of porous materials.
Recent advances in micro-scale and nano-scale high-performance liquid-phase chromatography for proteome research.

PubMed

Tao, Dingyin; Zhang, Lihua; Shan, Yichu; Liang, Zhen; Zhang, Yukui

2011-01-01

High-performance liquid chromatography-electrospray ionization tandem mass spectrometry (HPLC-ESI-MS-MS) is regarded as one of the most powerful techniques for separation and identification of proteins. Recently, much effort has been made to improve the separation capacity, detection sensitivity, and analysis throughput of micro- and nano-HPLC, by increasing column length, reducing column internal diameter, and using integrated techniques. Development of HPLC columns has also been rapid, as a result of the use of submicrometer packing materials and monolithic columns. All these innovations result in clearly improved performance of micro- and nano-HPLC for proteome research.
A New Scheme to Characterize and Identify Protein Ubiquitination Sites.

PubMed

Nguyen, Van-Nui; Huang, Kai-Yao; Huang, Chien-Hsun; Lai, K Robert; Lee, Tzong-Yi

2017-01-01

Protein ubiquitination, involving the conjugation of ubiquitin to lysine residues, serves as an important modulator of many cellular functions in eukaryotes. Recent advancements in proteomic technology have stimulated increasing interest in identifying ubiquitination sites. However, most computational tools for predicting ubiquitination sites are focused on small-scale data. With an increasing number of experimentally verified ubiquitination sites, we were motivated to design a predictive model for identifying lysine ubiquitination sites in large-scale proteome datasets. This work assessed not only single features, such as amino acid composition (AAC), amino acid pair composition (AAPC) and evolutionary information, but also the effectiveness of incorporating two or more features into a hybrid approach to model construction. The support vector machine (SVM) was applied to generate the prediction models for ubiquitination site identification. Evaluation by five-fold cross-validation showed that the SVM models learned from the combination of hybrid features delivered a better prediction performance. Additionally, a motif discovery tool, MDDLogo, was adopted to characterize the potential substrate motifs of ubiquitination sites. The SVM models integrating the MDDLogo-identified substrate motifs could yield an average accuracy of 68.70 percent. Furthermore, the independent testing result showed that the MDDLogo-clustered SVM models could provide a promising accuracy (78.50 percent) and perform better than other prediction tools. Two cases have demonstrated the effective prediction of ubiquitination sites with corresponding substrate motifs.
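The AAC and AAPC features named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the window length, the handling of non-standard residues, or the exact encoding, so the function names (`aac`, `aapc`, `hybrid_features`), the 20-letter alphabet ordering, and the choice to skip non-standard characters are all assumptions made for illustration. The resulting vectors are the kind of input an SVM could be trained on; the classifier itself is not shown.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, in an arbitrary but fixed order (assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(window: str) -> list[float]:
    """Amino acid composition: fraction of each standard residue in the window."""
    residues = [r for r in window.upper() if r in AMINO_ACIDS]
    counts = Counter(residues)
    total = len(residues) or 1  # avoid division by zero on an empty window
    return [counts[a] / total for a in AMINO_ACIDS]


def aapc(window: str) -> list[float]:
    """Amino acid pair composition: fraction of each of the 400 adjacent pairs."""
    seq = window.upper()
    # Keep only adjacent pairs in which both residues are standard (assumption).
    pairs = [
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in AMINO_ACIDS and seq[i + 1] in AMINO_ACIDS
    ]
    counts = Counter(pairs)
    total = len(pairs) or 1
    return [counts[a + b] / total for a, b in product(AMINO_ACIDS, repeat=2)]


def hybrid_features(window: str) -> list[float]:
    """Concatenate AAC (20 values) and AAPC (400 values) into one vector."""
    return aac(window) + aapc(window)
```

Calling `hybrid_features` on a sequence window centered on a candidate lysine gives a 420-dimensional vector. Simple concatenation is only one way to combine features into a hybrid representation; the abstract does not say which combination scheme was used.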
Development of a Highly Automated and Multiplexed Targeted Proteome Pipeline and Assay for 112 Rat Brain Synaptic Proteins

PubMed Central

Colangelo, Christopher M.; Ivosev, Gordana; Chung, Lisa; Abbott, Thomas; Shifman, Mark; Sakaue, Fumika; Cox, David; Kitchen, Rob R.; Burton, Lyle; Tate, Stephen A; Gulcicek, Erol; Bonner, Ron; Rinehart, Jesse; Nairn, Angus C.; Williams, Kenneth R.

2015-01-01

We present a comprehensive workflow for large scale (>1000 transitions/run) label-free LC-MRM proteome assays. Innovations include automated MRM transition selection, intelligent retention time scheduling (xMRM) that improves Signal/Noise by >2-fold, and automatic peak modeling. Improvements to data analysis include a novel Q/C metric, Normalized Group Area Ratio (NGAR), MLR normalization, weighted regression analysis, and data dissemination through the Yale Protein Expression Database. As a proof of principle we developed a robust 90 minute LC-MRM assay for Mouse/Rat Post-Synaptic Density (PSD) fractions which resulted in the routine quantification of 337 peptides from 112 proteins based on 15 observations per protein. Parallel analyses with stable isotope dilution peptide standards (SIS), demonstrate very high correlation in retention time (1.0) and protein fold change (0.94) between the label-free and SIS analyses. Overall, our first method achieved a technical CV of 11.4% with >97.5% of the 1697 transitions being quantified without user intervention, resulting in a highly efficient, robust, and single injection LC-MRM assay. PMID:25476245
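For readers unfamiliar with the metric, a technical coefficient of variation (CV) is conventionally the sample standard deviation of replicate measurements divided by their mean, expressed as a percentage. The sketch below computes it for one set of replicate peak areas. The function name `percent_cv` and the example numbers are hypothetical, and the abstract does not state how per-transition values were aggregated into the reported 11.4%, so no aggregation step is shown.

```python
from statistics import mean, stdev


def percent_cv(replicates: list[float]) -> float:
    """Coefficient of variation in percent: 100 * sample SD / mean."""
    if len(replicates) < 2:
        raise ValueError("at least two replicate measurements are needed")
    m = mean(replicates)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * stdev(replicates) / m


# Hypothetical peak areas for one transition measured in three technical replicates.
example_cv = percent_cv([90.0, 100.0, 110.0])
```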
Vascular biology: cellular and molecular profiling.

PubMed

Baird, Alison E; Wright, Violet L

2006-02-01

Our understanding of the mechanisms underlying cerebrovascular atherosclerosis has improved in recent years, but significant gaps remain. New insights into the vascular biological processes that result in ischemic stroke may come from cellular and molecular profiling studies of the peripheral blood. In recent cellular profiling studies, increased levels of a proinflammatory T-cell subset (CD4+CD28-) have been associated with stroke recurrence and death. Expansion of this T-cell subset may occur after ischemic stroke and be a pathogenic mechanism leading to recurrent stroke and death. Increases in certain phenotypes of endothelial cell microparticles have been found in stroke patients relative to controls, possibly indicating a state of increased vascular risk. Molecular profiling approaches include gene expression profiling and proteomic methods that permit large-scale analyses of the transcriptome and the proteome, respectively. Ultimately panels of genes and proteins may be identified that are predictive of stroke risk. Cellular and molecular profiling studies of the peripheral blood and of atherosclerotic plaques may also pave the way for the development of therapeutic agents for primary and secondary stroke prevention.
Nucleic Acids for Ultra-Sensitive Protein Detection

PubMed Central

Janssen, Kris P. F.; Knez, Karel; Spasic, Dragana; Lammertyn, Jeroen

2013-01-01

Major advancements in molecular biology and clinical diagnostics cannot be brought about strictly through the use of genomics-based methods. Improved methods for protein detection and proteomic screening are an absolute necessity to complement the wealth of information offered by novel, high-throughput sequencing technologies. Only then will it be possible to advance insights into clinical processes and to characterize the importance of specific protein biomarkers for disease detection or the realization of “personalized medicine”. Currently, however, large-scale proteomic information is still not as easily obtained as its genomic counterpart, mainly because traditional antibody-based technologies struggle to meet the stringent sensitivity and throughput requirements, whereas mass-spectrometry-based methods can be burdened by significant costs. However, recent years have seen the development of new biodetection strategies linking nucleic acids with existing antibody technology or replacing antibodies with oligonucleotide recognition elements altogether. These advancements have unlocked many new strategies to lower detection limits and dramatically increase throughput of protein detection assays. In this review, an overview of these new strategies will be given. PMID:23337338
Advances in crop proteomics: PTMs of proteins under abiotic stress.

PubMed

Wu, Xiaolin; Gong, Fangping; Cao, Di; Hu, Xiuli; Wang, Wei

2016-03-01

Under natural conditions, crop plants are frequently subjected to various abiotic environmental stresses such as drought and heat wave, which may become more prevalent in the coming decades. Plant acclimation and tolerance to an abiotic stress are always associated with significant changes in PTMs of specific proteins. PTMs are important for regulating protein function, subcellular localization and protein activity and stability. Studies of plant responses to abiotic stress at the PTMs level are essential to the process of plant phenotyping for crop improvement. The ability to identify and quantify PTMs on a large-scale will contribute to a detailed protein functional characterization that will improve our understanding of the processes of crop plant stress acclimation and stress tolerance acquisition. Hundreds of PTMs have been reported, but it is impossible to review all of the possible protein modifications. In this review, we briefly summarize several main types of PTMs regarding their characteristics and detection methods, review the advances in PTMs research of crop proteomics, and highlight the importance of specific PTMs in crop response to abiotic stress. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Progressive muscle proteome changes in a clinically relevant pig model of Duchenne muscular dystrophy.

PubMed

Fröhlich, Thomas; Kemter, Elisabeth; Flenkenthaler, Florian; Klymiuk, Nikolai; Otte, Kathrin A; Blutke, Andreas; Krause, Sabine; Walter, Maggie C; Wanke, Rüdiger; Wolf, Eckhard; Arnold, Georg J

2016-09-16

Duchenne muscular dystrophy (DMD) is caused by genetic deficiency of dystrophin and characterized by massive structural and functional changes of skeletal muscle tissue, leading to terminal muscle failure. We recently generated a novel genetically engineered pig model reflecting pathological hallmarks of human DMD better than the widely used mdx mouse. To get insight into the hierarchy of molecular derangements during DMD progression, we performed a proteome analysis of biceps femoris muscle samples from 2-day-old and 3-month-old DMD and wild-type (WT) pigs. The extent of proteome changes in DMD vs. WT muscle increased markedly with age, reflecting progression of the pathological changes. In 3-month-old DMD muscle, proteins related to muscle repair such as vimentin, nestin, desmin and tenascin C were found to be increased, whereas a large number of respiratory chain proteins were decreased in abundance in DMD muscle, indicating serious disturbances in aerobic energy production and a reduction of functional muscle tissue. The combination of proteome data for fiber type specific myosin heavy chain proteins and immunohistochemistry showed preferential degeneration of fast-twitch fiber types in DMD muscle. The stage-specific proteome changes detected in this large animal model of clinically severe muscular dystrophy provide novel molecular readouts for future treatment trials.
Precision medicine for psychopharmacology: a general introduction.

PubMed

Shin, Cheolmin; Han, Changsu; Pae, Chi-Un; Patkar, Ashwin A

2016-07-01

Precision medicine is an emerging medical model that can provide accurate diagnoses and tailored therapeutic strategies for patients based on data pertaining to genes, microbiomes, environment, family history and lifestyle. Here, we provide basic information about precision medicine and newly introduced concepts, such as the precision medicine ecosystem and big data processing, and omics technologies including pharmacogenomics, pharmacometabolomics, pharmacoproteomics, pharmacoepigenomics, connectomics and exposomics. The authors review the current state of omics in psychiatry and the future direction of psychopharmacology as it moves towards precision medicine. Expert commentary: Advances in precision medicine have been facilitated by achievements in multiple fields, including large-scale biological databases, powerful methods for characterizing patients (such as genomics, proteomics, metabolomics, diverse cellular assays, and even social networks and mobile health technologies), and computer-based tools for analyzing large amounts of data.
A new model for extinction and recolonization in two dimensions: quantifying phylogeography.

PubMed

Barton, Nicholas H; Kelleher, Jerome; Etheridge, Alison M

2010-09-01

Classical models of gene flow fail in three ways: they cannot explain large-scale patterns; they predict much more genetic diversity than is observed; and they assume that loosely linked genetic loci evolve independently. We propose a new model that deals with these problems. Extinction events kill some fraction of individuals in a region. These are replaced by offspring from a small number of parents, drawn from the preexisting population. This model of evolution forwards in time corresponds to a backwards model, in which ancestral lineages jump to a new location if they are hit by an event, and may coalesce with other lineages that are hit by the same event. We derive an expression for the identity in allelic state, and show that, over scales much larger than the largest event, this converges to the classical value derived by Wright and Malécot. However, rare events that cover large areas cause low genetic diversity, large-scale patterns, and correlations in ancestry between unlinked loci. © 2010 The Author(s). Journal compilation © 2010 The Society for the Study of Evolution.
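The forward-in-time step described in this abstract (an event kills a fraction of individuals in a region, and the dead are replaced by offspring of a few parents drawn from the preexisting population) can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the authors' model: the disc-shaped region, the per-individual death probability, the placement of each offspring at the position of the individual it replaces, and the names `apply_event`, `death_prob` and `n_parents` are all simplifications chosen here.

```python
import random

Individual = tuple[float, float, int]  # (x, y, allele label)


def apply_event(
    population: list[Individual],
    center: tuple[float, float],
    radius: float,
    death_prob: float,
    n_parents: int,
    rng: random.Random,
) -> list[Individual]:
    """Apply one extinction/recolonization event and return the new population."""
    cx, cy = center

    def inside(ind: Individual) -> bool:
        return (ind[0] - cx) ** 2 + (ind[1] - cy) ** 2 <= radius ** 2

    # Parents are drawn from the individuals present in the region before the event.
    region = [ind for ind in population if inside(ind)]
    if not region:
        return list(population)
    parents = rng.sample(region, min(n_parents, len(region)))

    new_population = []
    for ind in population:
        if inside(ind) and rng.random() < death_prob:
            # The individual dies; an offspring of a random parent takes its place.
            # Keeping the old position is an assumption that holds density constant.
            parent = rng.choice(parents)
            new_population.append((ind[0], ind[1], parent[2]))
        else:
            new_population.append(ind)
    return new_population
```

Repeating such events with a small `n_parents` and a large `radius` replaces many local lineages with copies of a few, which is the qualitative mechanism the abstract invokes for low diversity over large areas.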
Spider Transcriptomes Identify Ancient Large-Scale Gene Duplication Event Potentially Important in Silk Gland Evolution

PubMed Central

Clarke, Thomas H.; Garb, Jessica E.; Hayashi, Cheryl Y.; Arensburger, Peter; Ayoub, Nadia A.

2015-01-01

The evolution of specialized tissues with novel functions, such as the silk synthesizing glands in spiders, is likely an influential driver of adaptive success. Large-scale gene duplication events and subsequent paralog divergence are thought to be required for generating evolutionary novelty. Such an event has been proposed for spiders, but not tested. We de novo assembled transcriptomes from three cobweb weaving spider species. Based on phylogenetic analyses of gene families with representatives from each of the three species, we found numerous duplication events indicative of a whole genome or segmental duplication. We estimated the age of the gene duplications relative to several speciation events within spiders and arachnids and found that the duplications likely occurred after the divergence of scorpions (order Scorpionida) and spiders (order Araneae), but before the divergence of the spider suborders Mygalomorphae and Araneomorphae, near the evolutionary origin of spider silk glands. Transcripts that are expressed exclusively or primarily within black widow silk glands are more likely to have a paralog descended from the ancient duplication event and have elevated amino acid replacement rates compared with other transcripts. Thus, an ancient large-scale gene duplication event within the spider lineage was likely an important source of molecular novelty during the evolution of silk gland-specific expression. This duplication event may have provided genetic material for subsequent silk gland diversification in the true spiders (Araneomorphae). PMID:26058392
Contrasting effects of copper limitation on the photosynthetic apparatus in two strains of the open ocean diatom Thalassiosira oceanica

PubMed Central

Allen, Andrew E.; Foster, Leonard J.; Green, Beverley R.; Maldonado, Maria T.

2017-01-01

There is an intricate interaction between iron (Fe) and copper (Cu) physiology in diatoms. However, strategies to cope with low Cu are largely unknown. This study unveils the comprehensive restructuring of the photosynthetic apparatus in the diatom Thalassiosira oceanica (CCMP1003) in response to low Cu, at the physiological and proteomic level. The restructuring results in a shift from light harvesting for photochemistry, and ultimately for carbon fixation, to photoprotection, reducing carbon fixation and oxygen evolution. The observed decreases in the physiological parameters Fv/Fm, carbon fixation, and oxygen evolution, concomitant with increases in the antennae absorption cross section (σPSII), non-photochemical quenching (NPQ) and the conversion factor (φe:C/ηPSII) are in agreement with well documented cellular responses to low Fe. However, the underlying proteomic changes due to low Cu are very different from those elicited by low Fe. Low Cu induces a significant four-fold reduction in the Cu-containing photosynthetic electron carrier plastocyanin. The decrease in plastocyanin causes a bottleneck within the photosynthetic electron transport chain (ETC), ultimately leading to substantial stoichiometric changes. Namely, 2-fold reduction in both cytochrome b6f complex (cytb6f) and photosystem II (PSII), no change in the Fe-rich PSI and a 40- and 2-fold increase in proteins potentially involved in detoxification of reactive oxygen species (ferredoxin and ferredoxin:NADP+ reductase, respectively). Furthermore, we identify 48 light harvesting complex (LHC) proteins in the publicly available genome of T. oceanica and provide proteomic evidence for 33 of these. The change in the LHC composition within the antennae in response to low Cu underlines the shift from photochemistry to photoprotection in T. oceanica (CCMP1003). Interestingly, we also reveal very significant intra-specific strain differences. Another strain of T. oceanica (CCMP 1005) requires significantly higher Cu concentrations to sustain both its maximal and minimal growth rate compared to CCMP 1003. Under low Cu, CCMP 1005 decreases its growth rate, cell size, Chla and total protein per cell. We argue that the reduction in protein per cell is the main strategy to decrease its cellular Cu requirement, as none of the other parameters tested are affected. Differences between the two strains, as well as differences between the well documented responses to low Fe and those presented here in response to low Cu are discussed. PMID:28837661
Mass spectrometry-based proteomic exploration of the human immune system: focus on the inflammasome, global protein secretion, and T cells.

PubMed

Nyman, Tuula A; Lorey, Martina B; Cypryk, Wojciech; Matikainen, Sampsa

2017-05-01

The immune system is our defense system against microbial infections and tissue injury, and understanding how it works in detail is essential for developing drugs for different diseases. Mass spectrometry-based proteomics can provide in-depth information on the molecular mechanisms involved in immune responses. Areas covered: Summarized are the key immunology findings obtained with MS-based proteomics in the past five years, with a focus on inflammasome activation, global protein secretion, mucosal immunology, immunopeptidome and T cells. Special focus is on extracellular vesicle-mediated protein secretion and its role in immune responses. Expert commentary: Proteomics is an essential part of modern omics-scale immunology research. To date, MS-based proteomics has been used in immunology to study protein expression levels, their subcellular localization, secretion, post-translational modifications, and interactions in immune cells upon activation by different stimuli. These studies have made major contributions to understanding the molecular mechanisms involved in innate and adaptive immune responses. New developments in proteomics offer constantly novel possibilities for exploring the immune system. Examples of these techniques include mass cytometry and different MS-based imaging approaches which can be widely used in immunology.
Evidence-based green algal genomics reveals marine diversity and ancestral characteristics of land plants.

PubMed

van Baren, Marijke J; Bachy, Charles; Reistetter, Emily Nahas; Purvine, Samuel O; Grimwood, Jane; Sudek, Sebastian; Yu, Hang; Poirier, Camille; Deerinck, Thomas J; Kuo, Alan; Grigoriev, Igor V; Wong, Chee-Hong; Smith, Richard D; Callister, Stephen J; Wei, Chia-Lin; Schmutz, Jeremy; Worden, Alexandra Z

2016-03-31

Prasinophytes are widespread marine green algae that are related to plants. Cellular abundance of the prasinophyte Micromonas has reportedly increased in the Arctic due to climate-induced changes. Thus, studies of these unicellular eukaryotes are important for marine ecology and for understanding Viridiplantae evolution and diversification. We generated evidence-based Micromonas gene models using proteomics and RNA-Seq to improve prasinophyte genomic resources. First, sequences of four chromosomes in the 22 Mb Micromonas pusilla (CCMP1545) genome were finished. Comparison with the finished 21 Mb genome of Micromonas commoda (RCC299; named herein) shows they share ≤8,141 of ~10,000 protein-encoding genes, depending on the analysis method. Unlike RCC299 and other sequenced eukaryotes, CCMP1545 has two abundant repetitive intron types and a high percent (26 %) GC splice donors. Micromonas has more genus-specific protein families (19 %) than other genome sequenced prasinophytes (11 %). Comparative analyses using predicted proteomes from other prasinophytes reveal proteins likely related to scale formation and ancestral photosynthesis. Our studies also indicate that peptidoglycan (PG) biosynthesis enzymes have been lost in multiple independent events in select prasinophytes and plants. However, CCMP1545, polar Micromonas CCMP2099 and prasinophytes from other classes retain the entire PG pathway, like moss and glaucophyte algae. Surprisingly, multiple vascular plants also have the PG pathway, except the Penicillin-Binding Protein, and share a unique bi-domain protein potentially associated with the pathway. Alongside Micromonas experiments using antibiotics that halt bacterial PG biosynthesis, the findings highlight unrecognized phylogenetic complexity in PG-pathway retention and implicate a role in chloroplast structure or division in several extant Viridiplantae lineages. Extensive differences in gene loss and architecture between related prasinophytes underscore their divergence. PG biosynthesis genes from the cyanobacterial endosymbiont that became the plastid have been selectively retained in multiple plants and algae, implying a biological function. Our studies provide robust genomic resources for emerging model algae, advancing knowledge of marine phytoplankton and plant evolution.

Evidence-based green algal genomics reveals marine diversity and ancestral characteristics of land plants

DOE Office of Scientific and Technical Information (OSTI.GOV)

van Baren, Marijke J.; Bachy, Charles; Reistetter, Emily Nahas

Prasinophytes are widespread marine green algae that are related to plants. Abundance of the genus Micromonas has reportedly increased in the Arctic due to climate-induced changes. Thus, studies of these organisms are important for marine ecology and understanding Viridiplantae evolution and diversification. We generated evidence-based Micromonas gene models using proteomics and RNA-Seq to improve prasinophyte genomic resources. First, sequences of four chromosomes in the 22 Mb Micromonas pusilla (CCMP1545) genome were finished. Comparison with the finished 21 Mb Micromonas commoda (RCC299) shows they share ≤ 8,142 of ~10,000 protein-encoding genes, depending on the analysis method. Unlike RCC299 and other sequenced eukaryotes, CCMP1545 has two abundant repetitive intron types and a high percent (26%) GC splice donors. Micromonas has more genus-specific protein families (19%) than other genome sequenced prasinophytes (11%). Comparative analyses using predicted proteomes from other prasinophytes reveal proteins likely related to scale formation and ancestral photosynthesis. Our studies also indicate that peptidoglycan (PG) biosynthesis enzymes have been lost in multiple independent events in select prasinophytes and most plants. However, CCMP1545, polar Micromonas CCMP2099 and prasinophytes from other classes retain the entire PG pathway, like moss and glaucophyte algae. Multiple vascular plants that share a unique bi-domain protein also have the pathway, except the Penicillin-Binding-Protein. Alongside Micromonas experiments using antibiotics that halt bacterial PG biosynthesis, the findings highlight unrecognized phylogenetic complexity in the PG-pathway retention and implicate a role in chloroplast structure or division in several extant Viridiplantae lineages. Extensive differences in gene loss and architecture between related prasinophytes underscore their divergence. 
PG biosynthesis genes from the cyanobacterial endosymbiont that became the plastid have been selectively retained in some plants and algae, implying a biological function. As a result, our studies provide robust genomic resources for emerging model algae, advancing knowledge of marine phytoplankton and plant evolution.
Evidence-based green algal genomics reveals marine diversity and ancestral characteristics of land plants

DOE PAGES

van Baren, Marijke J.; Bachy, Charles; Reistetter, Emily Nahas; ...

2016-03-31

Prasinophytes are widespread marine green algae that are related to plants. Abundance of the genus Micromonas has reportedly increased in the Arctic due to climate-induced changes. Thus, studies of these organisms are important for marine ecology and understanding Viridiplantae evolution and diversification. We generated evidence-based Micromonas gene models using proteomics and RNA-Seq to improve prasinophyte genomic resources. First, sequences of four chromosomes in the 22 Mb Micromonas pusilla (CCMP1545) genome were finished. Comparison with the finished 21 Mb Micromonas commoda (RCC299) shows they share ≤ 8,142 of ~10,000 protein-encoding genes, depending on the analysis method. Unlike RCC299 and other sequenced eukaryotes, CCMP1545 has two abundant repetitive intron types and a high percent (26%) GC splice donors. Micromonas has more genus-specific protein families (19%) than other genome sequenced prasinophytes (11%). Comparative analyses using predicted proteomes from other prasinophytes reveal proteins likely related to scale formation and ancestral photosynthesis. Our studies also indicate that peptidoglycan (PG) biosynthesis enzymes have been lost in multiple independent events in select prasinophytes and most plants. However, CCMP1545, polar Micromonas CCMP2099 and prasinophytes from other classes retain the entire PG pathway, like moss and glaucophyte algae. Multiple vascular plants that share a unique bi-domain protein also have the pathway, except the Penicillin-Binding-Protein. Alongside Micromonas experiments using antibiotics that halt bacterial PG biosynthesis, the findings highlight unrecognized phylogenetic complexity in the PG-pathway retention and implicate a role in chloroplast structure or division in several extant Viridiplantae lineages. Extensive differences in gene loss and architecture between related prasinophytes underscore their divergence. 
PG biosynthesis genes from the cyanobacterial endosymbiont that became the plastid have been selectively retained in some plants and algae, implying a biological function. As a result, our studies provide robust genomic resources for emerging model algae, advancing knowledge of marine phytoplankton and plant evolution.
Multiplexed and scalable super-resolution imaging of three-dimensional protein localization in size-adjustable tissues.

PubMed

Ku, Taeyun; Swaney, Justin; Park, Jeong-Yoon; Albanese, Alexandre; Murray, Evan; Cho, Jae Hun; Park, Young-Gyun; Mangena, Vamsi; Chen, Jiapei; Chung, Kwanghun

2016-09-01

The biology of multicellular organisms is coordinated across multiple size scales, from the subnanoscale of molecules to the macroscale, tissue-wide interconnectivity of cell populations. Here we introduce a method for super-resolution imaging of the multiscale organization of intact tissues. The method, called magnified analysis of the proteome (MAP), linearly expands entire organs fourfold while preserving their overall architecture and three-dimensional proteome organization. MAP is based on the observation that preventing crosslinking within and between endogenous proteins during hydrogel-tissue hybridization allows for natural expansion upon protein denaturation and dissociation. The expanded tissue preserves its protein content, its fine subcellular details, and its organ-scale intercellular connectivity. We use off-the-shelf antibodies for multiple rounds of immunolabeling and imaging of a tissue's magnified proteome, and our experiments demonstrate a success rate of 82% (100/122 antibodies tested). We show that specimen size can be reversibly modulated to image both inter-regional connections and fine synaptic architectures in the mouse brain.
Rapid Evolution of Beta-Keratin Genes Contribute to Phenotypic Differences That Distinguish Turtles and Birds from Other Reptiles

PubMed Central

Li, Yang I.; Kong, Lesheng; Ponting, Chris P.; Haerty, Wilfried

2013-01-01

Sequencing of vertebrate genomes permits changes in distinct protein families, including gene gains and losses, to be ascribed to lineage-specific phenotypes. A prominent example of this is the large-scale duplication of beta-keratin genes in the ancestors of birds, which was crucial to the subsequent evolution of their beaks, claws, and feathers. Evidence suggests that the shell of Pseudemys nelsoni contains at least 16 beta-keratin proteins, but it is unknown whether this is a complete set and whether their corresponding genes are orthologous to avian beak, claw, or feather beta-keratin genes. To address these issues and to better understand the evolution of the turtle shell at a molecular level, we surveyed the diversity of beta-keratin genes from the genome assemblies of three turtles, Chrysemys picta, Pelodiscus sinensis, and Chelonia mydas, which together represent over 160 Myr of chelonian evolution. For these three turtles, we found 200 beta-keratins, which indicates that, as for birds, a large expansion of beta-keratin genes in turtles occurred concomitantly with the evolution of a unique phenotype, namely, their plastron and carapace. Phylogenetic reconstruction of beta-keratin gene evolution suggests that separate waves of gene duplication within a single genomic location gave rise to scales, claws, and feathers in birds, and independently the scutes of the shell in turtles. PMID:23576313
The king cobra genome reveals dynamic gene evolution and adaptation in the snake venom system.

PubMed

Vonk, Freek J; Casewell, Nicholas R; Henkel, Christiaan V; Heimberg, Alysha M; Jansen, Hans J; McCleary, Ryan J R; Kerkkamp, Harald M E; Vos, Rutger A; Guerreiro, Isabel; Calvete, Juan J; Wüster, Wolfgang; Woods, Anthony E; Logan, Jessica M; Harrison, Robert A; Castoe, Todd A; de Koning, A P Jason; Pollock, David D; Yandell, Mark; Calderon, Diego; Renjifo, Camila; Currier, Rachel B; Salgado, David; Pla, Davinia; Sanz, Libia; Hyder, Asad S; Ribeiro, José M C; Arntzen, Jan W; van den Thillart, Guido E E J M; Boetzer, Marten; Pirovano, Walter; Dirks, Ron P; Spaink, Herman P; Duboule, Denis; McGlinn, Edwina; Kini, R Manjunatha; Richardson, Michael K

2013-12-17

Snakes are limbless predators, and many species use venom to help overpower relatively large, agile prey. Snake venoms are complex protein mixtures encoded by several multilocus gene families that function synergistically to cause incapacitation. To examine venom evolution, we sequenced and interrogated the genome of a venomous snake, the king cobra (Ophiophagus hannah), and compared it, together with our unique transcriptome, microRNA, and proteome datasets from this species, with data from other vertebrates. In contrast to the platypus, the only other venomous vertebrate with a sequenced genome, we find that snake toxin genes evolve through several distinct co-option mechanisms and exhibit surprisingly variable levels of gene duplication and directional selection that correlate with their functional importance in prey capture. The enigmatic accessory venom gland shows a very different pattern of toxin gene expression from the main venom gland and seems to have recruited toxin-like lectin genes repeatedly for new nontoxic functions. In addition, tissue-specific microRNA analyses suggested the co-option of core genetic regulatory components of the venom secretory system from a pancreatic origin. Although the king cobra is limbless, we recovered coding sequences for all Hox genes involved in amniote limb development, with the exception of Hoxd12. Our results provide a unique view of the origin and evolution of snake venom and reveal multiple genome-level adaptive responses to natural selection in this complex biological weapon system. More generally, they provide insight into mechanisms of protein evolution under strong selection.
Broad-scale phylogenomics provides insights into retrovirus–host evolution

PubMed Central

Hayward, Alexander; Grabherr, Manfred; Jern, Patric

2013-01-01

Genomic data provide an excellent resource to improve understanding of retrovirus evolution and the complex relationships among viruses and their hosts. In conjunction with broad-scale in silico screening of vertebrate genomes, this resource offers an opportunity to complement data on the evolution and frequency of past retroviral spread and so evaluate future risks and limitations for horizontal transmission between different host species. Here, we develop a methodology for extracting phylogenetic signal from large endogenous retrovirus (ERV) datasets by collapsing information to facilitate broad-scale phylogenomics across a wide sample of hosts. Starting with nearly 90,000 ERVs from 60 vertebrate host genomes, we construct phylogenetic hypotheses and draw inferences regarding the designation, host distribution, origin, and transmission of the Gammaretrovirus genus and associated class I ERVs. Our results uncover remarkable depths in retroviral sequence diversity, supported within a phylogenetic context. This finding suggests that current infectious exogenous retrovirus diversity may be underestimated, adding credence to the possibility that many additional exogenous retroviruses may remain to be discovered in vertebrate taxa. We demonstrate a history of frequent horizontal interorder transmissions from a rodent reservoir and suggest that rats may have acted as important overlooked facilitators of gammaretrovirus spread across diverse mammalian hosts. Together, these results demonstrate the promise of the methodology used here to analyze large ERV datasets and improve understanding of retroviral evolution and diversity for utilization in wider applications. PMID:24277832
Broad-scale phylogenomics provides insights into retrovirus-host evolution.

PubMed

Hayward, Alexander; Grabherr, Manfred; Jern, Patric

2013-12-10

Genomic data provide an excellent resource to improve understanding of retrovirus evolution and the complex relationships among viruses and their hosts. In conjunction with broad-scale in silico screening of vertebrate genomes, this resource offers an opportunity to complement data on the evolution and frequency of past retroviral spread and so evaluate future risks and limitations for horizontal transmission between different host species. Here, we develop a methodology for extracting phylogenetic signal from large endogenous retrovirus (ERV) datasets by collapsing information to facilitate broad-scale phylogenomics across a wide sample of hosts. Starting with nearly 90,000 ERVs from 60 vertebrate host genomes, we construct phylogenetic hypotheses and draw inferences regarding the designation, host distribution, origin, and transmission of the Gammaretrovirus genus and associated class I ERVs. Our results uncover remarkable depths in retroviral sequence diversity, supported within a phylogenetic context. This finding suggests that current infectious exogenous retrovirus diversity may be underestimated, adding credence to the possibility that many additional exogenous retroviruses may remain to be discovered in vertebrate taxa. We demonstrate a history of frequent horizontal interorder transmissions from a rodent reservoir and suggest that rats may have acted as important overlooked facilitators of gammaretrovirus spread across diverse mammalian hosts. Together, these results demonstrate the promise of the methodology used here to analyze large ERV datasets and improve understanding of retroviral evolution and diversity for utilization in wider applications.
The timetable of evolution

PubMed Central

Knoll, Andrew H.; Nowak, Martin A.

2017-01-01

The integration of fossils, phylogeny, and geochronology has resulted in an increasingly well-resolved timetable of evolution. Life appears to have taken root before the earliest known minimally metamorphosed sedimentary rocks were deposited, but for a billion years or more, evolution played out beneath an essentially anoxic atmosphere. Oxygen concentrations in the atmosphere and surface oceans first rose in the Great Oxygenation Event (GOE) 2.4 billion years ago, and a second increase beginning in the later Neoproterozoic Era [Neoproterozoic Oxygenation Event (NOE)] established the redox profile of modern oceans. The GOE facilitated the emergence of eukaryotes, whereas the NOE is associated with large and complex multicellular organisms. Thus, the GOE and NOE are fundamental pacemakers for evolution. On the time scale of Earth’s entire 4 billion–year history, the evolutionary dynamics of the planet’s biosphere appears to be fast, and the pace of evolution is largely determined by physical changes of the planet. However, in Phanerozoic ecosystems, interactions between new functions enabled by the accumulation of characters in a complex regulatory environment and changing biological components of effective environments appear to have an important influence on the timing of evolutionary innovations. On the much shorter time scale of transient environmental perturbations, such as those associated with mass extinctions, rates of genetic accommodation may have been limiting for life. PMID:28560344
Punctuated equilibrium in the large-scale evolution of programming languages.

PubMed

Valverde, Sergi; Solé, Ricard V

2015-06-06

The analogies and differences between biological and cultural evolution have been explored by evolutionary biologists, historians, engineers and linguists alike. Two well-known domains of cultural change are language and technology. Both share some traits related to the evolution of species, but technological change is very difficult to study. A major challenge on our way towards a scientific theory of technological evolution is how to properly define evolutionary trees or clades and how to weight the role played by horizontal transfer of information. Here, we study the large-scale historical development of programming languages, which have deeply marked social and technological advances in the last half century. We analyse their historical connections using network theory and reconstructed phylogenetic networks. Using both data analysis and network modelling, it is shown that their evolution is highly uneven, marked by innovation events where new languages are created out of improved combinations of different structural components belonging to previous languages. These radiation events occur in a bursty pattern and are tied to novel technological and social niches. The method can be extrapolated to other systems and consistently captures the major classes of languages and the widespread horizontal design exchanges, revealing a punctuated evolutionary path. © 2015 The Author(s) Published by the Royal Society. All rights reserved.
Constraints on the power spectrum of the primordial density field from large-scale data - Microwave background and predictions of inflation

NASA Technical Reports Server (NTRS)

Kashlinsky, A.

1992-01-01

It is shown here that, by using galaxy catalog correlation data as input, measurements of microwave background radiation (MBR) anisotropies should soon be able to test two of the inflationary scenario's most basic predictions: (1) that the primordial density fluctuations produced were scale-invariant and (2) that the universe is flat. They should also be able to detect anisotropies of large-scale structure formed by gravitational evolution of density fluctuations present at the last scattering epoch. Computations of MBR anisotropies corresponding to the minimum of the large-scale variance of the MBR anisotropy are presented which favor an open universe with P(k) significantly different from the Harrison-Zeldovich spectrum predicted by most inflationary models.
Chimeric origins of ochrophytes and haptophytes revealed through an ancient plastid proteome

PubMed Central

Dorrell, Richard G; Gile, Gillian; McCallum, Giselle; Méheust, Raphaël; Bapteste, Eric P; Klinger, Christen M; Brillet-Guéguen, Loraine; Freeman, Katalina D; Richter, Daniel J; Bowler, Chris

2017-01-01

Plastids are supported by a wide range of proteins encoded within the nucleus and imported from the cytoplasm. These plastid-targeted proteins may originate from the endosymbiont, the host, or other sources entirely. Here, we identify and characterise 770 plastid-targeted proteins that are conserved across the ochrophytes, a major group of algae including diatoms, pelagophytes and kelps, that possess plastids derived from red algae. We show that the ancestral ochrophyte plastid proteome was an evolutionary chimera, with 25% of its phylogenetically tractable nucleus-encoded proteins deriving from green algae. We additionally show that functional mixing of host and plastid proteomes, such as through dual-targeting, is an ancestral feature of plastid evolution. Finally, we detect a clear phylogenetic signal from one ochrophyte subgroup, the lineage containing pelagophytes and dictyochophytes, in plastid-targeted proteins from another major algal lineage, the haptophytes. This may represent a possible serial endosymbiosis event deep in eukaryotic evolutionary history. DOI: http://dx.doi.org/10.7554/eLife.23717.001 PMID:28498102
Evolution and Structural Analyses of Glossina morsitans (Diptera; Glossinidae) Tetraspanins

PubMed Central

Murungi, Edwin K.; Kariithi, Henry M.; Adunga, Vincent; Obonyo, Meshack; Christoffels, Alan

2014-01-01

Tetraspanins are important conserved integral membrane proteins expressed in many organisms. Although there is limited knowledge about the full repertoire, evolution and structural characteristics of individual members in various organisms, data obtained so far show that tetraspanins play major roles in membrane biology, visual processing, memory, olfactory signal processing, and mechanosensory antennal inputs. Thus, these proteins are potential targets for control of insect pests. Here, we report that the genome of the tsetse fly, Glossina morsitans (Diptera: Glossinidae) encodes at least seventeen tetraspanins (GmTsps), all containing the signature features found in the tetraspanin superfamily members. Whereas six of the GmTsps have been previously reported, eleven could be classified as novel because their amino acid sequences do not map to characterized tetraspanins in the available protein databases. We present a model of the GmTsps by using GmTsp42Ed, whose presence and expression has been recently detected by transcriptomics and proteomics analyses of G. morsitans. Phylogenetically, the identified GmTsps segregate into three major clusters. Structurally, the GmTsps are largely similar to vertebrate tetraspanins. In view of the exploitation of tetraspanins by organisms for survival, these proteins are potential novel targets that could be addressed using specific antibodies, recombinant large extracellular loop (LEL) domains, small-molecule mimetics and siRNAs to combat African trypanosomiasis by killing the tsetse fly vector. PMID:26462947
Unexpected features of the dark proteome.

PubMed

Perdigão, Nelson; Heinrich, Julian; Stolte, Christian; Sabir, Kenneth S; Buckley, Michael J; Tabor, Bruce; Signal, Beth; Gloss, Brian S; Hammang, Christopher J; Rost, Burkhard; Schafferhans, Andrea; O'Donoghue, Seán I

2015-12-29

We surveyed the "dark" proteome, that is, regions of proteins never observed by experimental structure determination and inaccessible to homology modeling. For 546,000 Swiss-Prot proteins, we found that 44-54% of the proteome in eukaryotes and viruses was dark, compared with only ∼14% in archaea and bacteria. Surprisingly, most of the dark proteome could not be accounted for by conventional explanations, such as intrinsic disorder or transmembrane regions. Nearly half of the dark proteome comprised dark proteins, in which the entire sequence lacked similarity to any known structure. Dark proteins fulfill a wide variety of functions, but a subset showed distinct and largely unexpected features, such as association with secretion, specific tissues, the endoplasmic reticulum, disulfide bonding, and proteolytic cleavage. Dark proteins also had short sequence length, low evolutionary reuse, and few known interactions with other proteins. These results suggest new research directions in structural and computational biology.
Proteomic and oxi-proteomic response of apple to a compatible (P. expansum) and a non-host (P. digitatum) pathogen

USDA-ARS's Scientific Manuscript database

Despite the current use of chemical fungicides, Penicillium expansum is still one of the most devastating pathogens of pome fruit. In particular, P. expansum enters tissues through wounds, causing large economic losses worldwide. To obtain new rational and environmentally friendly control alternative...
Mass spectrometry for biomarker development

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wu, Chaochao; Liu, Tao; Baker, Erin Shammel

2015-06-19

Biomarkers potentially play a crucial role in early disease diagnosis, prognosis and targeted therapy. In the past decade, mass spectrometry based proteomics has become increasingly important in biomarker development due to large advances in technology and associated methods. This chapter mainly focuses on the application of broad (e.g. shotgun) proteomics in biomarker discovery and the utility of targeted proteomics in biomarker verification and validation. A range of mass spectrometry methodologies are discussed emphasizing their efficacy in the different stages in biomarker development, with a particular emphasis on blood biomarker development.
Inconsistencies in the red blood cell membrane proteome analysis: generation of a database for research and diagnostic applications

PubMed Central

Hegedűs, Tamás; Chaubey, Pururawa Mayank; Várady, György; Szabó, Edit; Sarankó, Hajnalka; Hofstetter, Lia; Roschitzki, Bernd; Sarkadi, Balázs

2015-01-01

Based on recent results, the determination of the easily accessible red blood cell (RBC) membrane proteins may provide new diagnostic possibilities for assessing mutations, polymorphisms or regulatory alterations in diseases. However, the analysis of the current mass spectrometry-based proteomics datasets and other major databases indicates inconsistencies—the results show large scattering and only a limited overlap for the identified RBC membrane proteins. Here, we applied membrane-specific proteomics studies in human RBC, compared these results with the data in the literature, and generated a comprehensive and expandable database using all available data sources. The integrated web database now refers to proteomic, genetic and medical databases as well, and contains an unexpected large number of validated membrane proteins previously thought to be specific for other tissues and/or related to major human diseases. Since the determination of protein expression in RBC provides a method to indicate pathological alterations, our database should facilitate the development of RBC membrane biomarker platforms and provide a unique resource to aid related further research and diagnostics. Database URL: http://rbcc.hegelab.org PMID:26078478
The amino acid's backup bone - storage solutions for proteomics facilities.

PubMed

Meckel, Hagen; Stephan, Christian; Bunse, Christian; Krafzik, Michael; Reher, Christopher; Kohl, Michael; Meyer, Helmut Erich; Eisenacher, Martin

2014-01-01

Proteomics methods, especially high-throughput mass spectrometry analysis have been continually developed and improved over the years. The analysis of complex biological samples produces large volumes of raw data. Data storage and recovery management pose substantial challenges to biomedical or proteomic facilities regarding backup and archiving concepts as well as hardware requirements. In this article we describe differences between the terms backup and archive with regard to manual and automatic approaches. We also introduce different storage concepts and technologies from transportable media to professional solutions such as redundant array of independent disks (RAID) systems, network attached storages (NAS) and storage area network (SAN). Moreover, we present a software solution, which we developed for the purpose of long-term preservation of large mass spectrometry raw data files on an object storage device (OSD) archiving system. Finally, advantages, disadvantages, and experiences from routine operations of the presented concepts and technologies are evaluated and discussed. This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan. Copyright © 2013. Published by Elsevier B.V.
The cosmic web in CosmoGrid void regions

NASA Astrophysics Data System (ADS)

Rieder, Steven; van de Weygaert, Rien; Cautun, Marius; Beygu, Burcu; Portegies Zwart, Simon

2016-10-01

We study the formation and evolution of the cosmic web, using the high-resolution CosmoGrid ΛCDM simulation. In particular, we investigate the evolution of the large-scale structure around void halo groups, and compare this to observations of the VGS-31 galaxy group, which consists of three interacting galaxies inside a large void. The structure around such haloes shows a great deal of tenuous structure, with most of such systems being embedded in intra-void filaments and walls. We use the Nexus+ algorithm to detect walls and filaments in CosmoGrid, and find them to be present and detectable at every scale. The void regions embed tenuous walls, which in turn embed tenuous filaments. We hypothesize that the void galaxy group of VGS-31 formed in such an environment.
An Introduction to Programming for Bioscientists: A Python-Based Primer

PubMed Central

Mura, Cameron

2016-01-01

Computing has revolutionized the biological sciences over the past several decades, such that virtually all contemporary research in molecular biology, biochemistry, and other biosciences utilizes computer programs. The computational advances have come on many fronts, spurred by fundamental developments in hardware, software, and algorithms. These advances have influenced, and even engendered, a phenomenal array of bioscience fields, including molecular evolution and bioinformatics; genome-, proteome-, transcriptome- and metabolome-wide experimental studies; structural genomics; and atomistic simulations of cellular-scale molecular assemblies as large as ribosomes and intact viruses. In short, much of post-genomic biology is increasingly becoming a form of computational biology. The ability to design and write computer programs is among the most indispensable skills that a modern researcher can cultivate. Python has become a popular programming language in the biosciences, largely because (i) its straightforward semantics and clean syntax make it a readily accessible first language; (ii) it is expressive and well-suited to object-oriented programming, as well as other modern paradigms; and (iii) the many available libraries and third-party toolkits extend the functionality of the core language into virtually every biological domain (sequence and structure analyses, phylogenomics, workflow management systems, etc.). This primer offers a basic introduction to coding, via Python, and it includes concrete examples and exercises to illustrate the language’s usage and capabilities; the main text culminates with a final project in structural bioinformatics. A suite of Supplemental Chapters is also provided. Starting with basic concepts, such as that of a “variable,” the Chapters methodically advance the reader to the point of writing a graphical user interface to compute the Hamming distance between two DNA sequences. PMID:27271528
An Introduction to Programming for Bioscientists: A Python-Based Primer.

PubMed

Ekmekci, Berk; McAnany, Charles E; Mura, Cameron

2016-06-01

Computing has revolutionized the biological sciences over the past several decades, such that virtually all contemporary research in molecular biology, biochemistry, and other biosciences utilizes computer programs. The computational advances have come on many fronts, spurred by fundamental developments in hardware, software, and algorithms. These advances have influenced, and even engendered, a phenomenal array of bioscience fields, including molecular evolution and bioinformatics; genome-, proteome-, transcriptome- and metabolome-wide experimental studies; structural genomics; and atomistic simulations of cellular-scale molecular assemblies as large as ribosomes and intact viruses. In short, much of post-genomic biology is increasingly becoming a form of computational biology. The ability to design and write computer programs is among the most indispensable skills that a modern researcher can cultivate. Python has become a popular programming language in the biosciences, largely because (i) its straightforward semantics and clean syntax make it a readily accessible first language; (ii) it is expressive and well-suited to object-oriented programming, as well as other modern paradigms; and (iii) the many available libraries and third-party toolkits extend the functionality of the core language into virtually every biological domain (sequence and structure analyses, phylogenomics, workflow management systems, etc.). This primer offers a basic introduction to coding, via Python, and it includes concrete examples and exercises to illustrate the language's usage and capabilities; the main text culminates with a final project in structural bioinformatics. A suite of Supplemental Chapters is also provided. Starting with basic concepts, such as that of a "variable," the Chapters methodically advance the reader to the point of writing a graphical user interface to compute the Hamming distance between two DNA sequences.
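The Hamming distance mentioned at the end of this abstract can be illustrated with a minimal Python sketch. This is not the primer's own code: the function name `hamming_distance` and the case-insensitive comparison are assumptions made here for illustration, and the graphical user interface the primer builds is omitted.

```python
def hamming_distance(seq_a: str, seq_b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    # The Hamming distance is only defined for sequences of equal length.
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    # Compare position by position, ignoring letter case (an assumption
    # made here so that "acgt" and "ACGT" count as identical).
    return sum(a != b for a, b in zip(seq_a.upper(), seq_b.upper()))


if __name__ == "__main__":
    # The two sequences differ at the third and sixth positions.
    print(hamming_distance("GATTACA", "GACTATA"))  # prints 2
```

Raising an error on unequal lengths, rather than silently truncating to the shorter sequence, keeps the sketch faithful to the definition; aligned sequences containing gaps would need separate handling.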
