functional genomics proteomics: Topics by Science.gov

Sample records for functional genomics proteomics

Systems biology definition of the core proteome of metabolism and expression is consistent with high-throughput data.

PubMed

Yang, Laurence; Tan, Justin; O'Brien, Edward J; Monk, Jonathan M; Kim, Donghyuk; Li, Howard J; Charusanti, Pep; Ebrahim, Ali; Lloyd, Colton J; Yurkovich, James T; Du, Bin; Dräger, Andreas; Thomas, Alex; Sun, Yuekai; Saunders, Michael A; Palsson, Bernhard O

2015-08-25

Finding the minimal set of gene functions needed to sustain life is of both fundamental and practical importance. Minimal gene lists have been proposed by using comparative genomics-based core proteome definitions. A definition of a core proteome that is supported by empirical data, is understood at the systems-level, and provides a basis for computing essential cell functions is lacking. Here, we use a systems biology-based genome-scale model of metabolism and expression to define a functional core proteome consisting of 356 gene products, accounting for 44% of the Escherichia coli proteome by mass based on proteomics data. This systems biology core proteome includes 212 genes not found in previous comparative genomics-based core proteome definitions, accounts for 65% of known essential genes in E. coli, and has 78% gene function overlap with minimal genomes (Buchnera aphidicola and Mycoplasma genitalium). Based on transcriptomics data across environmental and genetic backgrounds, the systems biology core proteome is significantly enriched in nondifferentially expressed genes and depleted in differentially expressed genes. Compared with the noncore, core gene expression levels are also similar across genetic backgrounds (two times higher Spearman rank correlation) and exhibit significantly more complex transcriptional and posttranscriptional regulatory features (40% more transcription start sites per gene, 22% longer 5'UTR). Thus, genome-scale systems biology approaches rigorously identify a functional core proteome needed to support growth. This framework, validated by using high-throughput datasets, facilitates a mechanistic understanding of systems-level core proteome function through in silico models; it de facto defines a paleome.
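The comparison of expression similarity above rests on the Spearman rank correlation. As a minimal sketch of that statistic (not the authors' pipeline), it can be computed by ranking each vector, giving tied values their average rank, and then taking the Pearson correlation of the ranks. The function names and the expression values below are hypothetical and serve only as illustration.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression levels of five genes in two genetic backgrounds.
strain_a = [10.0, 20.0, 30.0, 40.0, 50.0]
strain_b = [1.0, 4.0, 9.0, 16.0, 25.0]
rho = spearman(strain_a, strain_b)
```

Because the statistic depends only on ranks, any monotonic relationship between the two backgrounds yields a correlation of 1, which is why it suits expression data measured on different scales.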
Exploring the post-genomic world: differing explanatory and manipulatory functions of post-genomic sciences

PubMed Central

Holmes, Christina; Carlson, Siobhan M.; McDonald, Fiona; Jones, Mavis; Graham, Janice

2016-01-01

Richard Lewontin proposed that the ability of a scientific field to create a narrative for public understanding garners it social relevance. This article applies Lewontin's conceptual framework of the functions of science (manipulatory and explanatory) to compare and explain the current differences in perceived societal relevance of genetics/genomics and proteomics. We provide three examples to illustrate the social relevance and strong cultural narrative of genetics/genomics for which no counterpart exists for proteomics. We argue that the major difference between genetics/genomics and proteomics is that genomics has a strong explanatory function, due to the strong cultural narrative of heredity. Based on qualitative interviews and observations of proteomics conferences, we suggest that the nature of proteins, lack of public understanding, and theoretical complexity exacerbate this difference for proteomics. Lewontin's framework suggests that social scientists may find that omics sciences affect social relations in different ways than past analyses of genetics. PMID:27134568
Dynamic Adaptive Binning: An Improved Quantification Technique for NMR Spectroscopic Data

DTIC Science & Technology

2010-01-01

Reo 2002). Unlike proteomics and genomics that assess intermediate products, metabolomics assesses the end product of cellular function, metabolites...other proteomic, genomic, and metabolomic analyses, NMR spectroscopy is...Changes occurring at the level of genes and proteins (assessed by genomics and proteomics) may or may not influence a variety of cellular functions
An object model and database for functional genomics.

PubMed

Jones, Andrew; Hunt, Ela; Wastling, Jonathan M; Pizarro, Angel; Stoeckert, Christian J

2004-07-10

Large-scale functional genomics analysis is now feasible and presents significant challenges in data analysis, storage and querying. Data standards are required to enable the development of public data repositories and to improve data sharing. There is an established data format for microarrays (microarray gene expression markup language, MAGE-ML) and a draft standard for proteomics (PEDRo). We believe that all types of functional genomics experiments should be annotated in a consistent manner, and we hope to open up new ways of comparing multiple datasets used in functional genomics. We have created a functional genomics experiment object model (FGE-OM), developed from the microarray model, MAGE-OM and two models for proteomics, PEDRo and our own model (Gla-PSI-Glasgow Proposal for the Proteomics Standards Initiative). FGE-OM comprises three namespaces representing (i) the parts of the model common to all functional genomics experiments; (ii) microarray-specific components; and (iii) proteomics-specific components. We believe that FGE-OM should initiate discussion about the contents and structure of the next version of MAGE and the future of proteomics standards. A prototype database called RNA And Protein Abundance Database (RAPAD), based on FGE-OM, has been implemented and populated with data from microbial pathogenesis. FGE-OM and the RAPAD schema are available from http://www.gusdb.org/fge.html, along with a set of more detailed diagrams. RAPAD can be accessed by registration at the site.
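The three-namespace layout described above (common, microarray-specific, and proteomics-specific components) can be illustrated with a small class hierarchy. This is a hedged sketch only: every class and field name below is hypothetical and is not taken from the FGE-OM, MAGE-OM, or PEDRo schemas.

```python
from dataclasses import dataclass, field
from typing import List


# "Common" namespace: parts shared by all functional genomics experiments.
@dataclass
class Assay:
    sample_id: str


@dataclass
class Experiment:
    name: str
    organism: str
    assays: List[Assay] = field(default_factory=list)


# Microarray-specific namespace (hypothetical field).
@dataclass
class MicroarrayAssay(Assay):
    array_design: str = ""


# Proteomics-specific namespace (hypothetical field).
@dataclass
class ProteomicsAssay(Assay):
    separation_method: str = ""


# Hypothetical usage: one experiment annotated consistently across both technologies.
exp = Experiment(name="infection time course", organism="example pathogen")
exp.assays.append(MicroarrayAssay(sample_id="S1", array_design="design-A"))
exp.assays.append(ProteomicsAssay(sample_id="S1", separation_method="2D-PAGE"))

# Technology-specific records can be selected while sharing the common fields.
proteomics_only = [a for a in exp.assays if isinstance(a, ProteomicsAssay)]
```

Keeping the shared fields in a common base type is what lets datasets from different technologies be queried and compared through one interface.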
Proteomic analysis of Medulloblastoma reveals functional biology with translational potential.

PubMed

Rivero-Hinojosa, Samuel; Lau, Ling San; Stampar, Mojca; Staal, Jerome; Zhang, Huizhen; Gordish-Dressman, Heather; Northcott, Paul A; Pfister, Stefan M; Taylor, Michael D; Brown, Kristy J; Rood, Brian R

2018-06-07

Genomic characterization has begun to redefine diagnostic classifications of cancers. However, it remains a challenge to infer disease phenotypes from genomic alterations alone. To help realize the promise of genomics, we have performed a quantitative proteomics investigation using Stable Isotope Labeling by Amino Acids in Cell Culture (SILAC) and 41 tissue samples spanning the 4 genomically based subgroups of medulloblastoma and control cerebellum. We have identified and quantitated thousands of proteins across these groups and find that we are able to recapitulate the genomic subgroups based upon subgroup restricted and differentially abundant proteins while also identifying subgroup specific protein isoforms. Integrating our proteomic measurements with genomic data, we calculate a poor correlation between mRNA and protein abundance. Using EPIC 850 k methylation array data on the same tissues, we also investigate the influence of copy number alterations and DNA methylation on the proteome in an attempt to characterize the impact of these genetic features on the proteome. Reciprocally, we are able to use the proteome to identify which genomic alterations result in altered protein abundance and thus are most likely to impact biology. Finally, we are able to assemble protein-based pathways yielding potential avenues for clinical intervention. From these, we validate the EIF4F cap-dependent translation pathway as a novel druggable pathway in medulloblastoma. Thus, quantitative proteomics complements genomic platforms to yield a more complete understanding of functional tumor biology and identify novel therapeutic targets for medulloblastoma.
DEFINING THE MANDATE OF PROTEOMICS IN THE POST-GENOMIC ERA: WORKSHOP REPORT

EPA Science Inventory

Research in proteomics is the next step after genomics in understanding life processes at the molecular level. In the largest sense proteomics encompasses knowledge of the structure, function and expression of all proteins in the biochemical or biological contexts of all organism...
Proteomics in the genome engineering era.

PubMed

Vandemoortele, Giel; Gevaert, Kris; Eyckerman, Sven

2016-01-01

Genome engineering experiments used to be lengthy, inefficient, and often expensive, preventing a widespread adoption of such experiments for the full assessment of endogenous protein functions. With the revolutionary clustered regularly interspaced short palindromic repeats/CRISPR-associated protein 9 technology, genome engineering became accessible to the broad life sciences community and is now implemented in several research areas. One particular field that can benefit significantly from this evolution is proteomics, where a substantial impact on experimental design and general proteome biology can be expected. In this review, we describe the main applications of genome engineering in proteomics, including the use of engineered disease models and endogenous epitope tagging. In addition, we provide an overview of the current literature and highlight important considerations when launching genome engineering technologies in proteomics workflows.
Annotation of Protein Domains Reveals Remarkable Conservation in the Functional Make up of Proteomes Across Superkingdoms

PubMed Central

Nasir, Arshan; Naeem, Aisha; Khan, Muhammad Jawad; Lopez-Nicora, Horacio D.; Caetano-Anollés, Gustavo

2011-01-01

The functional repertoire of a cell is largely embodied in its proteome, the collection of proteins encoded in the genome of an organism. The molecular functions of proteins are the direct consequence of their structure and structure can be inferred from sequence using hidden Markov models of structural recognition. Here we analyze the functional annotation of protein domain structures in almost a thousand sequenced genomes, exploring the functional and structural diversity of proteomes. We find there is a remarkable conservation in the distribution of domains with respect to the molecular functions they perform in the three superkingdoms of life. In general, most of the protein repertoire is spent in functions related to metabolic processes but there are significant differences in the usage of domains for regulatory and extra-cellular processes both within and between superkingdoms. Our results support the hypotheses that the proteomes of superkingdom Eukarya evolved via genome expansion mechanisms that were directed towards innovating new domain architectures for regulatory and extra/intracellular process functions needed for example to maintain the integrity of multicellular structure or to interact with environmental biotic and abiotic factors (e.g., cell signaling and adhesion, immune responses, and toxin production). Proteomes of microbial superkingdoms Archaea and Bacteria retained fewer domains and maintained simple and smaller protein repertoires. Viruses appear to play an important role in the evolution of superkingdoms. Finally, we identify a few genomic outliers that deviate significantly from the conserved functional design. These include Nanoarchaeum equitans, proteobacterial symbionts of insects with extremely reduced genomes, Tenericutes and Guillardia theta. These organisms spend most of their domains on information functions, including translation and transcription, rather than on metabolism and harbor a domain repertoire characteristic of parasitic organisms. In contrast, the functional repertoire of the proteomes of the Planctomycetes-Verrucomicrobia-Chlamydiae superphylum was no different from that of the rest of bacteria, failing to support claims of them representing a separate superkingdom. In turn, Protista and Bacteria shared similar functional distribution patterns suggesting an ancestral evolutionary link between these groups. PMID:24710297
Combining genomic and proteomic approaches for epigenetics research

PubMed Central

Han, Yumiao; Garcia, Benjamin A

2014-01-01

Epigenetics is the study of changes in gene expression or cellular phenotype that do not change the DNA sequence. In this review, current methods, both genomic and proteomic, associated with epigenetics research are discussed. Among them, chromatin immunoprecipitation (ChIP) followed by sequencing and other ChIP-based techniques are powerful techniques for genome-wide profiling of DNA-binding proteins, histone post-translational modifications or nucleosome positions. However, mass spectrometry-based proteomics is increasingly being used in functional biological studies and has proved to be an indispensable tool to characterize histone modifications, as well as DNA–protein and protein–protein interactions. With the development of genomic and proteomic approaches, combination of ChIP and mass spectrometry has the potential to expand our knowledge of epigenetics research to a higher level. PMID:23895656
Proteomic approaches in brain research and neuropharmacology.

PubMed

Vercauteren, Freya G G; Bergeron, John J M; Vandesande, Frans; Arckens, Lut; Quirion, Rémi

2004-10-01

Numerous applications of genomic technologies have enabled the assembly of unprecedented inventories of genes, expressed in cells under specific physiological and pathophysiological conditions. Complementing the valuable information generated through functional genomics with the integrative knowledge of protein expression and function should enable the development of more efficient diagnostic tools and therapeutic agents. Proteomic analyses are particularly suitable to elucidate posttranslational modifications, expression levels and protein-protein interactions of thousands of proteins at a time. In this review, two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) investigations of brain tissues in neurodegenerative diseases such as Alzheimer's disease, Down syndrome and schizophrenia, and the construction of 2D-PAGE proteome maps of the brain are discussed. The role of the Human Proteome Organization (HUPO) as an international coordinating organization for proteomic efforts, as well as challenges for proteomic technologies and data analysis are also addressed. It is expected that the use of proteomic strategies will have significant impact in neuropharmacology over the coming decade.
The Proteome Folding Project: Proteome-scale prediction of structure and function

PubMed Central

Drew, Kevin; Winters, Patrick; Butterfoss, Glenn L.; Berstis, Viktors; Uplinger, Keith; Armstrong, Jonathan; Riffle, Michael; Schweighofer, Erik; Bovermann, Bill; Goodlett, David R.; Davis, Trisha N.; Shasha, Dennis; Malmström, Lars; Bonneau, Richard

2011-01-01

The incompleteness of proteome structure and function annotation is a critical problem for biologists and, in particular, severely limits interpretation of high-throughput and next-generation experiments. We have developed a proteome annotation pipeline based on structure prediction, where function and structure annotations are generated using an integration of sequence comparison, fold recognition, and grid-computing-enabled de novo structure prediction. We predict protein domain boundaries and three-dimensional (3D) structures for protein domains from 94 genomes (including human, Arabidopsis, rice, mouse, fly, yeast, Escherichia coli, and worm). De novo structure predictions were distributed on a grid of more than 1.5 million CPUs worldwide (World Community Grid). We generated significant numbers of new confident fold annotations (9% of domains that are otherwise unannotated in these genomes). We demonstrate that predicted structures can be combined with annotations from the Gene Ontology database to predict new and more specific molecular functions. PMID:21824995
Explore, Visualize, and Analyze Functional Cancer Proteomic Data Using the Cancer Proteome Atlas.

Cancer.gov

Reverse-phase protein arrays (RPPA) represent a powerful functional proteomic approach to elucidate cancer-related molecular mechanisms and to develop novel cancer therapies. To facilitate community-based investigation of the large-scale protein expression data generated by this platform, we have developed a user-friendly, open-access bioinformatic resource, The Cancer Proteome Atlas (TCPA, http://tcpaportal.org), which contains two separate web applications.
From proteomics to systems biology: MAPA, MASS WESTERN, PROMEX, and COVAIN as a user-oriented platform.

PubMed

Weckwerth, Wolfram; Wienkoop, Stefanie; Hoehenwarter, Wolfgang; Egelhofer, Volker; Sun, Xiaoliang

2014-01-01

Genome sequencing and systems biology are revolutionizing life sciences. Proteomics emerged as a fundamental technique of this novel research area as it is the basis for gene function analysis and modeling of dynamic protein networks. Here a complete proteomics platform suited for functional genomics and systems biology is presented. The strategy includes MAPA (mass accuracy precursor alignment; http://www.univie.ac.at/mosys/software.html) as a rapid exploratory analysis step; MASS WESTERN for targeted proteomics; COVAIN (http://www.univie.ac.at/mosys/software.html) for multivariate statistical analysis, data integration, and data mining; and PROMEX (http://www.univie.ac.at/mosys/databases.html) as a database module for proteogenomics and proteotypic peptides for targeted analysis. Moreover, the presented platform can also be utilized to integrate metabolomics and transcriptomics data for the analysis of metabolite-protein-transcript correlations and time course analysis using COVAIN. Examples for the integration of MAPA and MASS WESTERN data, proteogenomic and metabolic modeling approaches for functional genomics, phosphoproteomics by integration of MOAC (metal-oxide affinity chromatography) with MAPA, and the integration of metabolomics, transcriptomics, proteomics, and physiological data using this platform are presented. All software and step-by-step tutorials for data processing and data mining can be downloaded from http://www.univie.ac.at/mosys/software.html.
Trends in genome dynamics among major orders of insects revealed through variations in protein families.

PubMed

Rappoport, Nadav; Linial, Michal

2015-08-07

Insects belong to a class that accounts for the majority of animals on earth. With over one million identified species, insects display a huge diversity and occupy extreme environments. At present, there are dozens of fully sequenced insect genomes that cover a range of habitats, social behavior and morphologies. In view of such a diverse collection of genomes, revealing evolutionary trends and charting functional relationships of proteins remain challenging. We analyzed the relatedness of 17 complete proteomes representative of proteomes from insects including louse, bee, beetle, ants, flies and mosquitoes, as well as an out-group from the crustaceans. The analyzed proteomes mostly represented the orders of Hymenoptera and Diptera. The 287,405 protein sequences from the 18 proteomes were automatically clustered into 20,933 families, including 799 singletons. A comprehensive analysis based on statistical considerations identified the families that were significantly expanded or reduced in any of the studied organisms. Among all the tested species, ants are characterized by an exceptionally high rate of family gain and loss. By assigning annotations to hundreds of species-specific families, the functional diversity among species and between the major clades (Diptera and Hymenoptera) is revealed. We found that many species-specific families are associated with receptor signaling, stress-related functions and proteases. The highest variability among insects associates with the function of transposition and nucleic acids processes (collectively coined TNAP). Specifically, the wasp and ants have an order of magnitude more TNAP families and proteins relative to species that belong to Diptera (mosquitoes and flies). An unsupervised clustering methodology combined with a comparative functional analysis unveiled proteomic signatures in the major clades of winged insects. We propose that the expansion of TNAP families in Hymenoptera potentially contributes to the accelerated genome dynamics that characterize the wasp and ants.
Bacterial membrane proteomics.

PubMed

Poetsch, Ansgar; Wolters, Dirk

2008-10-01

About one quarter to one third of all bacterial genes encode proteins of the inner or outer bacterial membrane. These proteins perform essential physiological functions, such as the import or export of metabolites, the homeostasis of metal ions, the extrusion of toxic substances or antibiotics, and the generation or conversion of energy. Recent years have witnessed the completion of a plethora of whole-genome sequences of bacteria important for biotechnology or medicine, which is the foundation for proteome and other functional genome analyses. In this review, we discuss the challenges in membrane proteome analysis, starting from sample preparation and leading to MS-data analysis and quantification. The current state of available proteomics technologies as well as their advantages and disadvantages will be described with a focus on shotgun proteomics. Then, we will briefly introduce the most abundant proteins and protein families present in bacterial membranes before presenting bacterial membrane proteomics studies from recent years. It will be shown how these studies have enlarged our knowledge of the physiological adaptations that take place in bacteria during fine chemical production, bioremediation, protein overexpression, and during infections. Furthermore, several examples from the literature demonstrate the suitability of membrane proteomics for the identification of antigens and different pathogenic strains, as well as the elucidation of membrane protein structure and function.
Evolution of complexity in the zebrafish synapse proteome

PubMed Central

Bayés, Àlex; Collins, Mark O.; Reig-Viader, Rita; Gou, Gemma; Goulding, David; Izquierdo, Abril; Choudhary, Jyoti S.; Emes, Richard D.; Grant, Seth G. N.

2017-01-01

The proteome of human brain synapses is highly complex and is mutated in over 130 diseases. This complexity arose from two whole-genome duplications early in the vertebrate lineage. Zebrafish are used in modelling human diseases; however, their synapse proteome is uncharacterized, and whether the teleost-specific genome duplication (TSGD) influenced complexity is unknown. We report the characterization of the proteomes and ultrastructure of central synapses in zebrafish and analyse the importance of the TSGD. While the TSGD increases overall synapse proteome complexity, the postsynaptic density (PSD) proteome of zebrafish has lower complexity than that of mammals. A highly conserved set of ∼1,000 proteins is shared across vertebrates. PSD ultrastructural features are also conserved. Lineage-specific proteome differences indicate that vertebrate species evolved distinct synapse types and functions. The data sets are a resource for a wide range of studies and have important implications for the use of zebrafish in modelling human synaptic diseases. PMID:28252024
Pre-fractionation strategies to resolve pea (Pisum sativum) sub-proteomes

PubMed Central

Meisrimler, Claudia-Nicole; Menckhoff, Ljiljana; Kukavica, Biljana M.; Lüthje, Sabine

2015-01-01

Legumes are important crop plants and pea (Pisum sativum L.) has been investigated as a model with respect to several physiological aspects. The sequencing of the pea genome has not been completed. Therefore, proteomic approaches are currently limited. Nevertheless, the increasing numbers of available EST-databases as well as the high homology of the pea and medicago genome (Medicago truncatula Gaertner) allow the successful identification of proteins. Due to the un-sequenced pea genome, pre-fractionation approaches have been used in pea proteomic surveys in the past. Aside from a number of selective proteome studies on crude extracts and the chloroplast, few studies have targeted other components such as the pea secretome, an important sub-proteome of interest due to its role in abiotic and biotic stress processes. The secretome itself can be further divided into different sub-proteomes (plasma membrane, apoplast, cell wall proteins). Cell fractionation in combination with different gel-electrophoresis, chromatography methods and protein identification by mass spectrometry are important partners to gain insight into pea sub-proteomes, post-translational modifications and protein functions. Overall, pea proteomics needs to link numerous existing physiological and biochemical data to gain further insight into adaptation processes, which play important roles in field applications. Future developments and directions in pea proteomics are discussed. PMID:26539198
The proteome: structure, function and evolution

PubMed Central

Fleming, Keiran; Kelley, Lawrence A; Islam, Suhail A; MacCallum, Robert M; Muller, Arne; Pazos, Florencio; Sternberg, Michael J.E

2006-01-01

This paper reports two studies to model the inter-relationships between protein sequence, structure and function. First, an automated pipeline to provide a structural annotation of proteomes in the major genomes is described. The results are stored in a database at Imperial College, London (3D-GENOMICS) that can be accessed at www.sbg.bio.ic.ac.uk. Analysis of the assignments to structural superfamilies provides evolutionary insights. 3D-GENOMICS is being integrated with related proteome annotation data at University College London and the European Bioinformatics Institute in a project known as e-protein (http://www.e-protein.org/). The second topic is motivated by the developments in structural genomics projects in which the structure of a protein is determined prior to knowledge of its function. We have developed a new approach PHUNCTIONER that uses the gene ontology (GO) classification to supervise the extraction of the sequence signal responsible for protein function from a structure-based sequence alignment. Using GO we can obtain profiles for a range of specificities described in the ontology. In the region of low sequence similarity (around 15%), our method is more accurate than assignment from the closest structural homologue. The method is also able to identify the specific residues associated with the function of the protein family. PMID:16524832
ProGeRF: Proteome and Genome Repeat Finder Utilizing a Fast Parallel Hash Function

PubMed Central

Moraes, Walas Jhony Lopes; Rodrigues, Thiago de Souza; Bartholomeu, Daniella Castanheira

2015-01-01

Repetitive element sequences are adjacent, repeating patterns, also called motifs, and can be of different lengths; repetitions can involve their exact or approximate copies. They have been widely used as molecular markers in population biology. Given the sizes of sequenced genomes, various bioinformatics tools have been developed for the extraction of repetitive elements from DNA sequences. However, currently available tools do not provide options for identifying repetitive elements in the genome or proteome, displaying a user-friendly web interface, and performing exhaustive searches. ProGeRF is a web site for extracting repetitive regions from genome and proteome sequences. It was designed to be an efficient, fast, accurate and, above all, user-friendly web tool that offers many ways to view and analyse the results. ProGeRF (Proteome and Genome Repeat Finder) is freely available as a stand-alone program, from which the users can download the source code, and as a web tool. It was developed using the hash table approach to extract perfect and imperfect repetitive regions in a (multi)FASTA file, while allowing a linear time complexity. PMID:25811026
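To make the hash-table idea above concrete, the following is a minimal sketch that indexes every k-mer of a sequence in a dictionary and then reports motifs that occur back to back. It is not ProGeRF's actual algorithm: the function names are hypothetical, only perfect repeats are handled, and the cost is roughly linear in sequence length only for a fixed motif length.

```python
from collections import defaultdict


def parse_fasta(text):
    """Parse (multi)FASTA text into a dict mapping header to sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {h: "".join(parts) for h, parts in records.items()}


def index_kmers(seq, k):
    """Hash table from each k-mer to the list of positions where it starts."""
    table = defaultdict(list)
    for i in range(len(seq) - k + 1):
        table[seq[i:i + k]].append(i)
    return table


def tandem_repeats(seq, k, min_copies=2):
    """Return sorted (start, motif, copies) for k-mers repeated back to back."""
    hits = []
    for motif, positions in index_kmers(seq, k).items():
        pos_set = set(positions)
        for start in positions:
            if start - k in pos_set:
                continue  # not the first copy of this run
            copies = 1
            while start + copies * k in pos_set:
                copies += 1
            if copies >= min_copies:
                hits.append((start, motif, copies))
    return sorted(hits)


# Hypothetical input: one record split over two lines.
records = parse_fasta(">seq1\nGGACAC\nACTT\n")
repeats = tandem_repeats(records["seq1"], k=2)
```

Note that rotations of the same motif (here AC and CA) are reported as separate hits; a fuller tool would merge them and would also tolerate mismatches to capture imperfect repeats.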

Spermatogenesis in mammals: proteomic insights.

PubMed

Chocu, Sophie; Calvel, Pierre; Rolland, Antoine D; Pineau, Charles

2012-08-01

Spermatogenesis is a highly sophisticated process involved in the transmission of genetic heritage. It includes halving ploidy, repackaging of the chromatin for transport, and the equipment of developing spermatids and eventually spermatozoa with the advanced apparatus (e.g., tightly packed mitochondrial sheath in the midpiece, elongation of the tail, reduction of cytoplasmic volume) to elicit motility once they reach the epididymis. Mammalian spermatogenesis is divided into three phases. In the first the primitive germ cells or spermatogonia undergo a series of mitotic divisions. In the second the spermatocytes undergo two consecutive divisions in meiosis to produce haploid spermatids. In the third the spermatids differentiate into spermatozoa in a process called spermiogenesis. Paracrine, autocrine, juxtacrine, and endocrine pathways all contribute to the regulation of the process. The array of structural elements and chemical factors modulating somatic and germ cell activity is such that the network linking the various cellular activities during spermatogenesis is unimaginably complex. Over the past two decades, advances in genomics have greatly improved our knowledge of spermatogenesis, by identifying numerous genes essential for the development of functional male gametes. Large-scale analyses of testicular function have deepened our insight into normal and pathological spermatogenesis. Progress in genome sequencing and microarray technology has been exploited for genome-wide expression studies, leading to the identification of hundreds of genes differentially expressed within the testis. However, although proteomics has now come of age, the proteomics-based investigation of spermatogenesis remains in its infancy. Here, we review the state of the art in large-scale proteomic analyses of spermatogenesis, from germ cell development during sex determination to spermatogenesis in the adult. Indeed, a few laboratories have undertaken differential protein profiling expression studies and/or systematic analyses of testicular proteomes in entire organs or isolated cells from various species. We consider the pros and cons of proteomics for studying the testicular germ cell gene expression program. Finally, we address the use of protein datasets, through integrative genomics (i.e., combining genomics, transcriptomics, and proteomics), bioinformatics, and modelling.
Proteomics informed by transcriptomics for characterising active transposable elements and genome annotation in Aedes aegypti.

PubMed

Maringer, Kevin; Yousuf, Amjad; Heesom, Kate J; Fan, Jun; Lee, David; Fernandez-Sesma, Ana; Bessant, Conrad; Matthews, David A; Davidson, Andrew D

2017-01-19

Aedes aegypti is a vector for the (re-)emerging human pathogens dengue, chikungunya, yellow fever and Zika viruses. Almost half of the Ae. aegypti genome is comprised of transposable elements (TEs). Transposons have been linked to diverse cellular processes, including the establishment of viral persistence in insects, an essential step in the transmission of vector-borne viruses. However, up until now it has not been possible to study the overall proteome derived from an organism's mobile genetic elements, partly due to the highly divergent nature of TEs. Furthermore, as for many non-model organisms, incomplete genome annotation has hampered proteomic studies on Ae. aegypti. We analysed the Ae. aegypti proteome using our new proteomics informed by transcriptomics (PIT) technique, which bypasses the need for genome annotation by identifying proteins through matched transcriptomic (rather than genomic) data. Our data vastly increase the number of experimentally confirmed Ae. aegypti proteins. The PIT analysis also identified hotspots of incomplete genome annotation, and showed that poor sequence and assembly quality do not explain all annotation gaps. Finally, in a proof-of-principle study, we developed criteria for the characterisation of proteomically active TEs. Protein expression did not correlate with a TE's genomic abundance at different levels of classification. Most notably, long terminal repeat (LTR) retrotransposons were markedly enriched compared to other elements. PIT was superior to 'conventional' proteomic approaches in both our transposon and genome annotation analyses. We present the first proteomic characterisation of an organism's repertoire of mobile genetic elements, which will open new avenues of research into the function of transposon proteins in health and disease. 
Furthermore, our study provides a proof-of-concept that PIT can be used to evaluate a genome's annotation to guide annotation efforts, which has the potential to improve the efficiency of annotation projects in non-model organisms. PIT therefore represents a valuable new tool to study the biology of the important vector species Ae. aegypti, including its role in transmitting emerging viruses of global public health concern.
Application of proteomics to ecology and population biology.

PubMed

Karr, T L

2008-02-01

Proteomics is a relatively new scientific discipline that merges protein biochemistry, genome biology and bioinformatics to determine the spatial and temporal expression of proteins in cells, tissues and whole organisms. Proteomics has seen very little application in the fields of behavioral genetics, evolution, ecology and population dynamics, and has only recently been effectively applied to the closely allied fields of molecular evolution and genetics. However, there exists considerable potential for proteomics to have an impact in areas related to functional ecology; this review will introduce the general concepts and methodologies that define the field of proteomics and compare and contrast its advantages and disadvantages with those of other methods. Examples of how proteomics can aid, complement and indeed extend the study of functional ecology will be discussed, including the main tool of ecological studies, population genetics, with an emphasis on metapopulation structure analysis. Because proteomic analyses provide a direct measure of gene expression, they obviate some of the limitations associated with other genomic approaches, such as microarray and EST analyses. Likewise, in conjunction with associated bioinformatics and molecular evolutionary tools, proteomics can provide the foundation of a systems-level integration approach that can enhance ecological studies. It can be envisioned that proteomics will provide important new information on issues specific to metapopulation biology and adaptive processes in nature. A specific example of the application of proteomics to sperm ageing is provided to illustrate the potential utility of the approach.
Recognition of the polycistronic nature of human genes is critical to understanding the genotype-phenotype relationship.

PubMed

Brunet, Marie A; Levesque, Sébastien A; Hunting, Darel J; Cohen, Alan A; Roucou, Xavier

2018-05-01

Technological advances promise unprecedented opportunities for whole exome sequencing and proteomic analyses of populations. Currently, data from genome and exome sequencing or proteomic studies are searched against reference genome annotations. This provides the foundation for research and clinical screening for genetic causes of pathologies. However, current genome annotations substantially underestimate the proteomic information encoded within a gene. Numerous studies have now demonstrated the expression and function of alternative (mainly small, sometimes overlapping) ORFs within mature gene transcripts. This has important consequences for the correlation of phenotypes and genotypes. Most alternative ORFs are not yet annotated because of a lack of evidence, and this absence from databases precludes their detection by standard proteomic methods, such as mass spectrometry. Here, we demonstrate how current approaches tend to overlook alternative ORFs, hindering the discovery of new genetic drivers and fundamental research. We discuss available tools and techniques to improve identification of proteins from alternative ORFs and finally suggest a novel annotation system to permit a more complete representation of the transcriptomic and proteomic information contained within a gene. Given the crucial challenge of distinguishing functional ORFs from random ones, the suggested pipeline emphasizes both experimental data and conservation signatures. The addition of alternative ORFs in databases will render identification less serendipitous and advance the pace of research and genomic knowledge. This review highlights the urgent medical and research need to incorporate alternative ORFs in current genome annotations and thus permit their inclusion in hypotheses and models, which relate phenotypes and genotypes. © 2018 Brunet et al.; Published by Cold Spring Harbor Laboratory Press.
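The idea of several ORFs sharing one mature transcript can be illustrated with a minimal sketch. It is not the annotation pipeline proposed by the authors: it simply scans the three forward reading frames of a transcript for ATG-to-stop stretches using the standard genetic code. The function name `find_orfs`, the `min_codons` threshold, and the toy sequence are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Standard genetic code, built in TCAG order (assumption: nuclear code, no exceptions).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def find_orfs(mrna, min_codons=10):
    """Return sorted (start, end, frame, peptide) tuples for every
    ATG-to-stop ORF in the three forward frames of a transcript.

    start/end are 0-based, end is exclusive and includes the stop codon.
    min_codons counts coding codons (the stop codon is not counted).
    """
    seq = mrna.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and CODON_TABLE.get(codon) == "*":
                if (i - start) // 3 >= min_codons:
                    peptide = "".join(
                        CODON_TABLE.get(seq[j:j + 3], "X")
                        for j in range(start, i, 3)
                    )
                    orfs.append((start, i + 3, frame, peptide))
                start = None  # close the ORF and look for the next one
    return sorted(orfs)


# Hypothetical toy transcript with two overlapping ORFs in different frames.
toy = "ATGGCATGACCTGATTTAA"
for orf in find_orfs(toy, min_codons=2):
    print(orf)
```

On real transcripts a scan like this returns many short ORFs, most of them probably non-functional, which is why the abstract stresses experimental evidence and conservation signatures for telling functional ORFs from random ones.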
Strain-resolved microbial community proteomics reveals simultaneous aerobic and anaerobic function during gastrointestinal tract colonization of a preterm infant

DOE PAGES

Brooks, Brandon; Mueller, R. S.; Young, Jacque C.; ...

2015-07-01

While there has been growing interest in the gut microbiome in recent years, it remains unclear whether closely related species and strains have similar or distinct functional roles and if organisms capable of both aerobic and anaerobic growth do so simultaneously. To investigate these questions, we implemented a high-throughput mass spectrometry-based proteomics approach to identify proteins in fecal samples collected on days of life 13-21 from an infant born at 28 weeks gestation. No prior studies have coupled strain-resolved community metagenomics to proteomics for such a purpose. Sequences were manually curated to resolve the genomes of two strains of Citrobacter that were present during the later stage of colonization. Proteome extracts from fecal samples were processed via a nano-2D-LC-MS/MS and peptides were identified based on information predicted from the genome sequences for the dominant organisms, Serratia and the two Citrobacter strains. These organisms are facultative anaerobes, and proteomic information indicates the utilization of both aerobic and anaerobic metabolisms throughout the time series. This may indicate growth in distinct niches within the gastrointestinal tract. We uncovered differences in the physiology of coexisting Citrobacter strains, including differences in motility and chemotaxis functions. Additionally, for both Citrobacter strains we resolved a community-essential role in vitamin metabolism and a predominant role in propionate production. Finally, in this case study we detected differences between genome abundance and activity levels for the dominant populations. This underlines the value in layering proteomic information over genetic potential.
Proteomics technique opens new frontiers in mobilome research.

PubMed

Davidson, Andrew D; Matthews, David A; Maringer, Kevin

2017-01-01

A large proportion of the genome of most eukaryotic organisms consists of highly repetitive mobile genetic elements. The sum of these elements is called the "mobilome," which in eukaryotes is made up mostly of transposons. Transposable elements contribute to disease, evolution, and normal physiology by mediating genetic rearrangement, and through the "domestication" of transposon proteins for cellular functions. Although 'omics studies of mobilome genomes and transcriptomes are common, technical challenges have hampered high-throughput global proteomics analyses of transposons. In a recent paper, we overcame these technical hurdles using a technique called "proteomics informed by transcriptomics" (PIT), and thus published the first unbiased global mobilome-derived proteome for any organism (using cell lines derived from the mosquito Aedes aegypti). In this commentary, we describe our methods in more detail, and summarise our major findings. We also use new genome sequencing data to show that, in many cases, the specific genomic element expressing a given protein can be identified using PIT. This proteomic technique therefore represents an important technological advance that will open new avenues of research into the role that proteins derived from transposons and other repetitive and sequence diverse genetic elements, such as endogenous retroviruses, play in health and disease.
Birth of plant proteomics in India: a new horizon.

PubMed

Narula, Kanika; Pandey, Aarti; Gayali, Saurabh; Chakraborty, Niranjan; Chakraborty, Subhra

2015-09-08

In the post-genomic era, proteomics is acknowledged as the next frontier for biological research. Although India has a long and distinguished tradition in protein research, the initiation of proteomics studies was a new horizon. Protein research witnessed enormous progress in protein separation, high-resolution refinements, biochemical identification of the proteins, protein-protein interaction, and structure-function analysis. Plant proteomics research in India began its journey with investigations of proteome profiling, complexity analysis, protein trafficking, and biochemical modeling. The research article by Bhushan et al. in 2006 marked the birth of plant proteomics research in India. Since then plant proteomics studies have expanded progressively and are now being carried out in various institutions spread across the country. The compilation presented here seeks to trace the history of development in the area during the past decade based on publications to date. In this review, we emphasize the outcomes of the field, providing prospects on proteomic pathway analyses. Finally, we discuss the connotation of strategies and the potential that would provide the framework of plant proteome research. The past decades have seen a rapidly growing number of sequenced plant genomes and associated genomic resources. To keep pace with this increasing body of data, India is in the provisional phase of proteomics research to develop a comparative hub for plant proteomes and protein families, but it requires a strong impetus from intellectuals, entrepreneurs, and government agencies. Here, we aim to provide an overview of the past, present and future of Indian plant proteomics, which would serve as an evaluation platform for those seeking to incorporate proteomics into their research programs. This article is part of a Special Issue entitled: Proteomics in India. Copyright © 2015 Elsevier B.V. All rights reserved.
Evaluation of a genome-scale in silico metabolic model for Geobacter metallireducens by using proteomic data from a field biostimulation experiment.

PubMed

Fang, Yilin; Wilkins, Michael J; Yabusaki, Steven B; Lipton, Mary S; Long, Philip E

2012-12-01

Accurately predicting the interactions between microbial metabolism and the physical subsurface environment is necessary to enhance subsurface energy development, soil and groundwater cleanup, and carbon management. This study was an initial attempt to confirm the metabolic functional roles within an in silico model using environmental proteomic data collected during field experiments. Shotgun global proteomics data collected during a subsurface biostimulation experiment were used to validate a genome-scale metabolic model of Geobacter metallireducens, specifically the ability of the metabolic model to predict metal reduction, biomass yield, and growth rate under dynamic field conditions. The constraint-based in silico model of G. metallireducens relates an annotated genome sequence to the physiological functions with 697 reactions controlled by 747 enzyme-coding genes. Proteomic analysis showed that 180 of the 637 G. metallireducens proteins detected during the 2008 experiment were associated with specific metabolic reactions in the in silico model. When the field-calibrated Fe(III) terminal electron acceptor process reaction in a reactive transport model for the field experiments was replaced with the genome-scale model, the model predicted that the largest metabolic fluxes through the in silico model reactions generally correspond to the highest abundances of proteins that catalyze those reactions. Central metabolism predicted by the model agrees well with protein abundance profiles inferred from proteomic analysis. Model discrepancies with the proteomic data, such as the relatively low abundances of proteins associated with amino acid transport and metabolism, revealed pathways or flux constraints in the in silico model that could be updated to more accurately predict metabolic processes that occur in the subsurface environment.
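One simple way to quantify whether larger predicted fluxes go with higher protein abundances is a rank correlation. The sketch below is an illustration only and is not stated to be the authors' analysis; it computes Spearman's rank correlation in plain Python. The reaction values are invented placeholders, and the function names `average_ranks` and `spearman` are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Placeholder numbers for illustration only (not data from the study):
# absolute predicted flux per reaction and abundance of its catalysing protein.
predicted_flux = [8.1, 0.4, 3.2, 5.5, 0.9]
protein_abundance = [120.0, 15.0, 40.0, 95.0, 10.0]
print(round(spearman(predicted_flux, protein_abundance), 3))
```

A rank-based measure is used here because fluxes and spectral abundances sit on very different scales; only their ordering is compared.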
Proteobionics: biomimetics in proteomics.

PubMed

Sommer, Andrei P; Gheorghiu, Eleonora

2006-03-01

Proteomics was established 10 years ago by the analysis of microbial genomes via their protein complement or proteome. Bionics is an ancient art, which converts structures optimized by nature into advanced technical products. Previously, we analyzed survival modalities in nanobacteria and converted the interplay between survival-oriented protein functions and nanoscale mineral shells into models for advanced drug delivery. Exploiting protein functions observed in nature to design biomedical products and therapies could be named proteobionics. Here, we present examples for this new branch of nanoproteomics.
Fungal proteomics: from identification to function.

PubMed

Doyle, Sean

2011-08-01

Some fungi cause disease in humans and plants, while others have demonstrable potential for the control of insect pests. In addition, fungi are also a rich reservoir of therapeutic metabolites and industrially useful enzymes. Detailed analysis of fungal biochemistry is now enabled by multiple technologies including protein mass spectrometry, genome and transcriptome sequencing and advances in bioinformatics. Yet, the assignment of function to fungal proteins, encoded either by in silico annotated, or unannotated genes, remains problematic. The purpose of this review is to describe the strategies used by many researchers to reveal protein function in fungi, and more importantly, to consolidate the nomenclature of 'unknown function protein' as opposed to 'hypothetical protein' - once any protein has been identified by protein mass spectrometry. A combination of approaches including comparative proteomics, pathogen-induced protein expression and immunoproteomics are outlined, which, when used in combination with a variety of other techniques (e.g. functional genomics, microarray analysis, immunochemical and infection model systems), appear to yield comprehensive and definitive information on protein function in fungi. The relative advantages of proteomic, as opposed to transcriptomic-only, analyses are also described. In the future, combined high-throughput, quantitative proteomics, allied to transcriptomic sequencing, are set to reveal much about protein function in fungi. © 2011 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.
Stagonospora nodorum: From pathology to genomics and host resistance

USDA-ARS's Scientific Manuscript database

Stagonospora nodorum is a major necrotrophic pathogen of wheat that causes the diseases Stagonospora nodorum leaf and glume blotch. A series of tools and resources, including functional genomics, a genome sequence, proteomics and metabolomics, host-mapping populations, and a worldwide collection of ...
ZikaVR: An Integrated Zika Virus Resource for Genomics, Proteomics, Phylogenetic and Therapeutic Analysis

PubMed Central

Gupta, Amit Kumar; Kaur, Karambir; Rajput, Akanksha; Dhanda, Sandeep Kumar; Sehgal, Manika; Khan, Md. Shoaib; Monga, Isha; Dar, Showkat Ahmad; Singh, Sandeep; Nagpal, Gandharva; Usmani, Salman Sadullah; Thakur, Anamika; Kaur, Gazaldeep; Sharma, Shivangi; Bhardwaj, Aman; Qureshi, Abid; Raghava, Gajendra Pal Singh; Kumar, Manoj

2016-01-01

Current Zika virus (ZIKV) outbreaks, which have spread in several areas of Africa, Southeast Asia, and the Pacific islands, have been declared a global health emergency by the World Health Organization (WHO). ZIKV causes Zika fever and illness ranging from severe autoimmune to neurological complications in humans. To facilitate research on this virus, we have developed an integrative multi-omics platform, ZikaVR (http://bioinfo.imtech.res.in/manojk/zikavr/), dedicated to ZIKV genomic, proteomic and therapeutic knowledge. It comprises whole genome sequences and their respective functional information regarding proteins, genes, and structural content. Additionally, it delivers sophisticated analyses such as whole-genome alignments, conservation and variation, CpG islands, codon context, usage bias and phylogenetic inferences at the whole genome and proteome level, with a user-friendly visual environment. Further, glycosylation sites and molecular diagnostic primers were also analyzed. Most importantly, we also proposed potential therapeutically important constituents, namely vaccine epitopes, siRNAs, miRNAs, sgRNAs and repurposing drug candidates. PMID:27633273
A Resource of Quantitative Functional Annotation for Homo sapiens Genes.

PubMed

Taşan, Murat; Drabkin, Harold J; Beaver, John E; Chua, Hon Nian; Dunham, Julie; Tian, Weidong; Blake, Judith A; Roth, Frederick P

2012-02-01

The body of human genomic and proteomic evidence continues to grow at ever-increasing rates, while annotation efforts struggle to keep pace. A surprisingly small fraction of human genes have clear, documented associations with specific functions, and new functions continue to be found for characterized genes. Here we assembled an integrated collection of diverse genomic and proteomic data for 21,341 human genes and make quantitative associations of each to 4333 Gene Ontology terms. We combined guilt-by-profiling and guilt-by-association approaches to exploit features unique to the data types. Performance was evaluated by cross-validation, prospective validation, and by manual evaluation with the biological literature. Functional-linkage networks were also constructed, and their utility was demonstrated by identifying candidate genes related to a glioma FLN using a seed network from genome-wide association studies. Our annotations are presented, alongside existing validated annotations, in a publicly accessible and searchable web interface.
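The guilt-by-association idea can be sketched in a few lines: a gene without a given annotation is scored by how much of its functional-linkage weight points to genes that already carry that annotation. This is a generic, simplified illustration and not the scoring scheme of the resource described above; the gene names, edge weights, and the function `association_scores` are hypothetical.

```python
def association_scores(network, annotated):
    """Score each unannotated gene for one function term.

    network   : dict mapping gene -> {neighbour: linkage weight}
    annotated : set of genes already annotated with the term
    Returns gene -> fraction of its total linkage weight that goes to
    annotated neighbours (0.0 = no support, 1.0 = all neighbours annotated).
    """
    scores = {}
    for gene, neighbours in network.items():
        if gene in annotated:
            continue  # nothing to predict for genes that already carry the term
        total = sum(neighbours.values())
        if total == 0:
            continue  # isolated gene: the network offers no evidence
        support = sum(w for nb, w in neighbours.items() if nb in annotated)
        scores[gene] = support / total
    return scores


# Hypothetical toy network; names and weights are placeholders.
network = {
    "geneA": {"geneB": 3.0, "geneC": 1.0},
    "geneB": {"geneA": 3.0},
    "geneC": {"geneA": 1.0, "geneD": 1.0},
    "geneD": {"geneC": 1.0},
}
annotated_with_term = {"geneB"}
print(association_scores(network, annotated_with_term))
```

Guilt-by-profiling would instead score a gene from its own features, for example expression or sequence patterns, so the two approaches draw on complementary evidence.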
Biogeoscience from a Metallomic and Proteomic Perspective

NASA Astrophysics Data System (ADS)

Anbar, A. D.; Shock, E.

2004-12-01

In the wake of the genomics revolution, life scientists are expanding their focus from the genome to the "proteome" - the assemblage of all proteins in a cell - and the "metallome" - the distribution of inorganic species in a cell. The proteome and metallome are tightly connected because proteins and protein products are intimately involved in the transport and homeostasis of inorganic elements, and because many enzymes depend on inorganic elements for catalytic activity. Together, they are at the heart of metabolic function. Unlike the relatively static genome, the proteome and metallome are extremely dynamic, changing rapidly in response to environmental cues. They are substantially more complex than the genome; for example, in humans, some 30,000 genes code for approximately 500,000 proteins. Metaphorically, the proteome and metallome constitute the complex, dynamic "language" by which the genome and the environment communicate. Therefore biogeochemists, like life scientists, are moving beyond a strictly genomic perspective. Research guided by proteomic and metallomic perspectives and methodologies should provide new insights into the connections between life and the inorganic Earth in modern environments, and the evolution of these connections through time. For example, biogeochemical research in modern environments, such as Yellowstone hot springs, is hindered by the gap between genomic determinations of metabolic potential in ecosystems and geochemical characterizations of the energetic boundary conditions faced by these ecosystems; genomics tells us "who is there" and geochemistry tells us "what they might be doing", but neither genomics nor geochemistry easily provide quantitative information about which metabolisms are actually active or a framework for understanding why ecosystems do not fully exploit the energy available in their surroundings. 
Such questions are fundamentally kinetic rather than thermodynamic and therefore demand that we characterize and understand the proteins and inorganic elements used by organisms to catalyze reactions and capture energy from their surroundings. Similar challenges are faced when attempting to map the evolutionary relationships inferred from phylogenetic analyses of genomes to ecological histories determined by geochemists and paleobiologists - for example, ongoing efforts to understand the evolutionary history of eukaryotes and metazoa - because the driving forces for the evolution and ecological radiation of organisms lie at the intersection of metabolism and environment, and hence in the gap between genomes and geochemistry. Future progress in understanding the biogeochemistry of modern and ancient environments will be spurred by integrating proteomic and metallomic methods and perspectives.
The Use of Functional Genomics in Conjunction with Metabolomics for Mycobacterium tuberculosis Research

PubMed Central

Swanepoel, Conrad C.

2014-01-01

Tuberculosis (TB), caused by Mycobacterium tuberculosis, is a fatal infectious disease, resulting in 1.4 million deaths globally per annum. Over the past three decades, genomic studies have been conducted in an attempt to elucidate the functionality of the genome of the pathogen. However, many aspects of this complex genome remain largely unexplored, as approaches like genomics, proteomics, and transcriptomics have failed to characterize them successfully. In turn, metabolomics, which is relatively new to the “omics” revolution, has shown great potential for investigating biological systems or their modifications. Furthermore, when these data are interpreted in combination with previously acquired genomics, proteomics and transcriptomics data, using what is termed a systems biology approach, a more holistic understanding of these systems can be achieved. In this review we discuss how metabolomics has contributed so far to characterizing TB, with emphasis on the resulting improved elucidation of M. tuberculosis in terms of (1) metabolism, (2) growth and replication, (3) pathogenicity, and (4) drug resistance, from the perspective of systems biology. PMID:24771957
Proteomics technique opens new frontiers in mobilome research

PubMed Central

Davidson, Andrew D.; Matthews, David A.

2017-01-01

A large proportion of the genome of most eukaryotic organisms consists of highly repetitive mobile genetic elements. The sum of these elements is called the “mobilome,” which in eukaryotes is made up mostly of transposons. Transposable elements contribute to disease, evolution, and normal physiology by mediating genetic rearrangement, and through the “domestication” of transposon proteins for cellular functions. Although ‘omics studies of mobilome genomes and transcriptomes are common, technical challenges have hampered high-throughput global proteomics analyses of transposons. In a recent paper, we overcame these technical hurdles using a technique called “proteomics informed by transcriptomics” (PIT), and thus published the first unbiased global mobilome-derived proteome for any organism (using cell lines derived from the mosquito Aedes aegypti). In this commentary, we describe our methods in more detail, and summarise our major findings. We also use new genome sequencing data to show that, in many cases, the specific genomic element expressing a given protein can be identified using PIT. This proteomic technique therefore represents an important technological advance that will open new avenues of research into the role that proteins derived from transposons and other repetitive and sequence diverse genetic elements, such as endogenous retroviruses, play in health and disease. PMID:28932623
CPTAC Releases Largest-Ever Ovarian Cancer Proteome Dataset from Previously Genome Characterized Tumors | Office of Cancer Clinical Proteomics Research

Cancer.gov

National Cancer Institute (NCI) Clinical Proteomic Tumor Analysis Consortium (CPTAC) scientists have just released a comprehensive dataset of the proteomic analysis of high grade serous ovarian tumor samples, previously genomically analyzed by The Cancer Genome Atlas (TCGA). This is one of the largest public datasets covering the proteome, phosphoproteome and glycoproteome with complementary deep genomic sequencing data on the same tumor.
The path to enlightenment: making sense of genomic and proteomic information.

PubMed

Maurer, Martin H

2004-05-01

Whereas genomics describes the study of the genome, mainly represented by its gene expression on the DNA or RNA level, the term proteomics denotes the study of the proteome, which is the protein complement encoded by the genome. In recent years, the number of proteomic experiments has increased tremendously. While all fields of proteomics have made major technological advances, the biggest step was seen in bioinformatics. Biological information management relies on sequence and structure databases and powerful software tools to translate experimental results into meaningful biological hypotheses and answers. In this resource article, I provide a collection of databases and software available on the Internet that are useful to interpret genomic and proteomic data. The article is a toolbox for researchers who have genomic or proteomic datasets and need to put their findings into a biological context.
Current advances in esophageal cancer proteomics.

PubMed

Uemura, Norihisa; Kondo, Tadashi

2015-06-01

We review the current status of proteomics for esophageal cancer (EC) from a clinician's viewpoint. The ultimate goal of cancer proteomics is the improvement of clinical outcome. The proteome as a functional translation of the genome is a straightforward representation of genomic mechanisms that trigger carcinogenesis. Cancer proteomics has identified the mechanisms of carcinogenesis and tumor progression, detected biomarker candidates for early diagnosis, and provided novel therapeutic targets for personalized treatments. Our review focuses on three major topics in EC proteomics: diagnostics, treatment, and molecular mechanisms. We discuss the major histological differences between EC types, i.e., esophageal squamous cell carcinoma and adenocarcinoma, and evaluate the clinical significance of published proteomics studies, including promising diagnostic biomarkers and novel therapeutic targets, which should be further validated prior to launching clinical trials. Multi-disciplinary collaborations between basic scientists, clinicians, and pathologists should be established for inter-institutional validation. In conclusion, EC proteomics has provided significant results, which after thorough validation, should lead to the development of novel clinical tools and improvement of the clinical outcome for esophageal cancer patients. This article is part of a Special Issue entitled: Medical Proteomics. Copyright © 2014 Elsevier B.V. All rights reserved.
Using proteomics to study sexual reproduction in angiosperms

USDA-ARS's Scientific Manuscript database

While a relative latecomer to the post-genomics era of functional biology, the application of mass spectrometry-based proteomic analysis has increased exponentially over the past 10 years. Some of this increase is the result of transition of chemists, physicists, and mathematicians to the study of ...

Comparative analysis of genomics and proteomics in Bacillus thuringiensis 4.0718.

PubMed

Rang, Jie; He, Hao; Wang, Ting; Ding, Xuezhi; Zuo, Mingxing; Quan, Meifang; Sun, Yunjun; Yu, Ziquan; Hu, Shengbiao; Xia, Liqiu

2015-01-01

Bacillus thuringiensis is a widely used biopesticide that produces various insecticidal active substances during its life cycle. Separation and purification of numerous insecticidal active substances have been difficult because of the relatively short half-life of such substances. In addition, substances can be synthesized at different times during development, so samples at different stages have to be studied, further complicating the analysis. A dual genomic and proteomic approach, particularly one using mass spectrometry-based proteomic methods, would enhance our ability to identify such substances. The comparative analysis of genomic and proteomic data showed that not all of the products deduced from the annotated genome could be identified among the proteomic data. For instance, genome annotation results showed that 39 coding sequences in the whole genome were related to insect pathogenicity, including five cry genes. However, Cry2Ab, Cry1Ia, Cytotoxin K, Bacteriocin, Exoenzyme C3 and Alveolysin could not be detected in the proteomic data obtained. The sporulation-related proteins were also compared, and the results showed that the great majority of sporulation-related proteins could be detected by mass spectrometry. This analysis revealed that Spo0A~P, SigF, SigE(+), SigK(+) and SigG(+), all known to play an important role in the regulatory network of spore formation, were also present in the proteomic data. Through the comparison of the two data sets, it was possible to infer that some genes were silenced or were expressed at very low levels. For instance, we found that cry2Ab seems to lack a functional promoter, while cry1Ia may not be expressed due to the presence of transposons. With this comparative study a relatively complete database can be constructed and used to transform hereditary material, thereby prompting the high expression of toxic proteins. 
A theoretical basis is provided for constructing highly virulent engineered bacteria and for promoting the application of proteogenomics in the life sciences.
Activity-based protein profiling: from enzyme chemistry to proteomic chemistry.

PubMed

Cravatt, Benjamin F; Wright, Aaron T; Kozarich, John W

2008-01-01

Genome sequencing projects have provided researchers with a complete inventory of the predicted proteins produced by eukaryotic and prokaryotic organisms. Assignment of functions to these proteins represents one of the principal challenges for the field of proteomics. Activity-based protein profiling (ABPP) has emerged as a powerful chemical proteomic strategy to characterize enzyme function directly in native biological systems on a global scale. Here, we review the basic technology of ABPP, the enzyme classes addressable by this method, and the biological discoveries attributable to its application.
Characterization, design, and function of the mitochondrial proteome: from organs to organisms.

PubMed

Lotz, Christopher; Lin, Amanda J; Black, Caitlin M; Zhang, Jun; Lau, Edward; Deng, Ning; Wang, Yueju; Zong, Nobel C; Choi, Jeong H; Xu, Tao; Liem, David A; Korge, Paavo; Weiss, James N; Hermjakob, Henning; Yates, John R; Apweiler, Rolf; Ping, Peipei

2014-02-07

Mitochondria are a common energy source for organs and organisms; their diverse functions are specialized according to the unique phenotypes of their hosting environment. Perturbation of mitochondrial homeostasis accompanies significant pathological phenotypes. However, the connections between mitochondrial proteome properties and function remain to be experimentally established on a systematic level. This uncertainty impedes the contextualization and translation of proteomic data to the molecular derivations of mitochondrial diseases. We present a collection of mitochondrial features and functions from four model systems, including two cardiac mitochondrial proteomes from distinct genomes (human and mouse), two unique organ mitochondrial proteomes from identical genetic codons (mouse heart and mouse liver), as well as a relevant metazoan out-group (drosophila). The data, composed of mitochondrial protein abundance and their biochemical activities, capture the core functionalities of these mitochondria. This investigation allowed us to redefine the core mitochondrial proteome from organs and organisms, as well as the relevant contributions from genetic information and hosting milieu. Our study has identified significant enrichment of disease-associated genes and their products. Furthermore, correlational analyses suggest that mitochondrial proteome design is primarily driven by cellular environment. Taken together, these results connect proteome feature with mitochondrial function, providing a prospective resource for mitochondrial pathophysiology and developing novel therapeutic targets in medicine.
Microbial Interactions in Plants: Perspectives and Applications of Proteomics.

PubMed

Imam, Jahangir; Shukla, Pratyoosh; Mandal, Nimai Prasad; Variar, Mukund

2017-01-01

The structure and function of proteins involved in plant-microbe interactions are investigated in complex biological samples through large-scale proteomics technology. Since whole genome sequences are now available for several plant species and microbes, proteomics studies have become easier and more accurate, and huge amounts of data can be generated and analyzed during plant-microbe interactions. Proteomics approaches are highly relevant in many studies, which have shown that genomics approaches alone are not sufficient: much significant information is lost because the proteins, not the genes encoding them, are the final products responsible for the observed phenotype. Novel approaches in proteomics are developing continuously, enabling the study of various aspects of the arrangement and configuration of proteins and their functions. With the advancement of new technologies, their application to plant-microbe interactions is becoming more common. They are most often used to characterize the cellular and extracellular virulence and pathogenicity factors produced by pathogens, and to identify protein-level changes in host plants infected with pathogens or colonized by beneficial partners. This review provides a brief overview of the proteomics technologies currently available, followed by their use in studying plant-microbe interactions.
Toward an Upgraded Honey Bee (Apis mellifera L.) Genome Annotation Using Proteogenomics.

PubMed

McAfee, Alison; Harpur, Brock A; Michaud, Sarah; Beavis, Ronald C; Kent, Clement F; Zayed, Amro; Foster, Leonard J

2016-02-05

The honey bee is a key pollinator in agricultural operations as well as a model organism for studying the genetics and evolution of social behavior. The Apis mellifera genome has been sequenced and annotated twice over, enabling proteomics and functional genomics methods for probing relevant aspects of its biology. One troubling trend that emerged from proteomic analyses is that honey bee peptide samples consistently result in lower peptide identification rates compared with other organisms. This suggests that the genome annotation can be improved, or atypical biological processes are interfering with the mass spectrometry workflow. First, we tested whether high levels of polymorphisms could explain some of the missed identifications by searching spectra against the reference proteome (OGSv3.2) versus a customized proteome of a single honey bee, but our results indicate that this contribution was minor. Likewise, error-tolerant peptide searches led us to eliminate unexpected post-translational modifications as a major factor in missed identifications. We then used a proteogenomic approach with ~1500 raw files to search for missing genes and new exons, to revive discarded annotations and to identify over 2000 new coding regions. These results will contribute to a more comprehensive genome annotation and facilitate continued research on this important insect.
OncoBinder facilitates interpretation of proteomic interaction data by capturing coactivation pairs in cancer.

PubMed

Van Coillie, Samya; Liang, Lunxi; Zhang, Yao; Wang, Huanbin; Fang, Jing-Yuan; Xu, Jie

2016-04-05

High-throughput methods such as co-immunoprecipitation-mass spectrometry (coIP-MS) and yeast two-hybrid (Y2H) screening have suggested a broad range of unannotated protein-protein interactions (PPIs), and interpretation of these PPIs remains a challenging task. Advances in cancer genomic research allow for the inference of "coactivation pairs" in cancer, which may facilitate the identification of PPIs involved in cancer. Here we present OncoBinder as a tool for the assessment of proteomic interaction data based on the functional synergy of oncoproteins in cancer. This decision tree-based method combines gene mutation, copy number and mRNA expression information to infer the functional status of protein-coding genes. We applied OncoBinder to evaluate the potential binders of EGFR and ERK2 proteins based on the gastric cancer dataset of The Cancer Genome Atlas (TCGA). As a result, OncoBinder identified high-confidence interactions (annotated by Kyoto Encyclopedia of Genes and Genomes (KEGG) or validated by low-throughput assays) more efficiently than a co-expression-based method. Taken together, our results suggest that evaluation of gene functional synergy in cancer may facilitate the interpretation of proteomic interaction data. The OncoBinder toolbox for Matlab is freely accessible online.
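To make the idea of combining mutation, copy number and mRNA expression into a per-gene functional status more concrete, here is a toy sketch. It is not the OncoBinder decision tree: the rules, the z-score cutoff and the function names `functional_status` and `coactivation_fraction` are all assumptions made for illustration, since the abstract does not specify them.

```python
def functional_status(mutated, copy_number_change, mrna_zscore, z_cutoff=2.0):
    """Toy rule set (assumed, not from the paper) returning a coarse gene status."""
    if copy_number_change > 0 and mrna_zscore >= z_cutoff:
        return "activated"      # gain of copies plus high expression
    if copy_number_change < 0 and mrna_zscore <= -z_cutoff:
        return "inactivated"    # loss of copies plus low expression
    if mutated and mrna_zscore >= z_cutoff:
        return "activated"      # mutated and highly expressed
    return "neutral"


def coactivation_fraction(status_a, status_b):
    """Fraction of samples in which both genes are called 'activated'."""
    pairs = list(zip(status_a, status_b))
    if not pairs:
        return 0.0
    both = sum(1 for a, b in pairs if a == b == "activated")
    return both / len(pairs)


# Hypothetical per-sample calls for two genes across four tumor samples.
gene_a = ["activated", "neutral", "activated", "inactivated"]
gene_b = ["activated", "activated", "activated", "neutral"]
print(coactivation_fraction(gene_a, gene_b))
```

Under these assumptions, a candidate binder that is frequently co-activated with the bait protein across tumors would rank higher than one that is not.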
The Functional Network of the Arabidopsis Plastoglobule Proteome Based on Quantitative Proteomics and Genome-Wide Coexpression Analysis

PubMed Central

Lundquist, Peter K.; Poliakov, Anton; Bhuiyan, Nazmul H.; Zybailov, Boris; Sun, Qi; van Wijk, Klaas J.

2012-01-01

Plastoglobules (PGs) in chloroplasts are thylakoid-associated monolayer lipoprotein particles containing prenyl and neutral lipids and several dozen proteins mostly with unknown functions. An integrated view of the role of the PG is lacking. Here, we better define the PG proteome and provide a conceptual framework for further studies. The PG proteome from Arabidopsis (Arabidopsis thaliana) leaf chloroplasts was determined by mass spectrometry of isolated PGs and quantitative comparison with the proteomes of unfractionated leaves, thylakoids, and stroma. Scanning electron microscopy showed the purity and size distribution of the isolated PGs. Compared with previous PG proteome analyses, we excluded several proteins and identified six new PG proteins, including an M48 metallopeptidase and two Absence of bc1 complex (ABC1) atypical kinases, confirmed by immunoblotting. This refined PG proteome consisted of 30 proteins, including six ABC1 kinases and seven fibrillins together comprising more than 70% of the PG protein mass. Other fibrillins were located predominantly in the stroma or thylakoid and not in PGs; we discovered that this partitioning can be predicted by their isoelectric point and hydrophobicity. A genome-wide coexpression network for the PG genes was then constructed from mRNA expression data. This revealed a modular network with four distinct modules that each contained at least one ABC1K and/or fibrillin gene. Each module showed clear enrichment in specific functions, including chlorophyll degradation/senescence, isoprenoid biosynthesis, plastid proteolysis, and redox regulators and phosphoregulators of electron flow. We propose a new testable model for the PGs, in which sets of genes are associated with specific PG functions. PMID:22274653
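The abstract reports that fibrillin partitioning can be predicted from isoelectric point and hydrophobicity but does not say how hydrophobicity was computed. One common measure is the GRAVY score, the mean Kyte-Doolittle hydropathy of a sequence; using it here is an assumption, and the example peptide below is made up. Isoelectric point is left out of this sketch.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: mean Kyte-Doolittle value over all residues."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# Made-up peptide, for illustration only.
print(round(gravy("MKILVA"), 3))
```

A more positive score indicates a more hydrophobic protein; any cutoff separating plastoglobule from stromal fibrillins would have to come from the paper itself.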
Proteomic profiling of white muscle from freshwater catfish Rita rita.

PubMed

Mohanty, Bimal Prasanna; Mitra, Tandrima; Banerjee, Sudeshna; Bhattacharjee, Soma; Mahanty, Arabinda; Ganguly, Satabdi; Purohit, Gopal Krishna; Karunakaran, Dhanasekar; Mohanty, Sasmita

2015-06-01

Muscle tissues contribute 34-48 % of the total body mass in fish. Proteomic analysis enables better understanding of the skeletal muscle physiology and metabolism. A proteome map reflects the general fingerprinting of the fish species and has the potential to identify novel proteins which could serve as biomarkers for many aspects of aquaculture including fish physiology and growth, flesh quality, food safety and aquatic environmental monitoring. The freshwater catfish Rita rita of the family Bagridae inhabiting the tropical rivers and estuaries is an important food fish with high nutritive value and is also considered a species of choice in riverine pollution monitoring. Omics information that could enhance utility of this species in molecular research is meager. Therefore, in the present study, proteomic analysis of Rita rita muscle has been carried out and functional genomics data have been generated. A reference muscle proteome has been developed, and 23 protein spots, representing 18 proteins, have been identified by MALDI-TOF/TOF-MS and LC-MS/MS. Besides, transcript information on a battery of heat shock proteins (Hsps) has been generated. The functional genomics information generated could act as the baseline data for further molecular research on this species.
How may targeted proteomics complement genomic data in breast cancer?

PubMed

Guerin, Mathilde; Gonçalves, Anthony; Toiron, Yves; Baudelet, Emilie; Audebert, Stéphane; Boyer, Jean-Baptiste; Borg, Jean-Paul; Camoin, Luc

2017-01-01

Breast cancer (BC) is the most common female cancer in the world and was recently deconstructed into different molecular entities. Although most of the recent assays to characterize tumors at the molecular level are genomic-based, proteins are the actual executors of cellular functions and represent the vast majority of targets for anticancer drugs. Accumulated data have demonstrated substantial quantitative and qualitative discrepancies between genomic/transcriptomic alterations and their protein counterparts, mostly related to the large number of post-translational modifications. Areas covered: This review presents novel proteomics technologies, such as Reverse Phase Protein Array (RPPA) and mass spectrometry (MS)-based approaches, that have emerged and could progressively replace older methods (e.g. immunohistochemistry, ELISA, etc.) to validate proteins as diagnostic, prognostic or predictive biomarkers, and eventually to monitor them in routine practice. Expert commentary: These targeted proteomic approaches, able to complement genomic data in BC and characterize tumors more precisely, will allow a more personalized treatment for each patient and tumor.
Personalized medicine beyond genomics: alternative futures in big data-proteomics, environtome and the social proteome.

PubMed

Özdemir, Vural; Dove, Edward S; Gürsoy, Ulvi K; Şardaş, Semra; Yıldırım, Arif; Yılmaz, Şenay Görücü; Ömer Barlas, I; Güngör, Kıvanç; Mete, Alper; Srivastava, Sanjeeva

2017-01-01

No field in science and medicine today remains untouched by Big Data, and psychiatry is no exception. Proteomics is a Big Data technology and a next generation biomarker, supporting novel system diagnostics and therapeutics in psychiatry. Proteomics technology is, in fact, much older than genomics and dates to the 1970s, well before the launch of the international Human Genome Project. While the genome has long been framed as the master or "elite" executive molecule in cell biology, the proteome by contrast is humble. Yet the proteome is critical for life: it ensures the daily functioning of cells and whole organisms. In short, proteins are the blue-collar workers of biology, the down-to-earth molecules that we cannot live without. Since 2010, proteomics has found renewed meaning and international attention with the launch of the Human Proteome Project and the growing interest in Big Data technologies such as proteomics. This article presents an interdisciplinary technology foresight analysis and conceptualizes the terms "environtome" and "social proteome". We define "environtome" as the entire complement of elements external to the human host, from microbiome, ambient temperature and weather conditions to government innovation policies, stock market dynamics, human values, political power and social norms that collectively shape the human host spatially and temporally. The "social proteome" is the subset of the environtome that influences the transition of proteomics technology to innovative applications in society. The social proteome encompasses, for example, new reimbursement schemes and business innovation models for proteomics diagnostics that depart from the "once-a-life-time" genotypic tests and the anticipated hype attendant to context and time sensitive proteomics tests. 
Building on the "nesting principle" for governance of complex systems as discussed by Elinor Ostrom, we propose here a 3-tiered organizational architecture for Big Data science such as proteomics. The proposed nested governance structure is comprised of (a) scientists, (b) ethicists, and (c) scholars in the nascent field of "ethics-of-ethics", and aims to cultivate a robust social proteome for personalized medicine. Ostrom often noted that such nested governance designs offer assurance that political power embedded in innovation processes is distributed evenly and is not concentrated disproportionately in a single overbearing stakeholder or person. We agree with this assessment and conclude by underscoring the synergistic value of social and biological proteomes to realize the full potentials of proteomics science for personalized medicine in psychiatry in the present era of Big Data.
Contrasting patterns of evolutionary constraint and novelty revealed by comparative sperm proteomic analysis in Lepidoptera.

PubMed

Whittington, Emma; Forsythe, Desiree; Borziak, Kirill; Karr, Timothy L; Walters, James R; Dorus, Steve

2017-12-02

Rapid evolution is a hallmark of reproductive genetic systems and arises through the combined processes of sequence divergence, gene gain and loss, and changes in gene and protein expression. While studies aiming to disentangle the molecular ramifications of these processes are progressing, we still know little about the genetic basis of evolutionary transitions in reproductive systems. Here we conduct the first comparative analysis of sperm proteomes in Lepidoptera, a group that exhibits dichotomous spermatogenesis, in which males produce a functional fertilization-competent sperm (eupyrene) and an incompetent sperm morph lacking nuclear DNA (apyrene). Through the integrated application of evolutionary proteomics and genomics, we characterize the genomic patterns potentially associated with the origination and evolution of this unique spermatogenic process and assess the importance of genetic novelty in Lepidopteran sperm biology. Comparison of the newly characterized Monarch butterfly (Danaus plexippus) sperm proteome to those of the Carolina sphinx moth (Manduca sexta) and the fruit fly (Drosophila melanogaster) demonstrated conservation at the level of protein abundance and post-translational modification within Lepidoptera. In contrast, comparative genomic analyses across insects reveals significant divergence at two levels that differentiate the genetic architecture of sperm in Lepidoptera from other insects. First, a significant reduction in orthology among Monarch sperm genes relative to the remainder of the genome in non-Lepidopteran insect species was observed. Second, a substantial number of sperm proteins were found to be specific to Lepidoptera, in that they lack detectable homology to the genomes of more distantly related insects. Lastly, the functional importance of Lepidoptera specific sperm proteins is broadly supported by their increased abundance relative to proteins conserved across insects. 
Our results identify a burst of genetic novelty amongst sperm proteins that may be associated with the origin of heteromorphic spermatogenesis in ancestral Lepidoptera and/or the subsequent evolution of this system. This pattern of genomic diversification is distinct from the remainder of the genome and thus suggests that this transition has had a marked impact on lepidopteran genome evolution. The identification of abundant sperm proteins unique to Lepidoptera, including proteins distinct between specific lineages, will accelerate future functional studies aiming to understand the developmental origin of dichotomous spermatogenesis and the functional diversification of the fertilization incompetent apyrene sperm morph.
Mitochondrial proteome disruption in the diabetic heart through targeted epigenetic regulation at the mitochondrial heat shock protein 70 (mtHsp70) nuclear locus.

PubMed

Shepherd, Danielle L; Hathaway, Quincy A; Nichols, Cody E; Durr, Andrya J; Pinti, Mark V; Hughes, Kristen M; Kunovac, Amina; Stine, Seth M; Hollander, John M

2018-06-01

>99% of the mitochondrial proteome is nuclear-encoded. The mitochondrion relies on a coordinated multi-complex process for nuclear genome-encoded mitochondrial protein import. Mitochondrial heat shock protein 70 (mtHsp70) is a key component of this process and a central constituent of the protein import motor. Type 2 diabetes mellitus (T2DM) disrupts mitochondrial proteomic signature which is associated with decreased protein import efficiency. The goal of this study was to manipulate the mitochondrial protein import process through targeted restoration of mtHsp70, in an effort to restore proteomic signature and mitochondrial function in the T2DM heart. A novel line of cardiac-specific mtHsp70 transgenic mice on the db/db background were generated and cardiac mitochondrial subpopulations were isolated with proteomic evaluation and mitochondrial function assessed. MicroRNA and epigenetic regulation of the mtHsp70 gene during T2DM were also evaluated. MtHsp70 overexpression restored cardiac function and nuclear-encoded mitochondrial protein import, contributing to a beneficial impact on proteome signature and enhanced mitochondrial function during T2DM. Further, transcriptional repression at the mtHsp70 genomic locus through increased localization of H3K27me3 during T2DM insult was observed. Our results suggest that restoration of a key protein import constituent, mtHsp70, provides therapeutic benefit through attenuation of mitochondrial and contractile dysfunction in T2DM. Copyright © 2018 Elsevier Ltd. All rights reserved.
CPTAC Releases Largest-Ever Breast Cancer Proteome Dataset from Previously Genome Characterized Tumors | Office of Cancer Clinical Proteomics Research

Cancer.gov

National Cancer Institute (NCI) Clinical Proteomic Tumor Analysis Consortium (CPTAC) scientists have released a dataset of proteins and phosphopeptides identified through deep proteomic and phosphoproteomic analysis of breast tumor samples, previously genomically analyzed by The Cancer Genome Atlas (TCGA).
Computer applications making rapid advances in high throughput microbial proteomics (HTMP).

PubMed

Anandkumar, Balakrishna; Haga, Steve W; Wu, Hui-Fen

2014-02-01

The last few decades have seen the rise of widely available proteomics tools. From new data acquisition devices, such as MALDI-MS and 2DE, to new database-searching software, these new products have paved the way for high throughput microbial proteomics (HTMP). These tools are enabling researchers to gain new insights into microbial metabolism, and are opening up new areas of study, such as protein-protein interactions (interactomics) discovery. Computer software is a key part of these emerging fields. This review considers: 1) software tools for identifying the proteome, such as MASCOT or PDQuest, 2) online databases of proteomes, such as SWISS-PROT, Proteome Web, or the Proteomics Facility of the Pathogen Functional Genomics Resource Center, and 3) software tools for applying proteomic data, such as PSI-BLAST or VESPA. These tools allow for research in network biology, protein identification, functional annotation, target identification/validation, protein expression, protein structural analysis, metabolic pathway engineering and drug discovery.
Genomics and functional genomics in Chlamydomonas reinhardtii

DOE Office of Scientific and Technical Information (OSTI.GOV)

Blaby, Ian K.; Blaby-Haas, Crysten E.

The availability of the Chlamydomonas reinhardtii nuclear genome sequence continues to enable researchers to address biological questions relevant to algae, land plants and animals in unprecedented ways. As we continue to characterize and understand biological processes in C. reinhardtii and translate that knowledge to other systems, we are faced with the realization that many genes encode proteins without a defined function. The field of functional genomics aims to close this gap between genome sequence and protein function. Transcriptomes, proteomes and phenomes can each provide layers of gene-specific functional data while supplying a global snapshot of cellular behavior under different conditions. Herein we present a brief history of functional genomics, the present status of the C. reinhardtii genome, how genome-wide experiments can aid in supplying protein function inferences, and provide an outlook for functional genomics in C. reinhardtii.
Genomics and functional genomics in Chlamydomonas reinhardtii

DOE PAGES

Blaby, Ian K.; Blaby-Haas, Crysten E.

2017-03-21

The availability of the Chlamydomonas reinhardtii nuclear genome sequence continues to enable researchers to address biological questions relevant to algae, land plants and animals in unprecedented ways. As we continue to characterize and understand biological processes in C. reinhardtii and translate that knowledge to other systems, we are faced with the realization that many genes encode proteins without a defined function. The field of functional genomics aims to close this gap between genome sequence and protein function. Transcriptomes, proteomes and phenomes can each provide layers of gene-specific functional data while supplying a global snapshot of cellular behavior under different conditions. Herein we present a brief history of functional genomics, the present status of the C. reinhardtii genome, how genome-wide experiments can aid in supplying protein function inferences, and provide an outlook for functional genomics in C. reinhardtii.
Principles of proteome allocation are revealed using proteomic data and genome-scale models

PubMed Central

Yang, Laurence; Yurkovich, James T.; Lloyd, Colton J.; Ebrahim, Ali; Saunders, Michael A.; Palsson, Bernhard O.

2016-01-01

Integrating omics data to refine or make context-specific models is an active field of constraint-based modeling. Proteomics now cover over 95% of the Escherichia coli proteome by mass. Genome-scale models of Metabolism and macromolecular Expression (ME) compute proteome allocation linked to metabolism and fitness. Using proteomics data, we formulated allocation constraints for key proteome sectors in the ME model. The resulting calibrated model effectively computed the “generalist” (wild-type) E. coli proteome and phenotype across diverse growth environments. Across 15 growth conditions, prediction errors for growth rate and metabolic fluxes were 69% and 14% lower, respectively. The sector-constrained ME model thus represents a generalist ME model reflecting both growth rate maximization and “hedging” against uncertain environments and stresses, as indicated by significant enrichment of these sectors for the general stress response sigma factor σS. Finally, the sector constraints represent a general formalism for integrating omics data from any experimental condition into constraint-based ME models. The constraints can be fine-grained (individual proteins) or coarse-grained (functionally-related protein groups) as demonstrated here. This flexible formalism provides an accessible approach for narrowing the gap between the complexity captured by omics data and governing principles of proteome allocation described by systems-level models. PMID:27857205
Principles of proteome allocation are revealed using proteomic data and genome-scale models

DOE PAGES

Yang, Laurence; Yurkovich, James T.; Lloyd, Colton J.; ...

2016-11-18

Integrating omics data to refine or make context-specific models is an active field of constraint-based modeling. Proteomics now cover over 95% of the Escherichia coli proteome by mass. Genome-scale models of Metabolism and macromolecular Expression (ME) compute proteome allocation linked to metabolism and fitness. Using proteomics data, we formulated allocation constraints for key proteome sectors in the ME model. The resulting calibrated model effectively computed the “generalist” (wild-type) E. coli proteome and phenotype across diverse growth environments. Across 15 growth conditions, prediction errors for growth rate and metabolic fluxes were 69% and 14% lower, respectively. The sector-constrained ME model thus represents a generalist ME model reflecting both growth rate maximization and “hedging” against uncertain environments and stresses, as indicated by significant enrichment of these sectors for the general stress response sigma factor σS. Finally, the sector constraints represent a general formalism for integrating omics data from any experimental condition into constraint-based ME models. The constraints can be fine-grained (individual proteins) or coarse-grained (functionally-related protein groups) as demonstrated here. Furthermore, this flexible formalism provides an accessible approach for narrowing the gap between the complexity captured by omics data and governing principles of proteome allocation described by systems-level models.
Proteogenomic characterization of human colon and rectal cancer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang, Bing; Wang, Jing; Wang, Xiaojing

2014-09-18

We analyzed proteomes of colon and rectal tumors previously characterized by the Cancer Genome Atlas (TCGA) and performed integrated proteogenomic analyses. Protein sequence variants encoded by somatic genomic variations displayed reduced expression compared to protein variants encoded by germline variations. mRNA transcript abundance did not reliably predict protein expression differences between tumors. Proteomics identified five protein expression subtypes, two of which were associated with the TCGA "MSI/CIMP" transcriptional subtype, but had distinct mutation and methylation patterns and associated with different clinical outcomes. Although CNAs showed strong cis- and trans-effects on mRNA expression, relatively few of these extend to the protein level. Thus, proteomics data enabled prioritization of candidate driver genes. Our analyses identified HNF4A, a novel candidate driver gene in tumors with chromosome 20q amplifications. Integrated proteogenomic analysis provides functional context to interpret genomic abnormalities and affords novel insights into cancer biology.
VESPA: software to facilitate genomic annotation of prokaryotic organisms through integration of proteomic and transcriptomic data.

PubMed

Peterson, Elena S; McCue, Lee Ann; Schrimpe-Rutledge, Alexandra C; Jensen, Jeffrey L; Walker, Hyunjoo; Kobold, Markus A; Webb, Samantha R; Payne, Samuel H; Ansong, Charles; Adkins, Joshua N; Cannon, William R; Webb-Robertson, Bobbie-Jo M

2012-04-05

The procedural aspects of genome sequencing and assembly have become relatively inexpensive, yet the full, accurate structural annotation of these genomes remains a challenge. Next-generation sequencing transcriptomics (RNA-Seq), global microarrays, and tandem mass spectrometry (MS/MS)-based proteomics have demonstrated immense value to genome curators as individual sources of information; however, integrating these data types to validate and improve structural annotation remains a major challenge. Current visual and statistical analytic tools are focused on a single data type, or existing software tools are retrofitted to analyze new data forms. We present Visual Exploration and Statistics to Promote Annotation (VESPA), a new interactive visual analysis software tool focused on assisting scientists with the annotation of prokaryotic genomes through the integration of proteomics and transcriptomics data with current genome location coordinates. VESPA is a desktop Java™ application that integrates high-throughput proteomics data (peptide-centric) and transcriptomics (probe or RNA-Seq) data into a genomic context, all of which can be visualized at three levels of genomic resolution. Data is interrogated via searches linked to the genome visualizations to find regions with high likelihood of mis-annotation. Search results are linked to exports for further validation outside of VESPA, or potential coding regions can be analyzed concurrently with the software through interaction with BLAST. VESPA is demonstrated on two use cases (Yersinia pestis Pestoides F and Synechococcus sp. PCC 7002) to show the rapid manner in which mis-annotations can be found and explored in VESPA using either proteomics data alone, or in combination with transcriptomic data. VESPA is an interactive visual analytics tool that integrates high-throughput data into a genomic context to facilitate the discovery of structural mis-annotations in prokaryotic genomes. 
Data is evaluated via visual analysis across multiple levels of genomic resolution, linked searches and interaction with existing bioinformatics tools. We highlight the novel functionality of VESPA and core programming requirements for visualization of these large heterogeneous datasets for a client-side application. The software is freely available at https://www.biopilot.org/docs/Software/Vespa.php.
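The core idea of using peptide evidence to find mis-annotated regions can be sketched as an interval check: peptides that map to genomic coordinates outside every annotated gene point to a possible missing or mis-placed gene model. This is not VESPA's implementation. The function names, the single-strand simplification and the coordinates below are assumptions for illustration only.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two closed intervals [start, end] share at least one position."""
    return a_start <= b_end and b_start <= a_end


def orphan_peptides(peptides, genes):
    """Return peptide intervals that overlap no annotated gene interval.

    Both arguments are lists of (start, end) tuples on the same strand;
    strand and reading frame are ignored in this simplified sketch.
    """
    return [
        p for p in peptides
        if not any(overlaps(p[0], p[1], g[0], g[1]) for g in genes)
    ]


# Hypothetical coordinates, not taken from either use case in the paper.
genes = [(100, 400), (900, 1500)]
peptides = [(150, 180), (500, 530), (1490, 1520)]
print(orphan_peptides(peptides, genes))
```

A real workflow would also check strand and reading frame, and a peptide that only partly overlaps a gene (such as the last one above) could indicate a wrong start or stop coordinate rather than a correct annotation.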

VESPA: software to facilitate genomic annotation of prokaryotic organisms through integration of proteomic and transcriptomic data

PubMed Central

2012-01-01

Background The procedural aspects of genome sequencing and assembly have become relatively inexpensive, yet the full, accurate structural annotation of these genomes remains a challenge. Next-generation sequencing transcriptomics (RNA-Seq), global microarrays, and tandem mass spectrometry (MS/MS)-based proteomics have demonstrated immense value to genome curators as individual sources of information, however, integrating these data types to validate and improve structural annotation remains a major challenge. Current visual and statistical analytic tools are focused on a single data type, or existing software tools are retrofitted to analyze new data forms. We present Visual Exploration and Statistics to Promote Annotation (VESPA) is a new interactive visual analysis software tool focused on assisting scientists with the annotation of prokaryotic genomes though the integration of proteomics and transcriptomics data with current genome location coordinates. Results VESPA is a desktop Java™ application that integrates high-throughput proteomics data (peptide-centric) and transcriptomics (probe or RNA-Seq) data into a genomic context, all of which can be visualized at three levels of genomic resolution. Data is interrogated via searches linked to the genome visualizations to find regions with high likelihood of mis-annotation. Search results are linked to exports for further validation outside of VESPA or potential coding-regions can be analyzed concurrently with the software through interaction with BLAST. VESPA is demonstrated on two use cases (Yersinia pestis Pestoides F and Synechococcus sp. PCC 7002) to demonstrate the rapid manner in which mis-annotations can be found and explored in VESPA using either proteomics data alone, or in combination with transcriptomic data. 
Conclusions VESPA is an interactive visual analytics tool that integrates high-throughput data into a genomic context to facilitate the discovery of structural mis-annotations in prokaryotic genomes. Data is evaluated via visual analysis across multiple levels of genomic resolution, linked searches and interaction with existing bioinformatics tools. We highlight the novel functionality of VESPA and core programming requirements for visualization of these large heterogeneous datasets for a client-side application. The software is freely available at https://www.biopilot.org/docs/Software/Vespa.php. PMID:22480257
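The idea of searching for regions with a high likelihood of mis-annotation can be pictured as an interval comparison between peptide locations and annotated genes. The following is a minimal sketch of that idea only, not VESPA's implementation: the `Interval` type, the function name, and the toy coordinates are hypothetical, and reading frame is deliberately ignored.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    """A genomic feature with 1-based inclusive coordinates (hypothetical type)."""
    start: int
    end: int
    strand: str  # "+" or "-"


def peptides_outside_annotation(peptides: List[Interval],
                                genes: List[Interval]) -> List[Interval]:
    """Return peptides that are not fully contained in any same-strand gene.

    Such peptides are candidates for a missed gene or a wrong start site.
    Reading frame is not checked in this sketch.
    """
    flagged = []
    for pep in peptides:
        covered = any(
            gene.strand == pep.strand
            and gene.start <= pep.start
            and pep.end <= gene.end
            for gene in genes
        )
        if not covered:
            flagged.append(pep)
    return flagged


# Toy coordinates, invented for illustration only.
genes = [Interval(100, 400, "+")]
peptides = [
    Interval(150, 180, "+"),  # inside the annotated gene
    Interval(70, 120, "+"),   # extends upstream of the annotated start
    Interval(150, 180, "-"),  # opposite strand, no annotated gene there
]
flagged = peptides_outside_annotation(peptides, genes)
```

Here the second and third peptides are flagged; a real tool would additionally weigh identification confidence and transcript evidence before suggesting a revision.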
Proteogenomic characterization of human colon and rectal cancer

PubMed Central

Zhang, Bing; Wang, Jing; Wang, Xiaojing; Zhu, Jing; Liu, Qi; Shi, Zhiao; Chambers, Matthew C.; Zimmerman, Lisa J.; Shaddox, Kent F.; Kim, Sangtae; Davies, Sherri R.; Wang, Sean; Wang, Pei; Kinsinger, Christopher R.; Rivers, Robert C.; Rodriguez, Henry; Townsend, R. Reid; Ellis, Matthew J.C.; Carr, Steven A.; Tabb, David L.; Coffey, Robert J.; Slebos, Robbert J.C.; Liebler, Daniel C.

2014-01-01

Summary We analyzed proteomes of colon and rectal tumors previously characterized by the Cancer Genome Atlas (TCGA) and performed integrated proteogenomic analyses. Somatic variants displayed reduced protein abundance compared to germline variants. mRNA transcript abundance did not reliably predict protein abundance differences between tumors. Proteomics identified five proteomic subtypes in the TCGA cohort, two of which overlapped with the TCGA “MSI/CIMP” transcriptomic subtype, but had distinct mutation, methylation, and protein expression patterns associated with different clinical outcomes. Although copy number alterations showed strong cis- and trans-effects on mRNA abundance, relatively few of these extend to the protein level. Thus, proteomics data enabled prioritization of candidate driver genes. The chromosome 20q amplicon was associated with the largest global changes at both mRNA and protein levels; proteomics data highlighted potential 20q candidates including HNF4A, TOMM34 and SRC. Integrated proteogenomic analysis provides functional context to interpret genomic abnormalities and affords a new paradigm for understanding cancer biology. PMID:25043054
CPTAC Releases Largest-Ever Colorectal Cancer Proteome Dataset from Previously Genome Characterized Tumors | Office of Cancer Clinical Proteomics Research

Cancer.gov

On September 4, 2013, NCI’s Clinical Proteomics Tumor Analysis Consortium (CPTAC) publicly released proteomic data produced from colorectal tumor samples previously analyzed by The Cancer Genome Atlas (TCGA). This is the initial release of proteomic tumor data designed to complement genomic data on the same tumors. The data is publicly available at the CPTAC data portal.
Integration of gel-based and gel-free proteomic data for functional analysis of proteins through Soybean Proteome Database.

PubMed

Komatsu, Setsuko; Wang, Xin; Yin, Xiaojian; Nanjo, Yohei; Ohyanagi, Hajime; Sakata, Katsumi

2017-06-23

The Soybean Proteome Database (SPD) stores data on soybean proteins obtained with gel-based and gel-free proteomic techniques. The database was constructed to provide information on proteins for functional analyses. The majority of the data is focused on soybean (Glycine max 'Enrei'). The growth and yield of soybean are strongly affected by environmental stresses such as flooding. The database was originally constructed using data on soybean proteins separated by two-dimensional polyacrylamide gel electrophoresis, which is a gel-based proteomic technique. Since 2015, the database has been expanded to incorporate data obtained by label-free mass spectrometry-based quantitative proteomics, which is a gel-free proteomic technique. Here, the portions of the database consisting of gel-free proteomic data are described. The gel-free proteomic database contains 39,212 proteins identified in 63 sample sets, such as temporal and organ-specific samples of soybean plants grown under flooding stress or non-stressed conditions. In addition, data on organellar proteins identified in mitochondria, nuclei, and endoplasmic reticulum are stored. Furthermore, the database integrates multiple omics data such as genomics, transcriptomics, metabolomics, and proteomics. The SPD database is accessible at http://proteome.dc.affrc.go.jp/Soybean/. The Soybean Proteome Database stores data obtained from both gel-based and gel-free proteomic techniques. The gel-free proteomic database comprises 39,212 proteins identified in 63 sample sets, such as different organs of soybean plants grown under flooding stress or non-stressed conditions in a time-dependent manner. In addition, organellar proteins identified in mitochondria, nuclei, and endoplasmic reticulum are stored in the gel-free proteomics database. A total of 44,704 proteins, including 5490 proteins identified using a gel-based proteomic technique, are stored in the SPD. 
This total accounts for approximately 80% of all predicted proteins from genome sequences, although it includes overlapping proteins. Based on the demonstrated application of data stored in the database for functional analyses, it is suggested that these data will be useful for analyses of biological mechanisms in soybean. Furthermore, coupled with recent advances in information and communication technology, the usefulness of this database would increase in the analyses of biological mechanisms. Copyright © 2017 Elsevier B.V. All rights reserved.
Microchip-Based Single-Cell Functional Proteomics for Biomedical Applications

PubMed Central

Lu, Yao; Yang, Liu; Wei, Wei; Shi, Qihui

2017-01-01

Cellular heterogeneity has been widely recognized, but only recently have single-cell tools become available that allow characterizing heterogeneity at the genomic and proteomic levels. We review the technological advances in microchip-based toolkits for single-cell functional proteomics. Each of these tools has distinct advantages and limitations, and a few have advanced toward being applied to address biological or clinical problems that fail to be addressed by traditional population-based methods. High-throughput single-cell proteomic assays generate high-dimensional data sets that contain new information and thus require developing new analytical frameworks to extract new biology. In this review article, we highlight a few biological and clinical applications in which the microchip-based single-cell proteomic tools provide unique advantages. The examples include resolving functional heterogeneity and dynamics of immune cells, dissecting cell-cell interaction by creating well-controlled on-chip microenvironments, capturing high-resolution snapshots of immune system functions in patients for better immunotherapy, and elucidating phosphoprotein signaling networks in cancer cells for guiding effective molecularly targeted therapies. PMID:28280819
Evaluation of a Genome-Scale In Silico Metabolic Model for Geobacter metallireducens Using Proteomic Data from a Field Biostimulation Experiment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fang, Yilin; Wilkins, Michael J.; Yabusaki, Steven B.

2012-12-12

Biomass and shotgun global proteomics data that reflected relative protein abundances from samples collected during the 2008 experiment at the U.S. Department of Energy Integrated Field-Scale Subsurface Research Challenge site in Rifle, Colorado, provided an unprecedented opportunity to validate a genome-scale metabolic model of Geobacter metallireducens and assess its performance with respect to prediction of metal reduction, biomass yield, and growth rate under dynamic field conditions. Reconstructed from annotated genomic sequence, biochemical, and physiological data, the constraint-based in silico model of G. metallireducens relates an annotated genome sequence to the physiological functions with 697 reactions controlled by 747 enzyme-coding genes. Proteomic analysis showed that 180 of the 637 G. metallireducens proteins detected during the 2008 experiment were associated with specific metabolic reactions in the in silico model. When the field-calibrated Fe(III) terminal electron acceptor process reaction in a reactive transport model for the field experiments was replaced with the genome-scale model, the model predicted that the largest metabolic fluxes through the in silico model reactions generally correspond to the highest abundances of proteins that catalyze those reactions. Central metabolism predicted by the model agrees well with protein abundance profiles inferred from proteomic analysis. Model discrepancies with the proteomic data, such as the relatively low fluxes through amino acid transport and metabolism, revealed pathways or flux constraints in the in silico model that could be updated to more accurately predict metabolic processes that occur in the subsurface environment.
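One simple way to quantify whether larger predicted fluxes go with higher protein abundances is a rank correlation. The sketch below computes a Spearman coefficient with the standard library only; the abstract does not say which statistic the authors used, so the choice of Spearman, the function names, and the toy numbers are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented toy values (not data from the study): predicted flux per reaction
# and the abundance of the protein catalyzing it.
predicted_flux = [0.1, 2.5, 0.9, 7.0, 0.0]
protein_abundance = [12.0, 80.0, 30.0, 95.0, 40.0]
rho = spearman(predicted_flux, protein_abundance)
```

A coefficient near 1 would indicate that reactions ranked high by flux are also ranked high by abundance; outliers (such as the last reaction in the toy data) point to pathways where model constraints may need revisiting.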
Genomics, transcriptomics and proteomics: enabling insights into social evolution and disease challenges for managed and wild bees.

PubMed

Trapp, Judith; McAfee, Alison; Foster, Leonard J

2017-02-01

Globally, there are over 20 000 bee species (Hymenoptera: Apoidea: Anthophila) with a host of biologically fascinating characteristics. Although they have long been studied as models for social evolution, recent challenges to bee health (mainly diseases and pesticides) have attracted the attention of both public and research communities. Genome sequences of twelve bee species are now complete or in progress, facilitating the application of additional 'omic technologies. Here, we review recent developments in honey bee and native bee research in the genomic era. We discuss the progress in genome sequencing and functional annotation, followed by the enabled comparative genomics, proteomics and transcriptomics applications regarding social evolution and health. Finally, we end with comments on future challenges in the postgenomic era. © 2016 John Wiley & Sons Ltd.
Mining biological databases for candidate disease genes

NASA Astrophysics Data System (ADS)

Braun, Terry A.; Scheetz, Todd; Webster, Gregg L.; Casavant, Thomas L.

2001-07-01

The publicly funded effort to determine the complete nucleotide sequence of the human genome, the Human Genome Project (HGP), has so far produced more than 93% of the 3 billion nucleotides of the human genome in a preliminary 'draft' format. In addition, several valuable sources of information have been developed as direct and indirect results of the HGP. These include the sequencing of model organisms (rat, mouse, fly, and others), gene discovery projects (ESTs and full-length), and new technologies such as expression analysis and resources (micro-arrays or gene chips). These resources are invaluable for the researchers identifying the functional genes of the genome that transcribe and translate into the transcriptome and proteome, both of which potentially contain orders of magnitude more complexity than the genome itself. Preliminary analyses of these data identified approximately 30,000-40,000 human 'genes.' However, the bulk of the effort still remains: to identify the functional and structural elements contained within the transcriptome and proteome, and to associate function in the transcriptome and proteome to genes. A fortuitous consequence of the HGP is the existence of hundreds of databases containing biological information that may be relevant to the identification of disease-causing genes. The task of mining these databases for information on candidate genes is a commercial application of enormous potential. We are developing a system to acquire and mine data from specific databases to aid our efforts to identify disease genes. A high-speed cluster of Linux workstations is used to analyze sequences and perform distributed sequence alignments as part of our data mining and processing. This system has been used to mine GeneMap99 sequences within specific genomic intervals to identify potential candidate disease genes associated with Bardet-Biedl Syndrome (BBS).
PARALLEL ASSAY OF OXYGEN EQUILIBRIA OF HEMOGLOBIN

PubMed Central

Lilly, Laura E.; Blinebry, Sara K.; Viscardi, Chelsea M.; Perez, Luis; Bonaventura, Joe; McMahon, Tim J.

2013-01-01

Methods to systematically analyze in parallel the function of multiple protein or cell samples in vivo or ex vivo (i.e. functional proteomics) in a controlled gaseous environment have thus far been limited. Here we describe an apparatus and procedure that enables, for the first time, parallel assay of oxygen equilibria in multiple samples. Using this apparatus, numerous simultaneous oxygen equilibrium curves (OECs) can be obtained under truly identical conditions from blood cell samples or purified hemoglobins (Hbs). We suggest that the ability to obtain these parallel datasets under identical conditions can be of immense value, both to biomedical researchers and clinicians who wish to monitor blood health, and to physiologists studying non-human organisms and the effects of climate change on these organisms. Parallel monitoring techniques are essential in order to better understand the functions of critical cellular proteins. The procedure can be applied to human studies, wherein an OEC can be analyzed in light of an individual’s entire genome. Here, we analyzed intraerythrocytic Hb, a protein that operates at the organism’s environmental interface and then comes into close contact with virtually all of the organism’s cells. The apparatus is theoretically scalable, and establishes a functional proteomic screen that can be correlated with genomic information on the same individuals. This new method is expected to accelerate our general understanding of protein function, an increasingly challenging objective as advances in proteomic and genomic throughput outpace the ability to study proteins’ functional properties. PMID:23827235
Comparative Analysis of Proteomes and Functionomes Provides Insights into Origins of Cellular Diversification

PubMed Central

Caetano-Anollés, Gustavo

2013-01-01

Reconstructing the evolutionary history of modern species is a difficult problem complicated by the conceptual and technical limitations of phylogenetic tree building methods. Here, we propose a comparative proteomic and functionomic inferential framework for genome evolution that allows resolving the tripartite division of cells and sketching their history. Evolutionary inferences were derived from the spread of conserved molecular features, such as molecular structures and functions, in the proteomes and functionomes of contemporary organisms. Patterns of use and reuse of these traits yielded significant insights into the origins of cellular diversification. Results uncovered an unprecedented strong evolutionary association between Bacteria and Eukarya while revealing marked evolutionary reductive tendencies in the archaeal genomic repertoires. The effects of nonvertical evolutionary processes (e.g., HGT, convergent evolution) were found to be limited while reductive evolution and molecular innovation appeared to be prevalent during the evolution of cells. Our study revealed a strong vertical trace in the history of proteins and associated molecular functions, which was reliably recovered using the comparative genomics approach. The trace supported the existence of a stem line of descent and the very early appearance of Archaea as a diversified superkingdom, but failed to uncover a hidden canonical pattern in which Bacteria was the first superkingdom to deploy superkingdom-specific structures and functions. PMID:24492748
An in silico argument for mitochondrial microRNA as a determinant of primary non function in liver transplantation.

PubMed

Khorsandi, Shirin Elizabeth; Salehi, Siamak; Cortes, Miriam; Vilca-Melendez, Hector; Menon, Krishna; Srinivasan, Parthi; Prachalias, Andreas; Jassem, Wayel; Heaton, Nigel

2018-02-15

Mitochondria have their own genomic, transcriptomic and proteomic machinery but are unable to be autonomous, needing both nuclear and mitochondrial genomes. The aim of this work was to use computational biology to explore the involvement of mitochondrial microRNAs (MitomiRs) and their interactions with the mitochondrial proteome in a clinical model of primary non function (PNF) of the donor after cardiac death (DCD) liver. Archival array data on the differential expression of miRNA in DCD PNF were re-analyzed using a number of publicly available computational algorithms. Ten MitomiRs of importance in DCD PNF were identified, 7 with predicted interaction of their seed sequence with the mitochondrial transcriptome that included both coding and non-coding areas of the hypervariability region 1 (HVR1) and control region. Considering miRNA regulation of the nuclear encoded mitochondrial proteome, 7 hypothetical small proteins were identified with homolog function that ranged from co-factor for formation of ATP Synthase, REDOX balance and an importin/exportin protein. In silico, unconventional seed interactions, both non-canonical and alternative seed sites, appear to be of greater importance in MitomiR regulation of the mitochondrial genome. Additionally, a number of novel small proteins of relevance in transplantation have been identified which need further characterization.
Comparison of theoretical proteomes: identification of COGs with conserved and variable pI within the multimodal pI distribution.

PubMed

Nandi, Soumyadeep; Mehra, Nipun; Lynn, Andrew M; Bhattacharya, Alok

2005-09-09

Theoretical proteome plots, generated by plotting theoretical isoelectric points (pI) against molecular masses of all proteins encoded by the genome, show a multimodal distribution for pI. This multimodal distribution is an effect of allowed combinations of the charged amino acids, and not due to evolutionary causes. The variation in this distribution can be correlated to the organism's ecological niche. Contributions to this variation may be mapped to individual proteins by studying the variation in pI of orthologs across microorganism genomes. The distribution of ortholog pI values showed trimodal distributions for all prokaryotic genomes analyzed, similar to whole proteome plots. Pairwise analysis of pI variation shows that a few COGs are conserved within, but most vary between, the acidic and basic regions of the distribution, while molecular mass is more highly conserved. At the level of functional grouping of orthologs, five groups vary significantly from the population of orthologs, which is attributed to either conservation at the level of sequences or a bias for either positively or negatively charged residues contributing to the function. Individual COGs conserved in both the acidic and basic regions of the trimodal distribution are identified, and orthologs that best represent the variation in levels of the acidic and basic regions are listed. The analysis of pI distribution by using orthologs provides a basis for resolution of theoretical proteome comparison at the level of individual proteins. Orthologs identified that significantly vary between the major acidic and basic regions may be used as representative of the variation of the entire proteome.
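A theoretical pI is the pH at which the summed charge of a protein's ionizable groups is zero, which is why it depends only on the counts of charged residues. The sketch below estimates it by bisection using the Henderson-Hasselbalch relation. The pKa values are one illustrative set chosen for this example, not the scale used in the study, and the function names are hypothetical; different pKa scales shift the result.

```python
# Illustrative pKa values (an assumption; published scales differ).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERM_PKA = 9.0
C_TERM_PKA = 2.3


def net_charge(sequence: str, ph: float) -> float:
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    seq = sequence.upper()
    # Free N-terminus contributes positive charge, free C-terminus negative.
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))
    for residue, pka in POSITIVE_PKA.items():
        charge += seq.count(residue) / (1.0 + 10 ** (ph - pka))
    for residue, pka in NEGATIVE_PKA.items():
        charge -= seq.count(residue) / (1.0 + 10 ** (pka - ph))
    return charge


def theoretical_pi(sequence: str, tol: float = 1e-4) -> float:
    """Find the pH in [0, 14] where the net charge crosses zero.

    Net charge decreases monotonically with pH, so bisection converges.
    """
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2
        if net_charge(sequence, mid) > 0:
            low = mid   # still positive: the pI lies at higher pH
        else:
            high = mid  # zero or negative: the pI lies at lower pH
    return (low + high) / 2


acidic_pi = theoretical_pi("DDDDEEEE")  # rich in acidic residues
basic_pi = theoretical_pi("KKKKRRRR")   # rich in basic residues
```

Applying `theoretical_pi` to every protein in a proteome and plotting the values against molecular mass gives the kind of plot the abstract describes; acidic-rich sequences land at low pI and basic-rich ones at high pI.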
High-Throughput Cloning and Expression Library Creation for Functional Proteomics

PubMed Central

Festa, Fernanda; Steel, Jason; Bian, Xiaofang; Labaer, Joshua

2013-01-01

The study of protein function usually requires the use of a cloned version of the gene for protein expression and functional assays. This strategy is particularly important when the information available regarding function is limited. The functional characterization of the thousands of newly identified proteins revealed by genomics requires faster methods than traditional single-gene experiments, creating the need for fast, flexible and reliable cloning systems. These collections of open reading frame (ORF) clones can be coupled with high-throughput proteomics platforms, such as protein microarrays and cell-based assays, to answer biological questions. In this tutorial we provide the background for DNA cloning, discuss the major high-throughput cloning systems (Gateway® Technology, Flexi® Vector Systems, and Creator™ DNA Cloning System) and compare them side-by-side. We also report an example of a high-throughput cloning study and its application in functional proteomics. This Tutorial is part of the International Proteomics Tutorial Programme (IPTP12). Details can be found at http://www.proteomicstutorials.org. PMID:23457047
High-throughput cloning and expression library creation for functional proteomics.

PubMed

Festa, Fernanda; Steel, Jason; Bian, Xiaofang; Labaer, Joshua

2013-05-01

The study of protein function usually requires the use of a cloned version of the gene for protein expression and functional assays. This strategy is particularly important when the information available regarding function is limited. The functional characterization of the thousands of newly identified proteins revealed by genomics requires faster methods than traditional single-gene experiments, creating the need for fast, flexible, and reliable cloning systems. These collections of ORF clones can be coupled with high-throughput proteomics platforms, such as protein microarrays and cell-based assays, to answer biological questions. In this tutorial, we provide the background for DNA cloning, discuss the major high-throughput cloning systems (Gateway® Technology, Flexi® Vector Systems, and Creator(TM) DNA Cloning System) and compare them side-by-side. We also report an example of high-throughput cloning study and its application in functional proteomics. This tutorial is part of the International Proteomics Tutorial Programme (IPTP12). © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
University of Victoria Genome British Columbia Proteomics Centre Partners with CPTAC | Office of Cancer Clinical Proteomics Research

Cancer.gov

University of Victoria Genome British Columbia Proteomics Centre, a leader in proteomic technology development, has partnered with the U.S. National Cancer Institute (NCI) to make targeted proteomic assays accessible to the community through NCI’s CPTAC Assay Portal (https://assays.cancer.gov).
Anopheles gambiae genome reannotation through synthesis of ab initio and comparative gene prediction algorithms

PubMed Central

Li, Jun; Riehle, Michelle M; Zhang, Yan; Xu, Jiannong; Oduol, Frederick; Gomez, Shawn M; Eiglmeier, Karin; Ueberheide, Beatrix M; Shabanowitz, Jeffrey; Hunt, Donald F; Ribeiro, José MC; Vernick, Kenneth D

2006-01-01

Background Complete genome annotation is a necessary tool as Anopheles gambiae researchers probe the biology of this potent malaria vector. Results We reannotate the A. gambiae genome by synthesizing comparative and ab initio sets of predicted coding sequences (CDSs) into a single set using an exon-gene-union algorithm followed by an open-reading-frame-selection algorithm. The reannotation predicts 20,970 CDSs supported by at least two lines of evidence, and it lowers the proportion of CDSs lacking start and/or stop codons to only approximately 4%. The reannotated CDS set includes a set of 4,681 novel CDSs not represented in the Ensembl annotation but with EST support, and another set of 4,031 Ensembl-supported genes that undergo major structural and, therefore, probably functional changes in the reannotated set. The quality and accuracy of the reannotation was assessed by comparison with end sequences from 20,249 full-length cDNA clones, and evaluation of mass spectrometry peptide hit rates from an A. gambiae shotgun proteomic dataset confirms that the reannotated CDSs offer a high quality protein database for proteomics. We provide a functional proteomics annotation, ReAnoXcel, obtained by analysis of the new CDSs through the AnoXcel pipeline, which allows functional comparisons of the CDS sets within the same bioinformatic platform. CDS data are available for download. Conclusion Comprehensive A. gambiae genome reannotation is achieved through a combination of comparative and ab initio gene prediction algorithms. PMID:16569258
Improving transcriptome de novo assembly by using a reference genome of a related species: Translational genomics from oil palm to coconut.

PubMed

Armero, Alix; Baudouin, Luc; Bocs, Stéphanie; This, Dominique

2017-01-01

The palms are a family of tropical origin and one of the main constituents of the ecosystems of these regions around the world. The two main species of palm represent different challenges: coconut (Cocos nucifera L.) is a source of multiple goods and services in tropical communities, while oil palm (Elaeis guineensis Jacq) is the main protagonist of the oil market. In this study, we present a workflow that exploits the comparative genomics between a target species (coconut) and a reference species (oil palm) to improve the transcriptomic data, providing a proteome useful to answer functional or evolutionary questions. This workflow reduces redundancy and fragmentation, two inherent problems of transcriptomic data, while preserving the functional representation of the target species. Our approach was validated in Arabidopsis thaliana using Arabidopsis lyrata and Capsella rubella as reference species. This analysis showed the high sensitivity and specificity of our strategy, relatively independent of the reference proteome. The workflow increased the length of protein products in A. thaliana by 13%, often allowing recovery of 100% of the protein sequence length. In addition, redundancy was reduced by a factor greater than 3. In coconut, the approach generated 29,366 proteins, 1,246 of these proteins deriving from new contigs obtained with the BRANCH software. The coconut proteome presented a functional profile similar to that observed in rice and a substantial number of metabolic pathways related to secondary metabolism. The new sequences found with BRANCH software were enriched in functions related to biotic stress. Our strategy can be used as a complementary step to de novo transcriptome assembly to get a representative proteome of a target species. The results of the current analysis are available on the website PalmComparomics (http://palm-comparomics.southgreen.fr/).
CPTAC Proteomics Data on UCSC Genome Browser | Office of Cancer Clinical Proteomics Research

Cancer.gov

The National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium scientists are working together with the University of California, Santa Cruz (UCSC) Genomics Institute to provide public access to cancer proteomics data via the UCSC Genome Browser. This effort extends accessibility of the CPTAC data to more researchers and provides an additional level of analysis to assist the cancer biology community.
Recent advances in proteomics of cereals.

PubMed

Bansal, Monika; Sharma, Madhu; Kanwar, Priyanka; Goyal, Aakash

Cereals contribute a major part of human nutrition and are considered an integral source of energy for human diets. With genomic databases already available for cereals such as rice, wheat, barley, and maize, the focus has now moved to proteome analysis. Proteomics studies involve the development of appropriate databases based on suitable separation and purification protocols, the identification of protein functions, and the confirmation of their functional networks using already available data from other sources. Tremendous progress has been made in the past decade in generating large datasets covering interactions among proteins, the protein composition of various organs and organelles, and quantitative and qualitative analysis of proteins, and in characterizing their modulation during plant development and under biotic and abiotic stresses. Proteomics platforms have been used to identify various metabolic pathways and improve our understanding of them. This article gives a brief review of efforts made by different research groups on comparative, descriptive, and functional analysis of proteomics applications achieved in cereal science so far.
Connecting genomic alterations to cancer biology with proteomics: the NCI Clinical Proteomic Tumor Analysis Consortium.

PubMed

Ellis, Matthew J; Gillette, Michael; Carr, Steven A; Paulovich, Amanda G; Smith, Richard D; Rodland, Karin K; Townsend, R Reid; Kinsinger, Christopher; Mesri, Mehdi; Rodriguez, Henry; Liebler, Daniel C

2013-10-01

The National Cancer Institute (NCI) Clinical Proteomic Tumor Analysis Consortium is applying the latest generation of proteomic technologies to genomically annotated tumors from The Cancer Genome Atlas (TCGA) program, a joint initiative of the NCI and the National Human Genome Research Institute. By providing a fully integrated accounting of DNA, RNA, and protein abnormalities in individual tumors, these datasets will illuminate the complex relationship between genomic abnormalities and cancer phenotypes, thus producing biologic insights as well as a wave of novel candidate biomarkers and therapeutic targets amenable to verification using targeted mass spectrometry methods. ©2013 AACR.

Experimental annotation of post-translational features and translated coding regions in the pathogen Salmonella Typhimurium

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ansong, Charles; Tolic, Nikola; Purvine, Samuel O.

Complete and accurate genome annotation is crucial for comprehensive and systematic studies of biological systems. For example, systems biology-oriented genome-scale modeling efforts greatly benefit from accurate annotation of protein-coding genes to develop proper functioning models. However, determining protein-coding genes for most new genomes is almost completely performed by inference, using computational predictions with significant documented error rates (> 15%). Furthermore, gene prediction programs provide no information on biologically important post-translational processing events critical for protein function. With the ability to directly measure peptides arising from expressed proteins, mass spectrometry-based proteomics approaches can be used to augment and verify coding regions of a genomic sequence and, importantly, detect post-translational processing events. In this study we utilized “shotgun” proteomics to guide accurate primary genome annotation of the bacterial pathogen Salmonella Typhimurium 14028 to facilitate a systems-level understanding of Salmonella biology. The data provides protein-level experimental confirmation for 44% of predicted protein-coding genes, suggests revisions to 48 genes assigned incorrect translational start sites, and uncovers 13 non-annotated genes missed by gene prediction programs. We also present a comprehensive analysis of post-translational processing events in Salmonella, revealing a wide range of complex chemical modifications (70 distinct modifications) and confirming more than 130 signal peptide and N-terminal methionine cleavage events in Salmonella. This study highlights several ways in which proteomics data applied during the primary stages of annotation can improve the quality of genome annotations, especially with regard to the annotation of mature protein products.
Rice proteome analysis: a step toward functional analysis of the rice genome.

PubMed

Komatsu, Setsuko; Tanaka, Naoki

2005-03-01

The technique of proteome analysis using 2-DE has the power to monitor global changes that occur in the protein complement of tissues and subcellular compartments. In this review, we describe construction of the rice proteome database, the cataloging of rice proteins, and the functional characterization of some of the proteins identified. Initially, proteins extracted from various tissues and organelles were separated by 2-DE and an image analyzer was used to construct a display or reference map of the proteins. The rice proteome database currently contains 23 reference maps based on 2-DE of proteins from different rice tissues and subcellular compartments. These reference maps comprise 13 129 rice proteins, and the amino acid sequences of 5092 of these proteins are entered in the database. Major proteins involved in growth or stress responses have been identified by using a proteomics approach and some of these proteins have unique functions. Furthermore, initial work has also begun on analyzing the phosphoproteome and protein-protein interactions in rice. The information obtained from the rice proteome database will aid in the molecular cloning of rice genes and in predicting the function of unknown proteins.
Rice proteome database: a step toward functional analysis of the rice genome.

PubMed

Komatsu, Setsuko

2005-09-01

The technique of proteome analysis using two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) has the power to monitor global changes that occur in the protein complement of tissues and subcellular compartments. In this study, the proteins of rice were cataloged, a rice proteome database was constructed, and a functional characterization of some of the identified proteins was undertaken. Proteins extracted from various tissues and subcellular compartments in rice were separated by 2D-PAGE and an image analyzer was used to construct a display of the proteins. The Rice Proteome Database contains 23 reference maps based on 2D-PAGE of proteins from various rice tissues and subcellular compartments. These reference maps comprise 13129 identified proteins, and the amino acid sequences of 5092 proteins are entered in the database. Major proteins involved in growth or stress responses were identified using the proteome approach. Some of these proteins, including a beta-tubulin, calreticulin, and ribulose-1,5-bisphosphate carboxylase/oxygenase activase in rice, have unexpected functions. The information obtained from the Rice Proteome Database will aid in cloning the genes for and predicting the function of unknown proteins.
Genetic and Proteomic Interrogation of Lower Confidence Candidate Genes Reveals Signaling Networks in beta-Catenin-Active Cancers | Office of Cancer Genomics

Cancer.gov

Genome-scale expression studies and comprehensive loss-of-function genetic screens have focused almost exclusively on the highest confidence candidate genes. Here, we describe a strategy for characterizing the lower confidence candidates identified by such approaches.
Functional analysis of proteins and protein species using shotgun proteomics and linear mathematics.

PubMed

Hoehenwarter, Wolfgang; Chen, Yanmei; Recuenco-Munoz, Luis; Wienkoop, Stefanie; Weckwerth, Wolfram

2011-07-01

Covalent post-translational modification of proteins is the primary modulator of protein function in the cell. It greatly expands the functional potential of the proteome compared to the genome. In the past few years, shotgun proteomics-based research, in which the proteome is digested into peptides prior to mass spectrometric analysis, has been prolific in this area. It has determined the kinetics of tens of thousands of sites of covalent modification on an equally large number of proteins under various biological conditions and uncovered a transiently active regulatory network that extends into diverse branches of cellular physiology. In this review, we discuss this work in light of the concept of protein speciation, which emphasizes the entire post-translationally modified molecule and its interactions, and not just the modification site, as the functional entity. Sometimes, particularly when considering complex multisite modification, all of the modified molecular species involved in the investigated condition (the protein species) must be completely resolved for full understanding. We present a mathematical technique that delivers a good approximation for shotgun proteomics data.
The Schistosoma mansoni phylome: using evolutionary genomics to gain insight into a parasite's biology.

PubMed

Silva, Larissa Lopes; Marcet-Houben, Marina; Nahum, Laila Alves; Zerlotini, Adhemar; Gabaldón, Toni; Oliveira, Guilherme

2012-11-13

Schistosoma mansoni is one of the causative agents of schistosomiasis, a neglected tropical disease that affects about 237 million people worldwide. Despite recent efforts, we still lack a general understanding of the relevant host-parasite interactions, and the possible treatments are limited by the emergence of resistant strains and the absence of a vaccine. The S. mansoni genome has been completely sequenced and is still under continuous annotation. Nevertheless, more than 45% of the encoded proteins remain without experimental characterization or even functional prediction. To improve our knowledge regarding the biology of this parasite, we conducted a proteome-wide evolutionary analysis to provide a broad view of S. mansoni proteome evolution and to improve its functional annotation. Using a phylogenomic approach, we reconstructed the S. mansoni phylome, which comprises the evolutionary histories of all parasite proteins and their homologs across 12 other organisms. The analysis of a total of 7,964 phylogenies allowed a deeper understanding of genomic complexity and evolutionary adaptations to a parasitic lifestyle. In particular, the identification of lineage-specific gene duplications pointed to the diversification of several protein families that are relevant for host-parasite interaction, including proteases, tetraspanins, fucosyltransferases, venom allergen-like proteins, and tegumental-allergen-like proteins. In addition to the evolutionary knowledge, the phylome data enabled us to automatically re-annotate 3,451 proteins through a phylogenetic-based approach rather than solely sequence similarity searches. To allow further exploitation of this valuable data, all information has been made available at PhylomeDB (http://www.phylomedb.org). In this study, we used an evolutionary approach to assess S. mansoni parasite biology, improve genome/proteome functional annotation, and provide insights into host-parasite interactions. 
Taking advantage of a proteome-wide perspective rather than focusing on individual proteins, we identified that this parasite has experienced specific gene duplication events, particularly affecting genes that are potentially related to the parasitic lifestyle. These innovations may be related to the mechanisms that protect S. mansoni against host immune responses, which are important adaptations for parasite survival in a potentially hostile environment. Continuing this work, a comparative analysis involving genomic, transcriptomic, and proteomic data from other helminth parasites, other parasites, and vectors will supply more information regarding the parasite's biology as well as host-parasite interactions.
Interaction Analysis through Proteomic Phage Display

PubMed Central

2014-01-01

Phage display is a powerful technique for profiling specificities of peptide binding domains. The method is suited for the identification of high-affinity ligands with inhibitor potential when using highly diverse combinatorial peptide phage libraries. Such experiments further provide consensus motifs for genome-wide scanning of ligands of potential biological relevance. A complementary but considerably less explored approach is to display expression products of genomic DNA, cDNA, open reading frames (ORFs), or oligonucleotide libraries designed to encode defined regions of a target proteome on phage particles. One of the main applications of such proteomic libraries has been the elucidation of antibody epitopes. This review is focused on the use of proteomic phage display to uncover protein-protein interactions of potential relevance for cellular function. The method is particularly suited for the discovery of interactions between peptide binding domains and their targets. We discuss the largely unexplored potential of this method in the discovery of domain-motif interactions of potential biological relevance. PMID:25295249
Proteomic and genomic studies of non-alcoholic fatty liver disease - clues in the pathogenesis

PubMed Central

Lim, Jun Wei; Dillon, John; Miller, Michael

2014-01-01

Non-alcoholic fatty liver disease (NAFLD) is a widely prevalent hepatic disorder that covers a wide spectrum of liver pathology. NAFLD is strongly associated with liver inflammation, metabolic hyperlipidaemia and insulin resistance. Frequently, NAFLD has been considered as the hepatic manifestation of metabolic syndrome. The pathophysiology of NAFLD has not been fully elucidated. Some patients can remain in the stage of simple steatosis, which generally is a benign condition, whereas others can develop liver inflammation and progress into non-alcoholic steatohepatitis, fibrosis, cirrhosis and hepatocellular carcinoma. The mechanism behind the progression is still not fully understood. Much ongoing proteomic research has focused on discovering unbiased circulating biochemical markers to allow early detection and treatment of NAFLD. Comprehensive genomic studies have also begun to provide new insights into gene polymorphisms to understand patient-disease variations. Therefore, NAFLD is considered a complex and multifactorial disease phenotype resulting from environmental exposures acting on a susceptible polygenic background. This paper reviews the current status of proteomic and genomic studies that have contributed to the understanding of NAFLD pathogenesis. For the proteomics section, this review highlights functional proteins that are involved in: (1) transportation; (2) metabolic pathway; (3) acute phase reaction; (4) anti-inflammatory; (5) extracellular matrix; and (6) immune system. For the genomic studies, this review discusses genes that are involved in: (1) lipolysis; (2) adipokines; and (3) cytokine production. PMID:25024592
Comparison of theoretical proteomes: Identification of COGs with conserved and variable pI within the multimodal pI distribution

PubMed Central

Nandi, Soumyadeep; Mehra, Nipun; Lynn, Andrew M; Bhattacharya, Alok

2005-01-01

Background: Theoretical proteome plots, generated by plotting theoretical isoelectric points (pI) against molecular masses of all proteins encoded by the genome, show a multimodal distribution for pI. This multimodal distribution is an effect of allowed combinations of the charged amino acids, and not due to evolutionary causes. The variation in this distribution can be correlated to the organism's ecological niche. Contributions to this variation may be mapped to individual proteins by studying the variation in pI of orthologs across microorganism genomes. Results: The distribution of ortholog pI values showed trimodal distributions for all prokaryotic genomes analyzed, similar to whole proteome plots. Pairwise analysis of pI variation shows that a few COGs are conserved within, but most vary between, the acidic and basic regions of the distribution, while molecular mass is more highly conserved. At the level of functional grouping of orthologs, five groups vary significantly from the population of orthologs, which is attributed to either conservation at the level of sequences or a bias for either positively or negatively charged residues contributing to the function. Individual COGs conserved in both the acidic and basic regions of the trimodal distribution are identified, and orthologs that best represent the variation in levels of the acidic and basic regions are listed. Conclusion: The analysis of pI distribution by using orthologs provides a basis for resolution of theoretical proteome comparison at the level of individual proteins. Orthologs identified that significantly vary between the major acidic and basic regions may be used as representative of the variation of the entire proteome. PMID:16150155
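A theoretical pI of the kind plotted in such analyses can be estimated from sequence alone. The sketch below is one common way to do it, not necessarily the procedure used in the paper: it sums Henderson-Hasselbalch charges over the ionizable groups and finds the pH of zero net charge by bisection. The pKa values are illustrative textbook-style numbers and are an assumption here; published pKa sets differ, and the resulting pI shifts with the set chosen.

```python
# Illustrative pKa values (an assumption; several published sets exist).
PK_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}
PK_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
PK_N_TERM = 9.0
PK_C_TERM = 2.3


def net_charge(sequence, ph):
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    positive = 1.0 / (1.0 + 10 ** (ph - PK_N_TERM))
    negative = 1.0 / (1.0 + 10 ** (PK_C_TERM - ph))
    for residue in sequence:
        if residue in PK_POSITIVE:
            positive += 1.0 / (1.0 + 10 ** (ph - PK_POSITIVE[residue]))
        elif residue in PK_NEGATIVE:
            negative += 1.0 / (1.0 + 10 ** (PK_NEGATIVE[residue] - ph))
    return positive - negative


def isoelectric_point(sequence, tolerance=1e-4):
    """Bisect on pH 0-14; net charge falls monotonically as pH rises."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


# Hypothetical toy peptides: one acidic, one basic.
for peptide in ("DDDDEEEE", "KKKKRRRR"):
    print(peptide, round(isoelectric_point(peptide), 2))
```

Because the charge depends only on the counts of a few ionizable residues, proteins rich in D and E fall in the acidic region and proteins rich in K and R in the basic region, which is the kind of compositional effect the abstract describes.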
Draft Genome Sequences of Human Pathogenic Fungus Geomyces pannorum Sensu Lato and Bat White Nose Syndrome Pathogen Geomyces (Pseudogymnoascus) destructans.

PubMed

Chibucos, Marcus C; Crabtree, Jonathan; Nagaraj, Sushma; Chaturvedi, Sudha; Chaturvedi, Vishnu

2013-12-19

We report the draft genome sequences of Geomyces pannorum sensu lato and Geomyces (Pseudogymnoascus) destructans. G. pannorum has a larger proteome than G. destructans, containing more proteins with ascribed enzymatic functions. This dichotomy in the genomes of related psychrophilic fungi is a valuable target for defining their distinct saprobic and pathogenic attributes.
Proteome Studies of Filamentous Fungi

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baker, Scott E.; Panisko, Ellen A.

2011-04-20

The continued fast pace of fungal genome sequence generation has enabled proteomic analysis of a wide variety of organisms that span the breadth of the Kingdom Fungi. There is some phylogenetic bias to the current catalog of fungi with reasonable DNA sequence databases (genomic or EST) that could be analyzed at a global proteomic level. However, the rapid development of next generation sequencing platforms has lowered the cost of genome sequencing such that in the near future, having a genome sequence will no longer be a time or cost bottleneck for downstream proteomic (and transcriptomic) analyses. High throughput, non-gel-based proteomics offers a snapshot of proteins present in a given sample at a single point in time. There are a number of different variations on the general method and technologies for identifying peptides in a given sample. We present a method that can serve as a “baseline” for proteomic studies of fungi.
Proteome studies of filamentous fungi.

PubMed

Baker, Scott E; Panisko, Ellen A

2011-01-01

The continued fast pace of fungal genome sequence generation has enabled proteomic analysis of a wide variety of organisms that span the breadth of the Kingdom Fungi. There is some phylogenetic bias to the current catalog of fungi with reasonable DNA sequence databases (genomic or EST) that could be analyzed at a global proteomic level. However, the rapid development of next generation sequencing platforms has lowered the cost of genome sequencing such that in the near future, having a genome sequence will no longer be a time or cost bottleneck for downstream proteomic (and transcriptomic) analyses. High throughput, nongel-based proteomics offers a snapshot of proteins present in a given sample at a single point in time. There are a number of variations on the general methods and technologies for identifying peptides in a given sample. We present a method that can serve as a "baseline" for proteomic studies of fungi.
Enhancement of Environmental Hazard Degradation in the Presence of Lignin: a Proteomics Study

DOE PAGES

Sun, Su; Xie, Shangxian; Cheng, Yanbing; ...

2017-09-12

Proteomics studies of fungal systems have progressed dramatically based on the availability of more fungal genome sequences in recent years. Different proteomics strategies have been applied toward characterization of fungal proteome and revealed important gene functions and proteome dynamics. Presented here is the application of shot-gun proteomic technology to study the bio-remediation of environmental hazards by white-rot fungus. Lignin, a naturally abundant component of the plant biomass, is discovered to promote the degradation of Azo dye by white-rot fungus Irpex lacteus CD2 in the lignin/dye/fungus system. Shotgun proteomics technique was used to understand degradation mechanism at the protein level for the lignin/dye/fungus system. Our proteomics study can identify about two thousand proteins (one third of the predicted white-rot fungal proteome) in a single experiment, as one of the most powerful proteomics platforms to study the fungal system to date. The study shows a significant enrichment of oxidoreduction functional category under the dye/lignin combined treatment. An in vitro validation is performed and supports our hypothesis that the synergy of Fenton reaction and manganese peroxidase might play an important role in DR5B dye degradation. The results could guide the development of effective bioremediation strategies and efficient lignocellulosic biomass conversion.
Enhancement of Environmental Hazard Degradation in the Presence of Lignin: a Proteomics Study.

PubMed

Sun, Su; Xie, Shangxian; Cheng, Yanbing; Yu, Hongbo; Zhao, Honglu; Li, Muzi; Li, Xiaotong; Zhang, Xiaoyu; Yuan, Joshua S; Dai, Susie Y

2017-09-12

Proteomics studies of fungal systems have progressed dramatically based on the availability of more fungal genome sequences in recent years. Different proteomics strategies have been applied toward characterization of fungal proteome and revealed important gene functions and proteome dynamics. Presented here is the application of shot-gun proteomic technology to study the bio-remediation of environmental hazards by white-rot fungus. Lignin, a naturally abundant component of the plant biomass, is discovered to promote the degradation of Azo dye by white-rot fungus Irpex lacteus CD2 in the lignin/dye/fungus system. Shotgun proteomics technique was used to understand degradation mechanism at the protein level for the lignin/dye/fungus system. Our proteomics study can identify about two thousand proteins (one third of the predicted white-rot fungal proteome) in a single experiment, as one of the most powerful proteomics platforms to study the fungal system to date. The study shows a significant enrichment of oxidoreduction functional category under the dye/lignin combined treatment. An in vitro validation is performed and supports our hypothesis that the synergy of Fenton reaction and manganese peroxidase might play an important role in DR5B dye degradation. The results could guide the development of effective bioremediation strategies and efficient lignocellulosic biomass conversion.
Quantitative proteomics in Giardia duodenalis-Achievements and challenges.

PubMed

Emery, Samantha J; Lacey, Ernest; Haynes, Paul A

2016-08-01

Giardia duodenalis (syn. G. lamblia and G. intestinalis) is a protozoan parasite of vertebrates and a major contributor to the global burden of diarrheal diseases and gastroenteritis. The publication of multiple genome sequences in the G. duodenalis species complex has provided important insights into parasite biology, and made post-genomic technologies, including proteomics, significantly more accessible. The aims of proteomics are to identify and quantify proteins present in a cell, and assign functions to them within the context of dynamic biological systems. In Giardia, proteomics in the post-genomic era has transitioned from reliance on gel-based systems to utilisation of a diverse array of techniques based on bottom-up LC-MS/MS technologies. Together, these have generated crucial foundations for subcellular proteomes, elucidated intra- and inter-assemblage isolate variation, and identified pathways and markers in differentiation, host-parasite interactions and drug resistance. However, in Giardia, proteomics remains an emerging field, with considerable shortcomings evident from the published research. These include a bias towards assemblage A, a lack of emphasis on quantitative analytical techniques, and limited information on post-translational protein modifications. Additionally, there are multiple areas of research for which proteomic data is not available to add value to published transcriptomic data. The challenge of amalgamating data in the systems biology paradigm necessitates the further generation of large, high-quality quantitative datasets to accurately model parasite biology. This review surveys the current proteomic research available for Giardia and evaluates their technical and quantitative approaches, while contextualising their biological insights into parasite pathology, isolate variation and eukaryotic evolution. 
Finally, we propose areas of priority for the generation of future proteomic data to explore fundamental questions in Giardia, including the analysis of post-translational modifications, and the design of MS-based assays for validation of differentially expressed proteins in large datasets. Copyright © 2016 Elsevier B.V. All rights reserved.
New Markers for Predicting Fertility of the Male Gametes in the Post Genomic Age.

PubMed

Dipresa, Savina; De Toni, Luca; Foresta, Carlo; Garolla, Andrea

2018-04-18

A number of tests have been proposed to assess male fertility potential, ranging from routine light-microscopic evaluation of semen samples to screening tests for DNA integrity that look for sperm chromatin abnormalities. Spermatozoa are extremely differentiated cells that have critical functions in embryo development and heredity, in addition to delivering a haploid paternal genome to the oocyte. To achieve this, certain requirements must always be met. The ability of spermatozoa to perform their reproductive function is established during spermatogenesis, a highly specialized process that depends on multiple factors affecting male fertility. In the past 30 years, large-scale analyses of the transcriptome and of genome expression in mammals have generated a large amount of information on the numerous biomolecules involved in spermatogenesis and male germ cell reproductive function. The sperm proteome represents the protein content that spermatozoa need to survive and work correctly, and modifications of the sperm proteome play a role in determining functional changes that lead to decreased reproductive competence in affected spermatozoa. The post-genomic approach combines different methodologies, bringing together testicular transcriptome studies, protein compositional analysis and metabolomic findings on human spermatozoa. Copyright © Bentham Science Publishers.
Platelet proteomics: from discovery to diagnosis.

PubMed

Looße, Christina; Swieringa, Frauke; Heemskerk, Johan W M; Sickmann, Albert; Lorenz, Christin

2018-05-22

Platelets are the smallest cells within the circulating blood with key roles in physiological haemostasis and pathological thrombosis regulated by the onset of activating/inhibiting processes via receptor responses and signalling cascades. Areas covered: Proteomics as well as genomic approaches have been fundamental in identifying and quantifying potential targets for future diagnostic strategies in the prevention of bleeding and thrombosis, and uncovering the complexity of platelet functions in health and disease. In this article, we provide a critical overview on current functional tests used in diagnostics and the future perspectives for platelet proteomics in clinical applications. Expert commentary: Proteomics represents a valuable tool for the identification of patients with diverse platelet associated defects. In-depth validation of identified biomarkers, e.g. receptors, signalling proteins, post-translational modifications, in large cohorts is decisive for translation into routine clinical diagnostics.
Scientific Approaches | Office of Cancer Clinical Proteomics Research

Cancer.gov

CPTAC employs two complementary scientific approaches, a "Targeting Genome to Proteome" (Targeting G2P) approach and a "Mapping Proteome to Genome" (Mapping P2G) approach, in order to address biological questions from data generated on a sample.
Large-scale label-free quantitative proteomics of the pea aphid-Buchnera symbiosis.

PubMed

Poliakov, Anton; Russell, Calum W; Ponnala, Lalit; Hoops, Harold J; Sun, Qi; Douglas, Angela E; van Wijk, Klaas J

2011-06-01

Many insects are nutritionally dependent on symbiotic microorganisms that have tiny genomes and are housed in specialized host cells called bacteriocytes. The obligate symbiosis between the pea aphid Acyrthosiphon pisum and the γ-proteobacterium Buchnera aphidicola (only 584 predicted proteins) is particularly amenable to molecular analysis because the genomes of both partners have been sequenced. To better define the symbiotic relationship between this aphid and Buchnera, we used large-scale, high accuracy tandem mass spectrometry (nanoLC-LTQ-Orbitrap) to identify aphid and Buchnera proteins in the whole aphid body, purified bacteriocytes, isolated Buchnera cells and the residual bacteriocyte fraction. More than 1900 aphid and 400 Buchnera proteins were identified. All enzymes in amino acid metabolism annotated in the Buchnera genome were detected, reflecting the high (68%) coverage of the proteome and supporting the core function of Buchnera in the aphid symbiosis. Transporters mediating the transport of predicted metabolites were present in the bacteriocyte. Label-free spectral counting, combined with hierarchical clustering, allowed us to define the quantitative distribution of a subset of these proteins across both symbiotic partners, yielding no evidence for the selective transfer of protein among the partners in either direction. This is the first quantitative proteome analysis of bacteriocyte symbiosis, providing a wealth of information about molecular function of both the host cell and bacterial symbiont.
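Two quantities in this abstract lend themselves to a short worked sketch. The first is proteome coverage: taking 400 identified proteins as a lower bound out of 584 predicted gives about 68.5%, in line with the reported 68%. The second is label-free spectral counting. The abstract does not say how counts were normalized, so the sketch below uses the normalized spectral abundance factor (NSAF) purely as one widely used option, and all protein names, counts and lengths in it are hypothetical.

```python
def nsaf(spectral_counts, lengths):
    """Normalized spectral abundance factor for each protein.

    Each count is divided by protein length (longer proteins yield more
    peptides), then the values are scaled so that they sum to 1.
    """
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}


# Hypothetical spectral counts and lengths in residues (not study data).
counts = {"protA": 120, "protB": 30, "protC": 30}
lengths = {"protA": 400, "protB": 100, "protC": 300}
abundance = nsaf(counts, lengths)

# Coverage from the figures quoted in the abstract (400 is a lower bound).
coverage = 400 / 584

print({p: round(v, 3) for p, v in abundance.items()})
print(f"coverage ~ {coverage:.1%}")
```

In this toy example protA and protB end up with the same relative abundance even though protA has four times the spectral count, because protA is also four times longer. Length normalization of this kind is what makes spectral counts comparable across proteins before clustering.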

Deorphanizing the human transmembrane genome: A landscape of uncharacterized membrane proteins.

PubMed

Babcock, Joseph J; Li, Min

2014-01-01

The sequencing of the human genome has fueled the last decade of work to functionally characterize genome content. An important subset of genes encodes membrane proteins, which are the targets of many drugs. They reside in lipid bilayers, restricting their endogenous activity to a relatively specialized biochemical environment. Without a reference phenotype, the application of systematic screens to profile candidate membrane proteins is not immediately possible. Bioinformatics has begun to show its effectiveness in focusing the functional characterization of orphan proteins of a particular functional class, such as channels or receptors. Here we discuss integration of experimental and bioinformatics approaches for characterizing the orphan membrane proteome. By analyzing the human genome, a landscape reference for the human transmembrane genome is provided.
Computational functional genomics-based approaches in analgesic drug discovery and repurposing.

PubMed

Lippmann, Catharina; Kringel, Dario; Ultsch, Alfred; Lötsch, Jörn

2018-06-01

Persistent pain is a major healthcare problem affecting a fifth of adults worldwide with still limited treatment options. The search for new analgesics increasingly includes the novel research area of functional genomics, which combines data derived from various processes related to DNA sequence, gene expression or protein function and uses advanced methods of data mining and knowledge discovery with the goal of understanding the relationship between the genome and the phenotype. Its use in drug discovery and repurposing for analgesic indications has so far been performed using knowledge discovery in gene function and drug target-related databases; next-generation sequencing; and functional proteomics-based approaches. Here, we discuss recent efforts in functional genomics-based approaches to analgesic drug discovery and repurposing and highlight the potential of computational functional genomics in this field including a demonstration of the workflow using a novel R library 'dbtORA'.
Draft Genome Sequences of Human Pathogenic Fungus Geomyces pannorum Sensu Lato and Bat White Nose Syndrome Pathogen Geomyces (Pseudogymnoascus) destructans

PubMed Central

Crabtree, Jonathan; Nagaraj, Sushma; Chaturvedi, Sudha

2013-01-01

We report the draft genome sequences of Geomyces pannorum sensu lato and Geomyces (Pseudogymnoascus) destructans. G. pannorum has a larger proteome than G. destructans, containing more proteins with ascribed enzymatic functions. This dichotomy in the genomes of related psychrophilic fungi is a valuable target for defining their distinct saprobic and pathogenic attributes. PMID:24356829
CPTAC researchers report first large-scale integrated proteomic and genomic analysis of a human cancer | Office of Cancer Clinical Proteomics Research

Cancer.gov

Investigators from the National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium (CPTAC) who comprehensively analyzed 95 human colorectal tumor samples, have determined how gene alterations identified in previous analyses of the same samples are expressed at the protein level. The integration of proteomic and genomic data, or proteogenomics, provides a more comprehensive view of the biological features that drive cancer than genomic analysis alone and may help identify the most important targets for cancer detection and intervention.
Reductive evolution of architectural repertoires in proteomes and the birth of the tripartite world

PubMed Central

Wang, Minglei; Yafremava, Liudmila S.; Caetano-Anollés, Derek; Mittenthal, Jay E.; Caetano-Anollés, Gustavo

2007-01-01

The repertoire of protein architectures in proteomes is evolutionarily conserved and capable of preserving an accurate record of genomic history. Here we use a census of protein architecture in 185 genomes that have been fully sequenced to generate genome-based phylogenies that describe the evolution of the protein world at fold (F) and fold superfamily (FSF) levels. The patterns of representation of F and FSF architectures over evolutionary history suggest three epochs in the evolution of the protein world: (1) architectural diversification, where members of an architecturally rich ancestral community diversified their protein repertoire; (2) superkingdom specification, where superkingdoms Archaea, Bacteria, and Eukarya were specified; and (3) organismal diversification, where F and FSF specific to relatively small sets of organisms appeared as the result of diversification of organismal lineages. Functional annotation of FSF along these architectural chronologies revealed patterns of discovery of biological function. Most importantly, the analysis identified an early and extensive differential loss of architectures occurring primarily in Archaea that segregates the archaeal lineage from the ancient community of organisms and establishes the first organismal divide. Reconstruction of phylogenomic trees of proteomes reflects the timeline of architectural diversification in the emerging lineages. Thus, Archaea undertook a minimalist strategy using only a small subset of the full architectural repertoire and then crystallized into a diversified superkingdom late in evolution. Our analysis also suggests a communal ancestor to all life that was molecularly complex and adopted genomic strategies currently present in Eukarya. PMID:17908824
A Proteogenomic Approach to Understanding MYC Function in Metastatic Medulloblastoma Tumors.

PubMed

Staal, Jerome A; Pei, Yanxin; Rood, Brian R

2016-10-19

Brain tumors are the leading cause of cancer-related deaths in children, and medulloblastoma is the most prevalent malignant childhood/pediatric brain tumor. Providing effective treatment for these cancers, with minimal damage to the still-developing brain, remains one of the greatest challenges faced by clinicians. Understanding the diverse events driving tumor formation, maintenance, progression, and recurrence is necessary for identifying novel targeted therapeutics and improving survival of patients with this disease. Genomic copy number alteration data, together with clinical studies, identifies c-MYC amplification as an important risk factor associated with the most aggressive forms of medulloblastoma with marked metastatic potential. Yet despite this, very little is known regarding the impact of such genomic abnormalities upon the functional biology of the tumor cell. We discuss here how recent advances in quantitative proteomic techniques are now providing new insights into the functional biology of these aggressive tumors, as illustrated by the use of proteomics to bridge the gap between the genotype and phenotype in the case of c-MYC -amplified/associated medulloblastoma. These integrated proteogenomic approaches now provide a new platform for understanding cancer biology by providing a functional context to frame genomic abnormalities.
Constructing Proteome Reference Map of the Porcine Jejunal Cell Line (IPEC-J2) by Label-Free Mass Spectrometry.

PubMed

Kim, Sang Hoon; Pajarillo, Edward Alain B; Balolong, Marilen P; Lee, Ji Yoon; Kang, Dae-Kyung

2016-06-28

In this study, the global proteome of the IPEC-J2 cell line was evaluated using ultra-high performance liquid chromatography coupled to a quadrupole Q Exactive™ Orbitrap mass spectrometer. Proteins were isolated from highly confluent IPEC-J2 cells in biological replicates and analyzed by label-free mass spectrometry prior to matching against a porcine genomic dataset. The results identified 1,517 proteins, accounting for 7.35% of all genes in the porcine genome. The highly abundant proteins detected, such as actin, annexin A2, and AHNAK nucleoprotein, are involved in structural integrity, signaling mechanisms, and cellular homeostasis. The high abundance of heat shock proteins indicated their significance in cellular defenses, barrier function, and gut homeostasis. Pathway analysis and annotation using the Kyoto Encyclopedia of Genes and Genomes database resulted in a putative protein network map of the regulation of immunological responses and structural integrity in the cell line. The comprehensive proteome analysis of IPEC-J2 cells provides fundamental insights into overall protein expression and pathway dynamics that might be useful in cell adhesion studies and immunological applications.
Computational clustering for viral reference proteomes

PubMed Central

Chen, Chuming; Huang, Hongzhan; Mazumder, Raja; Natale, Darren A.; McGarvey, Peter B.; Zhang, Jian; Polson, Shawn W.; Wang, Yuqi; Wu, Cathy H.

2016-01-01

Motivation: The enormous number of redundant sequenced genomes has hindered efforts to analyze and functionally annotate proteins. As the taxonomy of viruses is not uniformly defined, viral proteomes pose special challenges in this regard. Grouping viruses based on the similarity of their proteins at proteome scale can normalize against potential taxonomic nomenclature anomalies. Results: We present Viral Reference Proteomes (Viral RPs), which are computed from complete virus proteomes within UniProtKB. Viral RPs based on 95, 75, 55, 35 and 15% co-membership in proteome similarity based clusters are provided. Comparison of our computational Viral RPs with UniProt’s curator-selected Reference Proteomes indicates that the two sets are consistent and complementary. Furthermore, each Viral RP represents a cluster of virus proteomes that was consistent with virus or host taxonomy. We provide BLASTP search and FTP download of Viral RP protein sequences, and a browser to facilitate the visualization of Viral RPs. Availability and implementation: http://proteininformationresource.org/rps/viruses/ Contact: chenc@udel.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153712
Improving transcriptome de novo assembly by using a reference genome of a related species: Translational genomics from oil palm to coconut

PubMed Central

Armero, Alix; Bocs, Stéphanie; This, Dominique

2017-01-01

The palms are a family of tropical origin and one of the main constituents of the ecosystems of these regions around the world. The two main species of palm represent different challenges: coconut (Cocos nucifera L.) is a source of multiple goods and services in tropical communities, while oil palm (Elaeis guineensis Jacq) is the main protagonist of the oil market. In this study, we present a workflow that exploits the comparative genomics between a target species (coconut) and a reference species (oil palm) to improve the transcriptomic data, providing a proteome useful to answer functional or evolutionary questions. This workflow reduces redundancy and fragmentation, two inherent problems of transcriptomic data, while preserving the functional representation of the target species. Our approach was validated in Arabidopsis thaliana using Arabidopsis lyrata and Capsella rubella as reference species. This analysis showed the high sensitivity and specificity of our strategy, relatively independent of the reference proteome. The workflow increased the length of protein products in A. thaliana by 13%, often allowing recovery of 100% of the protein sequence length. In addition, redundancy was reduced by a factor greater than 3. In coconut, the approach generated 29,366 proteins, 1,246 of which derive from new contigs obtained with the BRANCH software. The coconut proteome presented a functional profile similar to that observed in rice and an important number of metabolic pathways related to secondary metabolism. The new sequences found with BRANCH software were enriched in functions related to biotic stress. Our strategy can be used as a complementary step to de novo transcriptome assembly to get a representative proteome of a target species. The results of the current analysis are available on the website PalmComparomics (http://palm-comparomics.southgreen.fr/). PMID:28334050
Proteomic analysis of bovine nucleolus.

PubMed

Patel, Amrutlal K; Olson, Doug; Tikoo, Suresh K

2010-09-01

The nucleolus is the most prominent subnuclear structure and performs a wide variety of functions in eukaryotic cellular processes. In order to understand the structural and functional role of the nucleoli in bovine cells, we analyzed the proteomic composition of the bovine nucleoli. The nucleoli were isolated from Madin Darby bovine kidney cells and subjected to proteomic analysis by LC-MS/MS after fractionation by SDS-PAGE and strong cation exchange chromatography. Analysis of the data using the Mascot database search and the GPM database search identified 311 proteins in the bovine nucleoli, including 22 proteins not previously identified in the proteomic analysis of human nucleoli. Analysis of the identified proteins using the GoMiner software suggested that the bovine nucleoli contained proteins involved in ribosomal biogenesis, cell cycle control, transcriptional, translational and post-translational regulation, transport, and structural organization. Copyright © 2010 Beijing Genomics Institute. Published by Elsevier Ltd. All rights reserved.
Advances in Proteomics of Mycobacterium leprae.

PubMed

Parkash, O; Singh, B P

2012-04-01

Although Mycobacterium leprae was the first bacterial pathogen identified as causing human disease, it remains one of the few that is non-cultivable. Understanding the biology of M. leprae is one of the primary challenges in current leprosy research. Genomics has been extremely valuable; nonetheless, functional proteins are ultimately responsible for controlling most aspects of cellular functions, which in turn could facilitate parasitizing the host. Furthermore, bacterial proteins provide targets for most of the vaccines and immunodiagnostic tools. Better understanding of the proteomics of M. leprae could also help in developing new drugs against M. leprae. During the past nearly 15 years, there have been several developments towards the identification of M. leprae proteins employing contemporary proteomics tools. In this review, we discuss the knowledge gained on the biology and pathogenesis of M. leprae from current proteomic studies. © 2012 The Authors. Scandinavian Journal of Immunology © 2012 Blackwell Publishing Ltd.
The chordate proteome history database.

PubMed

Levasseur, Anthony; Paganini, Julien; Dainat, Jacques; Thompson, Julie D; Poch, Olivier; Pontarotti, Pierre; Gouret, Philippe

2012-01-01

The chordate proteome history database (http://ioda.univ-provence.fr) comprises some 20,000 evolutionary analyses of proteins from chordate species. Our main objective was to characterize and study the evolutionary histories of the chordate proteome, and in particular to detect genomic events and automatic functional searches. Firstly, phylogenetic analyses based on high quality multiple sequence alignments and a robust phylogenetic pipeline were performed for the whole protein and for each individual domain. Novel approaches were developed to identify orthologs/paralogs, and predict gene duplication/gain/loss events and the occurrence of new protein architectures (domain gains, losses and shuffling). These important genetic events were localized on the phylogenetic trees and on the genomic sequence. Secondly, the phylogenetic trees were enhanced by the creation of phylogroups, whereby groups of orthologous sequences created using OrthoMCL were corrected based on the phylogenetic trees; gene family size and gene gain/loss in a given lineage could be deduced from the phylogroups. For each ortholog group obtained from the phylogenetic or the phylogroup analysis, functional information and expression data can be retrieved. Database searches can be performed easily using biological objects (protein identifier, keyword or domain), but can also be based on events; e.g., domain exchange events can be retrieved. To our knowledge, this is the first database that links group clustering, phylogeny and automatic functional searches along with the detection of important events occurring during genome evolution, such as the appearance of a new domain architecture.
Proteogenomics | Office of Cancer Clinical Proteomics Research

Cancer.gov

Proteogenomics, or the integration of proteomics with genomics and transcriptomics, is an emerging approach that promises to advance basic, translational and clinical research. By combining genomic and proteomic information, leading scientists are gaining new insights due to a more complete and unified understanding of complex biological processes.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Jing; Ma, Zihao; Carr, Steven A.

Coexpression of mRNAs under multiple conditions is commonly used to infer cofunctionality of their gene products despite well-known limitations of this “guilt-by-association” (GBA) approach. Recent advancements in mass spectrometry-based proteomic technologies have enabled global expression profiling at the protein level; however, whether proteome profiling data can outperform transcriptome profiling data for coexpression based gene function prediction has not been systematically investigated. Here, we address this question by constructing and analyzing mRNA and protein coexpression networks for three cancer types with matched mRNA and protein profiling data from The Cancer Genome Atlas (TCGA) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC). Our analyses revealed a marked difference in wiring between the mRNA and protein coexpression networks. Whereas protein coexpression was driven primarily by functional similarity between coexpressed genes, mRNA coexpression was driven by both cofunction and chromosomal colocalization of the genes. Functionally coherent mRNA modules were more likely to have their edges preserved in corresponding protein networks than functionally incoherent mRNA modules. Proteomic data strengthened the link between gene expression and function for at least 75% of Gene Ontology (GO) biological processes and 90% of KEGG pathways. A web application Gene2Net (http://cptac.gene2net.org) developed based on the three protein coexpression networks revealed novel gene-function relationships, such as linking ERBB2 (HER2) to lipid biosynthetic process in breast cancer, identifying PLG as a new gene involved in complement activation, and identifying AEBP1 as a new epithelial-mesenchymal transition (EMT) marker. Our results demonstrate that proteome profiling outperforms transcriptome profiling for coexpression based gene function prediction. Proteomics should be integrated if not preferred in gene function and human disease studies.
Molecular & Cellular Proteomics 16: 10.1074/mcp.M116.060301, 121–134, 2017.
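The comparison of mRNA and protein coexpression networks described in the record above can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not name the correlation measure or the cutoff, so Pearson correlation, the 0.8 threshold, the toy profiles and all function and gene names below are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no coexpression signal
    return cov / (sx * sy)


def coexpression_edges(profiles, threshold=0.8):
    """Return gene pairs whose correlation reaches the (assumed) threshold."""
    edges = set()
    for g1, g2 in combinations(sorted(profiles), 2):
        if pearson(profiles[g1], profiles[g2]) >= threshold:
            edges.add((g1, g2))
    return edges


def edge_preservation(mrna_edges, protein_edges):
    """Fraction of mRNA coexpression edges also present at the protein level."""
    if not mrna_edges:
        return 0.0
    return len(mrna_edges & protein_edges) / len(mrna_edges)


# Hypothetical toy data: three genes measured in four matched samples.
mrna = {"geneA": [1, 2, 3, 4], "geneB": [2, 4, 6, 8], "geneC": [4, 3, 2, 1]}
protein = {"geneA": [1, 2, 3, 4], "geneB": [4, 3, 2, 1], "geneC": [1, 2, 3, 5]}

mrna_net = coexpression_edges(mrna)
protein_net = coexpression_edges(protein)
print(mrna_net, protein_net, edge_preservation(mrna_net, protein_net))
```

In this toy case the single mRNA edge is not preserved at the protein level, which mirrors the kind of rewiring between the two networks that the abstract reports.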
Curated protein information in the Saccharomyces genome database.

PubMed

Hellerstedt, Sage T; Nash, Robert S; Weng, Shuai; Paskov, Kelley M; Wong, Edith D; Karra, Kalpana; Engel, Stacia R; Cherry, J Michael

2017-01-01

Due to recent advancements in the production of experimental proteomic data, the Saccharomyces genome database (SGD; www.yeastgenome.org) has been expanding our protein curation activities to make new data types available to our users. Because of broad interest in post-translational modifications (PTM) and their importance to protein function and regulation, we have recently started incorporating expertly curated PTM information on individual protein pages. Here we also present the inclusion of new abundance and protein half-life data obtained from high-throughput proteome studies. These new data types have been included with the aim to facilitate cellular biology research. Database URL: www.yeastgenome.org. © The Author(s) 2017. Published by Oxford University Press.
Functional genomics of root growth and development in Arabidopsis

PubMed Central

Iyer-Pascuzzi, Anjali; Simpson, June; Herrera-Estrella, Luis; Benfey, Philip N.

2009-01-01

Summary: Roots are vital for the uptake of water and nutrients, and for anchorage in the soil. They are highly plastic, able to adapt developmentally and physiologically to changing environmental conditions. Understanding the molecular mechanisms behind this growth and development requires knowledge of root transcriptomics, proteomics and metabolomics. Genomics approaches, including the recent publication of a root expression map, root proteome, and environment-specific root expression studies, are uncovering complex transcriptional and post-transcriptional networks underlying root development. The challenge is in further capitalizing on the information in these datasets to understand the fundamental principles of root growth and development. In this review, we highlight progress researchers have made toward this goal. PMID:19117793
Functional genomics of root growth and development in Arabidopsis.

PubMed

Iyer-Pascuzzi, Anjali; Simpson, June; Herrera-Estrella, Luis; Benfey, Philip N

2009-04-01

Roots are vital for the uptake of water and nutrients, and for anchorage in the soil. They are highly plastic, able to adapt developmentally and physiologically to changing environmental conditions. Understanding the molecular mechanisms behind this growth and development requires knowledge of root transcriptomics, proteomics, and metabolomics. Genomics approaches, including the recent publication of a root expression map, root proteome, and environment-specific root expression studies, are uncovering complex transcriptional and post-transcriptional networks underlying root development. The challenge is in further capitalizing on the information in these datasets to understand the fundamental principles of root growth and development. In this review, we highlight progress researchers have made toward this goal.
NCI-CPTAC DREAM Proteogenomics Challenge (Registration Now Open) | Office of Cancer Clinical Proteomics Research

Cancer.gov

Proteogenomics, integration of proteomics, genomics, and transcriptomics, is an emerging approach that promises to advance basic, translational and clinical research. By combining genomic and proteomic information, leading scientists are gaining new insights due to a more complete and unified understanding of complex biological processes.
Functional Genomics in the Study of Mind-Body Therapies

PubMed Central

Niles, Halsey; Mehta, Darshan H.; Corrigan, Alexandra A.; Bhasin, Manoj K.; Denninger, John W.

2014-01-01

Background: Mind-body therapies (MBTs) are used throughout the world in treatment, disease prevention, and health promotion. However, the mechanisms by which MBTs exert their positive effects are not well understood. Investigations into MBTs using functional genomics have revolutionized the understanding of MBT mechanisms and their effects on human physiology. Methods: We searched the literature for the effects of MBTs on functional genomics determinants using MEDLINE, supplemented by a manual search of additional journals and a reference list review. Results: We reviewed 15 trials that measured global or targeted transcriptomic, epigenomic, or proteomic changes in peripheral blood. Sample sizes ranged from small pilot studies (n=2) to large trials (n=500). While the reliability of individual genes from trial to trial was often inconsistent, genes related to inflammatory response, particularly those involved in the nuclear factor-kappa B (NF-κB) pathway, were consistently downregulated across most studies. Conclusion: In general, existing trials focusing on gene expression changes brought about by MBTs have revealed intriguing connections to the immune system through the NF-κB cascade, to telomere maintenance, and to apoptotic regulation. However, these findings are limited to a small number of trials and relatively small sample sizes. More rigorous randomized controlled trials of healthy subjects and specific disease states are warranted. Future research should investigate functional genomics areas both upstream and downstream of MBT-related gene expression changes, from epigenomics to proteomics and metabolomics. PMID:25598735
Functional genomics in the study of mind-body therapies.

PubMed

Niles, Halsey; Mehta, Darshan H; Corrigan, Alexandra A; Bhasin, Manoj K; Denninger, John W

2014-01-01

Mind-body therapies (MBTs) are used throughout the world in treatment, disease prevention, and health promotion. However, the mechanisms by which MBTs exert their positive effects are not well understood. Investigations into MBTs using functional genomics have revolutionized the understanding of MBT mechanisms and their effects on human physiology. We searched the literature for the effects of MBTs on functional genomics determinants using MEDLINE, supplemented by a manual search of additional journals and a reference list review. We reviewed 15 trials that measured global or targeted transcriptomic, epigenomic, or proteomic changes in peripheral blood. Sample sizes ranged from small pilot studies (n=2) to large trials (n=500). While the reliability of individual genes from trial to trial was often inconsistent, genes related to inflammatory response, particularly those involved in the nuclear factor-kappa B (NF-κB) pathway, were consistently downregulated across most studies. In general, existing trials focusing on gene expression changes brought about by MBTs have revealed intriguing connections to the immune system through the NF-κB cascade, to telomere maintenance, and to apoptotic regulation. However, these findings are limited to a small number of trials and relatively small sample sizes. More rigorous randomized controlled trials of healthy subjects and specific disease states are warranted. Future research should investigate functional genomics areas both upstream and downstream of MBT-related gene expression changes, from epigenomics to proteomics and metabolomics.

Systematic Analysis of Compositional Order of Proteins Reveals New Characteristics of Biological Functions and a Universal Correlate of Macroevolution

PubMed Central

Persi, Erez; Horn, David

2013-01-01

We present a novel analysis of compositional order (CO) based on the occurrence of Frequent amino-acid Triplets (FTs) that appear much more often than expected at random in protein sequences. The method captures all types of proteomic compositional order including single amino-acid runs, tandem repeats, periodic structure of motifs and otherwise low complexity amino-acid regions. We introduce new order measures, distinguishing between ‘regularity’, ‘periodicity’ and ‘vocabulary’, to quantify these phenomena and to facilitate the identification of evolutionary effects. Detailed analysis of representative species across the tree-of-life demonstrates that CO proteins exhibit numerous functional enrichments, including a wide repertoire of particular patterns of dependencies on regularity and periodicity. Comparison between human and mouse proteomes further reveals the interplay of CO with evolutionary trends, such as faster substitution rate in mouse leading to decrease of periodicity, while innovation along the human lineage leads to larger regularity. Large-scale analysis of 94 proteomes leads to systematic ordering of all major taxonomic groups according to FT-vocabulary size. This is measured by the count of Different Frequent Triplets (DFT) in proteomes. The latter provides a clear hierarchical delineation of vertebrates, invertebrates, plants, fungi and prokaryotes, with thermophiles showing the lowest level of FT-vocabulary. Among eukaryotes, this ordering correlates with phylogenetic proximity. Interestingly, in all kingdoms CO accumulation in the proteome has universal characteristics. We suggest that CO is a genomic-information correlate of both macroevolution and various protein functions. The results indicate a mechanism of genomic ‘innovation’ at the peptide level, involved in protein elongation, shaped in a universal manner by mutational and selective forces. PMID:24278003
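The triplet-counting idea in the record above can be illustrated with a minimal sketch. It is not the authors' published procedure: the null model (independent residues with per-sequence frequencies), the `min_ratio` threshold, the minimum count of two, and all function names are assumptions chosen for illustration.

```python
from collections import Counter


def triplet_counts(seq):
    """Count overlapping amino-acid triplets in one protein sequence."""
    return Counter(seq[i:i + 3] for i in range(len(seq) - 2))


def frequent_triplets(seq, min_ratio=3.0):
    """Triplets seen at least min_ratio times more often than expected.

    The expectation assumes independent residues drawn with the
    single-residue frequencies of this sequence (an assumed null model).
    """
    n = len(seq) - 2  # number of overlapping triplet positions
    if n <= 0:
        return {}
    aa_freq = {a: c / len(seq) for a, c in Counter(seq).items()}
    result = {}
    for tri, obs in triplet_counts(seq).items():
        expected = n * aa_freq[tri[0]] * aa_freq[tri[1]] * aa_freq[tri[2]]
        if obs >= 2 and obs / expected >= min_ratio:
            result[tri] = obs
    return result


def dft_vocabulary(proteome, min_ratio=3.0):
    """Count of different frequent triplets over all sequences of a proteome."""
    vocab = set()
    for seq in proteome:
        vocab.update(frequent_triplets(seq, min_ratio))
    return len(vocab)


# Hypothetical sequence: a glutamine run followed by one copy of each of
# the other 19 standard residues.
toy = "QQQQQQQQQQ" + "ACDEFGHIKLMNPRSTVWY"
print(frequent_triplets(toy))
print(dft_vocabulary([toy, "ACDEFGHIKL"]))
```

On the toy input only the single amino-acid run is flagged, which corresponds to one of the compositional-order types the abstract lists; tandem repeats and periodic motifs would need longer, more realistic sequences to show up.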
Using the underlying biological organization of the Mycobacterium tuberculosis functional network for protein function prediction.

PubMed

Mazandu, Gaston K; Mulder, Nicola J

2012-07-01

Despite ever-increasing amounts of sequence and functional genomics data, there is still a deficiency of functional annotation for many newly sequenced proteins. For Mycobacterium tuberculosis (MTB), more than half of its genome is still uncharacterized, which hampers the search for new drug targets within the bacterial pathogen and limits our understanding of its pathogenicity. As for many other genomes, the annotations of proteins in the MTB proteome were generally inferred from sequence homology, which is effective but has limited applicability. We have carried out large-scale biological data integration to produce an MTB protein functional interaction network. Protein functional relationships were extracted from the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database, and additional functional interactions from microarray, sequence and protein signature data. The confidence level of protein relationships in the additional functional interaction data was evaluated using a dynamic data-driven scoring system. This functional network has been used to predict functions of uncharacterized proteins using Gene Ontology (GO) terms, and the semantic similarity between these terms measured using a state-of-the-art GO similarity metric. To achieve better trade-off between improvement of quality, genomic coverage and scalability, this prediction is done by observing the key principles driving the biological organization of the functional network. This study yields a new functionally characterized MTB strain CDC1551 proteome, consisting of 3804 and 3698 proteins out of 4195 with annotations in terms of the biological process and molecular function ontologies, respectively. These data can contribute to research into the development of effective anti-tubercular drugs with novel biological mechanisms of action. Copyright © 2011 Elsevier B.V. All rights reserved.
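The annotation counts reported in the record above translate into coverage percentages as follows. The figures 3804, 3698 and 4195 come from the abstract; the helper name `coverage` is a hypothetical convenience, and the percentages are derived here rather than quoted from the paper.

```python
def coverage(annotated, total):
    """Percentage of the proteome that carries an annotation."""
    return 100.0 * annotated / total


TOTAL_PROTEINS = 4195  # MTB strain CDC1551 proteome size given in the abstract

bp_coverage = coverage(3804, TOTAL_PROTEINS)  # biological process ontology
mf_coverage = coverage(3698, TOTAL_PROTEINS)  # molecular function ontology

print(f"biological process: {bp_coverage:.1f}%")
print(f"molecular function: {mf_coverage:.1f}%")
```

Both ontologies therefore cover roughly nine in ten proteins of the strain, compared with the starting point of more than half of the genome being uncharacterized.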
Thermosensitivity of growth is determined by chaperone-mediated proteome reallocation

PubMed Central

Chen, Ke; Gao, Ye; Mih, Nathan; O’Brien, Edward J.; Yang, Laurence; Palsson, Bernhard O.

2017-01-01

Maintenance of a properly folded proteome is critical for bacterial survival at notably different growth temperatures. Understanding the molecular basis of thermoadaptation has progressed in two main directions, the sequence and structural basis of protein thermostability and the mechanistic principles of protein quality control assisted by chaperones. Yet we do not fully understand how structural integrity of the entire proteome is maintained under stress and how it affects cellular fitness. To address this challenge, we reconstruct a genome-scale protein-folding network for Escherichia coli and formulate a computational model, FoldME, that provides statistical descriptions of multiscale cellular response consistent with many datasets. FoldME simulations show (i) that the chaperones act as a system when they respond to unfolding stress rather than achieving efficient folding of any single component of the proteome, (ii) how the proteome is globally balanced between chaperones for folding and the complex machinery synthesizing the proteins in response to perturbation, (iii) how this balancing determines growth rate dependence on temperature and is achieved through nonspecific regulation, and (iv) how thermal instability of the individual protein affects the overall functional state of the proteome. Overall, these results expand our view of cellular regulation, from targeted specific control mechanisms to global regulation through a web of nonspecific competing interactions that modulate the optimal reallocation of cellular resources. The methodology developed in this study enables genome-scale integration of environment-dependent protein properties and a proteome-wide study of cellular stress responses. PMID:29073085
The Escherichia coli Proteome: Past, Present, and Future Prospects†

PubMed Central

Han, Mee-Jung; Lee, Sang Yup

2006-01-01

Proteomics has emerged as an indispensable methodology for large-scale protein analysis in functional genomics. The Escherichia coli proteome has been extensively studied and is well defined in terms of biochemical, biological, and biotechnological data. Even before the entire E. coli proteome was fully elucidated, the largest available data set had been integrated to decipher regulatory circuits and metabolic pathways, providing valuable insights into global cellular physiology and the development of metabolic and cellular engineering strategies. With the recent advent of advanced proteomic technologies, the E. coli proteome has been used for the validation of new technologies and methodologies such as sample prefractionation, protein enrichment, two-dimensional gel electrophoresis, protein detection, mass spectrometry (MS), combinatorial assays with n-dimensional chromatographies and MS, and image analysis software. These important technologies will not only provide a great amount of additional information on the E. coli proteome but also synergistically contribute to other proteomic studies. Here, we review the past development and current status of E. coli proteome research in terms of its biological, biotechnological, and methodological significance and suggest future prospects. PMID:16760308
By their genes ye shall know them: genomic signatures of predatory bacteria

PubMed Central

Pasternak, Zohar; Pietrokovski, Shmuel; Rotem, Or; Gophna, Uri; Lurie-Weinberger, Mor N; Jurkevitch, Edouard

2013-01-01

Predatory bacteria are taxonomically disparate, exhibit diverse predatory strategies and are widely distributed in varied environments. To date, their predatory phenotypes cannot be discerned in genome sequence data thereby limiting our understanding of bacterial predation, and of its impact in nature. Here, we define the ‘predatome,’ that is, sets of protein families that reflect the phenotypes of predatory bacteria. The proteomes of all 11 sequenced predatory bacteria, including two de novo sequenced genomes, and 19 non-predatory bacteria from across the phylogenetic and ecological landscapes were compared. Protein families discriminating between the two groups were identified and quantified, demonstrating that differences in the proteomes of predatory and non-predatory bacteria are large and significant. This analysis allows predictions to be made, as we show by confirming from genome data an overlooked bacterial predator. The predatome exhibits deficiencies in riboflavin and amino acids biosynthesis, suggesting that predators obtain them from their prey. In contrast, these genomes are highly enriched in adhesins, proteases and particular metabolic proteins, used for binding to, processing and consuming prey, respectively. Strikingly, predators and non-predators differ in isoprenoid biosynthesis: predators use the mevalonate pathway, whereas non-predators, like almost all bacteria, use the DOXP pathway. By defining predatory signatures in bacterial genomes, the predatory potential they encode can be uncovered, filling an essential gap for measuring bacterial predation in nature. Moreover, we suggest that full-genome proteomic comparisons are applicable to other ecological interactions between microbes, and provide a convenient and rational tool for the functional classification of bacteria. PMID:23190728
Functional proteomics within the genus Lactobacillus.

PubMed

De Angelis, Maria; Calasso, Maria; Cavallo, Noemi; Di Cagno, Raffaella; Gobbetti, Marco

2016-03-01

Lactobacilli are mainly used for the manufacture of fermented dairy, sourdough, meat, and vegetable foods or used as probiotics. Under optimal processing conditions, Lactobacillus strains contribute to food functionality through their enzyme portfolio and the release of metabolites. An extensive genomic diversity analysis was conducted to elucidate the core features of the genus Lactobacillus, and to provide a better comprehension of niche adaptation of the strains. However, proteomics is an indispensable "omics" science to elucidate the proteome diversity, and the mechanisms of regulation and adaptation of Lactobacillus strains. This review focuses on the novel and comprehensive knowledge of functional proteomics and metaproteomics of Lactobacillus species. A large list of proteomic case studies of different Lactobacillus species is provided to illustrate the adaptability of the main metabolic pathways (e.g., carbohydrate transport and metabolism, pyruvate metabolism, proteolytic system, amino acid metabolism, and protein synthesis) to various life conditions. These investigations have highlighted that lactobacilli modulate the level of a complex panel of proteins to grow/survive in different ecological niches. In addition to the general regulation and stress response, specific metabolic pathways can be switched on and off, modifying the behavior of the strains. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
MitProNet: A Knowledgebase and Analysis Platform of Proteome, Interactome and Diseases for Mammalian Mitochondria

PubMed Central

Mao, Song; Chai, Xiaoqiang; Hu, Yuling; Hou, Xugang; Tang, Yiheng; Bi, Cheng; Li, Xiao

2014-01-01

The mitochondrion plays a central role in diverse biological processes in most eukaryotes, and its dysfunctions are critically involved in a large number of diseases and the aging process. A systematic identification of mitochondrial proteomes and characterization of functional linkages among mitochondrial proteins are fundamental in understanding the mechanisms underlying biological functions and human diseases associated with mitochondria. Here we present a database MitProNet which provides a comprehensive knowledgebase for mitochondrial proteome, interactome and human diseases. First an inventory of mammalian mitochondrial proteins was compiled by widely collecting proteomic datasets, and the proteins were classified by machine learning to achieve a high-confidence list of mitochondrial proteins. The current version of MitProNet covers 1124 high-confidence proteins, and the remainder were further classified as middle- or low-confidence. An organelle-specific network of functional linkages among mitochondrial proteins was then generated by integrating genomic features encoded by a wide range of datasets including genomic context, gene expression profiles, protein-protein interactions, functional similarity and metabolic pathways. The functional-linkage network should be a valuable resource for the study of biological functions of mitochondrial proteins and human mitochondrial diseases. Furthermore, we utilized the network to predict candidate genes for mitochondrial diseases using prioritization algorithms. All proteins, functional linkages and disease candidate genes in MitProNet were annotated according to the information collected from their original sources including GO, GEO, OMIM, KEGG, MIPS, HPRD and so on. MitProNet features a user-friendly graphic visualization interface to present functional analysis of linkage networks. As an up-to-date database and analysis platform, MitProNet should be particularly helpful in comprehensive studies of complicated biological mechanisms underlying mitochondrial functions and human mitochondrial diseases. MitProNet is freely accessible at http://bio.scu.edu.cn:8085/MitProNet. PMID:25347823
Label-free proteomic analysis to confirm the predicted proteome of Corynebacterium pseudotuberculosis under nitrosative stress mediated by nitric oxide.

PubMed

Silva, Wanderson M; Carvalho, Rodrigo D; Soares, Siomar C; Bastos, Isabela Fs; Folador, Edson L; Souza, Gustavo Hmf; Le Loir, Yves; Miyoshi, Anderson; Silva, Artur; Azevedo, Vasco

2014-12-04

Corynebacterium pseudotuberculosis biovar ovis is a facultative intracellular pathogen, and the etiological agent of caseous lymphadenitis in small ruminants. During the infection process, the bacterium is subjected to several stress conditions, including nitrosative stress, which is caused by nitric oxide (NO). In silico analysis of the genome of C. pseudotuberculosis ovis 1002 predicted several genes that could influence the resistance of this pathogen to nitrosative stress. Here, we applied high-throughput proteomics using high definition mass spectrometry to characterize the functional genome of C. pseudotuberculosis ovis 1002 in the presence of NO-donor Diethylenetriamine/nitric oxide adduct (DETA/NO), with the aim of identifying proteins involved in nitrosative stress resistance. We characterized 835 proteins, representing approximately 41% of the predicted proteome of C. pseudotuberculosis ovis 1002, following exposure to nitrosative stress. In total, 102 proteins were exclusive to the proteome of DETA/NO-induced cells, and a further 58 proteins were differentially regulated between the DETA/NO and control conditions. An interactomic analysis of the differential proteome of C. pseudotuberculosis in response to nitrosative stress was also performed. Our proteomic data set suggested the activation of both a general stress response and a specific nitrosative stress response, as well as changes in proteins involved in cellular metabolism, detoxification, transcriptional regulation, and DNA synthesis and repair. Our proteomic analysis validated previously-determined in silico data for C. pseudotuberculosis ovis 1002. In addition, proteomic screening performed in the presence of NO enabled the identification of a set of factors that can influence the resistance and survival of C. pseudotuberculosis during exposure to nitrosative stress.
PeroxisomeDB: a database for the peroxisomal proteome, functional genomics and disease

PubMed Central

Schlüter, Agatha; Fourcade, Stéphane; Domènech-Estévez, Enric; Gabaldón, Toni; Huerta-Cepas, Jaime; Berthommier, Guillaume; Ripp, Raymond; Wanders, Ronald J. A.; Poch, Olivier; Pujol, Aurora

2007-01-01

Peroxisomes are essential organelles of eukaryotic origin, ubiquitously distributed in cells and organisms, playing key roles in lipid and antioxidant metabolism. Loss or malfunction of peroxisomes causes more than 20 fatal inherited conditions. We have created a peroxisomal database that includes the complete peroxisomal proteome of Homo sapiens and Saccharomyces cerevisiae, by gathering, updating and integrating the available genetic and functional information on peroxisomal genes. PeroxisomeDB is structured in interrelated sections ‘Genes’, ‘Functions’, ‘Metabolic pathways’ and ‘Diseases’, that include hyperlinks to selected features of NCBI, ENSEMBL and UCSC databases. We have designed graphical depictions of the main peroxisomal metabolic routes and have included updated flow charts for diagnosis. Precomputed BLAST, PSI-BLAST, multiple sequence alignment (MUSCLE) and phylogenetic trees are provided to assist in direct multispecies comparison to study evolutionary conserved functions and pathways. Highlights of the PeroxisomeDB include new tools developed for facilitating (i) identification of novel peroxisomal proteins, by means of identifying proteins carrying peroxisome targeting signal (PTS) motifs, (ii) detection of peroxisomes in silico, particularly useful for screening the deluge of newly sequenced genomes. PeroxisomeDB should contribute to the systematic characterization of the peroxisomal proteome and facilitate system biology approaches on the organelle. PMID:17135190
Covering complete proteomes with X-ray structures: A current snapshot

DOE PAGES

Mizianty, Marcin J.; Fan, Xiao; Yan, Jing; ...

2014-10-23

Structural genomics programs have developed and applied structure-determination pipelines to a wide range of protein targets, facilitating the visualization of macromolecular interactions and the understanding of their molecular and biochemical functions. The fundamental question of whether three-dimensional structures of all proteins and all functional annotations can be determined using X-ray crystallography is investigated. A first-of-its-kind large-scale analysis of crystallization propensity for all proteins encoded in 1953 fully sequenced genomes was performed. It is shown that current X-ray crystallographic knowhow combined with homology modeling can provide structures for 25% of modeling families (protein clusters for which structural models can be obtained through homology modeling), with at least one structural model produced for each Gene Ontology functional annotation. The coverage varies between superkingdoms, with 19% for eukaryotes, 35% for bacteria and 49% for archaea, and with those of viruses following the coverage values of their hosts. It is shown that the crystallization propensities of proteomes from the taxonomic superkingdoms are distinct. The use of knowledge-based target selection is shown to substantially increase the ability to produce X-ray structures. It is demonstrated that the human proteome has one of the highest attainable coverage values among eukaryotes, and GPCR membrane proteins suitable for X-ray structure determination were determined.
The cell envelope proteome of Aggregatibacter actinomycetemcomitans

PubMed Central

Smith, Kenneth P.; Fields, Julia G.; Voogt, Richard D.; Deng, Bin; Lam, Ying-Wai; Mintz, Keith P.

2014-01-01

The cell envelope of Gram-negative bacteria serves a critical role in maintenance of cellular homeostasis, resistance to external stress, and host-pathogen interactions. Envelope protein composition is influenced by the physiological and environmental demands placed on the bacterium. In this study, we report a comprehensive compilation of cell envelope proteins from the periodontal and systemic pathogen Aggregatibacter actinomycetemcomitans VT1169, an afimbriated serotype b strain. The urea-extracted membrane proteins were identified by mass spectrometry-based shotgun proteomics. The membrane proteome, isolated from actively growing bacteria under normal laboratory conditions, included 648 proteins representing 28% of the predicted ORFs in the genome. Bioinformatic analyses were used to annotate and predict the cellular location and function of the proteins. Surface adhesins, porins, lipoproteins, numerous influx and efflux pumps, multiple sugar, amino acid and iron transporters, and components of the type I, II and V secretion systems were identified. Periplasmic space and cytoplasmic proteins with chaperone function were also identified. 107 proteins with unknown function were associated with the cell envelope. Orthologs of a subset of these uncharacterized proteins are present in other bacterial genomes, while others are found exclusively in A. actinomycetemcomitans. This knowledge will contribute to elucidating the role of cell envelope proteins in bacterial growth and survival in the oral cavity. PMID:25055881
Genome-wide identification of the subcellular localization of the Escherichia coli B proteome using experimental and computational methods.

PubMed

Han, Mee-Jung; Yun, Hongseok; Lee, Jeong Wook; Lee, Yu Hyun; Lee, Sang Yup; Yoo, Jong-Shin; Kim, Jin Young; Kim, Jihyun F; Hur, Cheol-Goo

2011-04-01

Escherichia coli K-12 and B strains have most widely been employed for scientific studies as well as industrial applications. Recently, the complete genome sequences of two representative descendants of E. coli B strains, REL606 and BL21(DE3), have been determined. Here, we report the subproteome reference maps of E. coli B REL606 by analyzing cytoplasmic, periplasmic, inner and outer membrane, and extracellular proteomes based on the genome information using experimental and computational approaches. Among the total of 3487 spots, 651 proteins including 410 non-redundant proteins were identified and characterized by 2-DE and LC-MS/MS; they include 440 cytoplasmic, 45 periplasmic, 50 inner membrane, 61 outer membrane, and 55 extracellular proteins. In addition, subcellular localizations of all 4205 ORFs of E. coli B were predicted by combined computational prediction methods. The subcellular localizations of 1812 (43.09%) proteins of currently unknown function were newly assigned. The results of computational prediction were also compared with the experimental results, showing that overall precision and recall were 92.16 and 92.16%, respectively. This work represents the most comprehensive analyses of the subproteomes of E. coli B, and will be useful as a reference for proteome profiling studies under various conditions. The complete proteome data are available online (http://ecolib.kaist.ac.kr). Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
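The comparison of computational predictions with experimental localizations reported above can be illustrated with a minimal sketch. The abstract does not say how precision and recall were defined, so the definitions below (micro-averaged, one localization per protein) are an assumption, and the protein identifiers and localizations are hypothetical toy data, not values from the study.

```python
from typing import Dict, Optional, Tuple


def precision_recall(
    predicted: Dict[str, Optional[str]],
    observed: Dict[str, str],
) -> Tuple[float, float]:
    """Micro-averaged precision and recall for single-label localization calls.

    Assumed definitions (not taken from the paper):
      precision = correct predictions / predictions that can be checked
      recall    = correct predictions / experimentally localized proteins
    """
    # Proteins that received a prediction (None means "no call").
    called = {p for p, loc in predicted.items() if loc is not None}
    # Predictions that can be checked against an experimental localization.
    checked = [p for p in called if p in observed]
    correct = sum(1 for p in checked if predicted[p] == observed[p])
    precision = correct / len(checked) if checked else 0.0
    recall = correct / len(observed) if observed else 0.0
    return precision, recall


# Hypothetical toy data: four proteins, one left without a prediction.
predicted = {
    "p1": "cytoplasm",
    "p2": "periplasm",
    "p3": "outer membrane",
    "p4": None,
}
observed = {
    "p1": "cytoplasm",
    "p2": "inner membrane",
    "p3": "outer membrane",
    "p4": "extracellular",
}

precision, recall = precision_recall(predicted, observed)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Under these assumed definitions the two values coincide whenever every experimentally localized protein also receives a prediction, which would be consistent with the identical figures quoted in the abstract, although the study's own definitions may differ.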
Proteomics in medical microbiology.

PubMed

Cash, P

2000-04-01

The techniques of proteomics (high resolution two-dimensional electrophoresis and protein characterisation) are widely used for microbiological research to analyse global protein synthesis as an indicator of gene expression. The rapid progress in microbial proteomics has been achieved through the wide availability of whole genome sequences for a number of bacterial groups. Beyond providing a basic understanding of microbial gene expression, proteomics has also played a role in medical areas of microbiology. Progress has been made in the use of the techniques for investigating the epidemiology and taxonomy of human microbial pathogens, the identification of novel pathogenic mechanisms and the analysis of drug resistance. In each of these areas, proteomics has provided new insights that complement genomic-based investigations. This review describes the current progress in these research fields and highlights some of the technical challenges existing for the application of proteomics in medical microbiology. The latter concern the analysis of genetically heterogeneous bacterial populations and the integration of the proteomic and genomic data for these bacteria. The characterisation of the proteomes of bacterial pathogens growing in their natural hosts remains a future challenge.
New Funding Opportunity - Illuminating the Druggable Genome | Office of Cancer Clinical Proteomics Research

Cancer.gov

The National Institutes of Health Common Fund announces two new Funding Opportunity Announcements with a focus on the Illuminating the Druggable Genome (IDG). These funding opportunities are designed to foster the development of technologies and information management to facilitate the unveiling of the functions of the poorly characterized and/or un-annotated members in four protein classes of the Druggable Genome. The IDG project is predicated on the need to fully explore the underlying biology and role in disease of genes linked to already drugged genes within the Druggable Genome.
An insight into cyanobacterial genomics--a perspective.

PubMed

Lakshmi, Palaniswamy Thanga Velan

2007-05-20

At the turn of the millennium, cyanobacteria deserve a review of their past, present and future. The advent of post-genomic research, which encompasses functional genomics, structural genomics, transcriptomics, pharmacogenomics, proteomics and metabolomics, allows a systematic, broad approach to the study of biological systems. By exploiting genomic and associated protein information through computational analyses, the fledgling information generated by biotechnological analyses could be extrapolated to fill the gaps in the scarce information on cyanobacteria. To that end, this paper attempts to highlight the perspectives available and to encourage researchers to concentrate on the field of cyanobacterial informatics.
Biochemical and genetic analysis of the yeast proteome with a movable ORF collection

PubMed Central

Gelperin, Daniel M.; White, Michael A.; Wilkinson, Martha L.; Kon, Yoshiko; Kung, Li A.; Wise, Kevin J.; Lopez-Hoyo, Nelson; Jiang, Lixia; Piccirillo, Stacy; Yu, Haiyuan; Gerstein, Mark; Dumont, Mark E.; Phizicky, Eric M.; Snyder, Michael; Grayhack, Elizabeth J.

2005-01-01

Functional analysis of the proteome is an essential part of genomic research. To facilitate different proteomic approaches, a MORF (moveable ORF) library of 5854 yeast expression plasmids was constructed, each expressing a sequence-verified ORF as a C-terminal ORF fusion protein, under regulated control. Analysis of 5573 MORFs demonstrates that nearly all verified ORFs are expressed, suggests the authenticity of 48 ORFs characterized as dubious, and implicates specific processes including cytoskeletal organization and transcriptional control in growth inhibition caused by overexpression. Global analysis of glycosylated proteins identifies 109 new confirmed N-linked and 345 candidate glycoproteins, nearly doubling the known yeast glycome. PMID:16322557
A phylogenomic data-driven exploration of viral origins and evolution

PubMed Central

Nasir, Arshan; Caetano-Anollés, Gustavo

2015-01-01

The origin of viruses remains mysterious because of their diverse and patchy molecular and functional makeup. Although numerous hypotheses have attempted to explain viral origins, none is backed by substantive data. We take full advantage of the wealth of available protein structural and functional data to explore the evolution of the proteomic makeup of thousands of cells and viruses. Despite the extremely reduced nature of viral proteomes, we established an ancient origin of the “viral supergroup” and the existence of widespread episodes of horizontal transfer of genetic information. Viruses harboring different replicon types and infecting distantly related hosts shared many metabolic and informational protein structural domains of ancient origin that were also widespread in cellular proteomes. Phylogenomic analysis uncovered a universal tree of life and revealed that modern viruses reduced from multiple ancient cells that harbored segmented RNA genomes and coexisted with the ancestors of modern cells. The model for the origin and evolution of viruses and cells is backed by strong genomic and structural evidence and can be reconciled with existing models of viral evolution if one considers viruses to have originated from ancient cells and not from modern counterparts. PMID:26601271
Trans-Proteomic Pipeline, a standardized data processing pipeline for large-scale reproducible proteomics informatics

PubMed Central

Deutsch, Eric W.; Mendoza, Luis; Shteynberg, David; Slagel, Joseph; Sun, Zhi; Moritz, Robert L.

2015-01-01

Democratization of genomics technologies has enabled the rapid determination of genotypes. More recently the democratization of comprehensive proteomics technologies is enabling the determination of the cellular phenotype and the molecular events that define its dynamic state. Core proteomic technologies include mass spectrometry to define protein sequence, protein:protein interactions, and protein post-translational modifications. Key enabling technologies for proteomics are bioinformatic pipelines to identify, quantitate, and summarize these events. The Trans-Proteomic Pipeline (TPP) is a robust open-source standardized data processing pipeline for large-scale reproducible quantitative mass spectrometry proteomics. It supports all major operating systems and instrument vendors via open data formats. Here we provide a review of the overall proteomics workflow supported by the TPP, its major tools, and how it can be used in its various modes from desktop to cloud computing. We describe new features for the TPP, including data visualization functionality. We conclude by describing some common perils that affect the analysis of tandem mass spectrometry datasets, as well as some major upcoming features. PMID:25631240
Trans-Proteomic Pipeline, a standardized data processing pipeline for large-scale reproducible proteomics informatics.

PubMed

Deutsch, Eric W; Mendoza, Luis; Shteynberg, David; Slagel, Joseph; Sun, Zhi; Moritz, Robert L

2015-08-01

Democratization of genomics technologies has enabled the rapid determination of genotypes. More recently the democratization of comprehensive proteomics technologies is enabling the determination of the cellular phenotype and the molecular events that define its dynamic state. Core proteomic technologies include MS to define protein sequence, protein:protein interactions, and protein PTMs. Key enabling technologies for proteomics are bioinformatic pipelines to identify, quantitate, and summarize these events. The Trans-Proteomic Pipeline (TPP) is a robust open-source standardized data processing pipeline for large-scale reproducible quantitative MS proteomics. It supports all major operating systems and instrument vendors via open data formats. Here, we provide a review of the overall proteomics workflow supported by the TPP, its major tools, and how it can be used in its various modes from desktop to cloud computing. We describe new features for the TPP, including data visualization functionality. We conclude by describing some common perils that affect the analysis of MS/MS datasets, as well as some major upcoming features. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Characterisation of the Manduca sexta sperm proteome: Genetic novelty underlying sperm composition in Lepidoptera.

PubMed

Whittington, Emma; Zhao, Qian; Borziak, Kirill; Walters, James R; Dorus, Steve

2015-07-01

The application of mass spectrometry based proteomics to sperm biology has greatly accelerated progress in understanding the molecular composition and function of spermatozoa. To date, these approaches have been largely restricted to model organisms, all of which produce a single sperm morph capable of oocyte fertilisation. Here we apply high-throughput mass spectrometry proteomic analysis to characterise sperm composition in Manduca sexta, the tobacco hornworm moth, which produces heteromorphic sperm, including one fertilisation competent (eupyrene) and one incompetent (apyrene) sperm type. This resulted in the high confidence identification of 896 proteins from a co-mixed sample of both sperm types, of which 167 are encoded by genes with strict one-to-one orthology in Drosophila melanogaster. Importantly, over half (55.1%) of these orthologous proteins have previously been identified in the D. melanogaster sperm proteome and exhibit significant conservation in quantitative protein abundance in sperm between the two species. Despite the complex nature of gene expression across spermatogenic stages, a significant correlation was also observed between sperm protein abundance and testis gene expression. Lepidopteran-specific sperm proteins (e.g., proteins with no homology to proteins in non-Lepidopteran taxa) were present in significantly greater abundance on average than those with homology outside the Lepidoptera. Given the disproportionate production of apyrene sperm (96% of all mature sperm in Manduca) relative to eupyrene sperm, these evolutionarily novel and highly abundant proteins are candidates for possessing apyrene-specific functions. Lastly, comparative genomic analyses of testis-expressed, ovary-expressed and sperm genes identified a concentration of novel sperm proteins shared amongst Lepidoptera of potential relevance to the evolutionary origin of heteromorphic spermatogenesis. As the first published Lepidopteran sperm proteome, this whole-cell proteomic characterisation will facilitate future evolutionary genetic and developmental studies of heteromorphic sperm production and parasperm function. Furthermore, the analyses presented here provide useful annotation information regarding sex-biased gene expression, novel Lepidopteran genes and gene function in the male gamete to complement the newly sequenced and annotated Manduca genome. Copyright © 2015 Elsevier Ltd. All rights reserved.
Quantitative trait loci mapping of the mouse plasma proteome (pQTL).

PubMed

Holdt, Lesca M; von Delft, Annette; Nicolaou, Alexandros; Baumann, Sven; Kostrzewa, Markus; Thiery, Joachim; Teupser, Daniel

2013-02-01

A current challenge in the era of genome-wide studies is to determine the responsible genes and mechanisms underlying newly identified loci. Screening of the plasma proteome by high-throughput mass spectrometry (MALDI-TOF MS) is considered a promising approach for identification of metabolic and disease processes. Therefore, plasma proteome screening might be particularly useful for identifying responsible genes when combined with analysis of variation in the genome. Here, we describe a proteomic quantitative trait locus (pQTL) study of plasma proteome screens in an F(2) intercross of 455 mice mapped with 177 genetic markers across the genome. A total of 69 of 176 peptides revealed significant LOD scores (≥5.35) demonstrating strong genetic regulation of distinct components of the plasma proteome. Analyses were confirmed by mechanistic studies and MALDI-TOF/TOF, liquid chromatography-tandem mass spectrometry (LC-MS/MS) analyses of the two strongest pQTLs: A pQTL for mass-to-charge ratio (m/z) 3494 (LOD 24.9, D11Mit151) was identified as the N-terminal 35 amino acids of hemoglobin subunit A (Hba) and caused by genetic variation in Hba. Another pQTL for m/z 8713 (LOD 36.4; D1Mit111) was caused by variation in apolipoprotein A2 (Apoa2) and cosegregated with HDL cholesterol. Taken together, we show that genome-wide plasma proteome profiling in combination with genome-wide genetic screening aids in the identification of causal genetic variants affecting abundance of plasma proteins.
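The LOD threshold quoted above (≥5.35) can be made concrete with a small sketch of how a LOD score is obtained for one marker and one peptide intensity. This assumes plain single-marker regression with normally distributed residuals, for which LOD = (n/2)·log10(RSS_null / RSS_marker); the abstract does not describe the mapping method actually used, and the genotypes and intensities below are hypothetical toy values.

```python
import math
from collections import defaultdict
from typing import Sequence


def marker_lod(genotypes: Sequence[str], phenotypes: Sequence[float]) -> float:
    """LOD score for one marker under single-marker regression (an assumption).

    Compares a null model (one overall mean) with a model that fits a
    separate mean per genotype class, e.g. AA / AB / BB in an F2 intercross.
    """
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    # Residual sum of squares when genotype is ignored.
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)

    # Group phenotype values by genotype class.
    groups = defaultdict(list)
    for g, y in zip(genotypes, phenotypes):
        groups[g].append(y)

    # Residual sum of squares around each genotype-class mean.
    rss_marker = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        rss_marker += sum((y - mean) ** 2 for y in values)

    return (n / 2) * math.log10(rss_null / rss_marker)


# Hypothetical toy data: six F2 animals, intensity rising with each B allele.
genotypes = ["AA", "AA", "AB", "AB", "BB", "BB"]
intensities = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]

lod = marker_lod(genotypes, intensities)
print(f"LOD = {lod:.2f}, significant at 5.35: {lod >= 5.35}")
```

In a full scan this calculation would be repeated for each of the 177 markers and each of the 176 peptides, with the significance threshold typically derived by permutation; that step is omitted here.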
Quantitative Trait Loci Mapping of the Mouse Plasma Proteome (pQTL)

PubMed Central

Holdt, Lesca M.; von Delft, Annette; Nicolaou, Alexandros; Baumann, Sven; Kostrzewa, Markus; Thiery, Joachim; Teupser, Daniel

2013-01-01

A current challenge in the era of genome-wide studies is to determine the responsible genes and mechanisms underlying newly identified loci. Screening of the plasma proteome by high-throughput mass spectrometry (MALDI-TOF MS) is considered a promising approach for identification of metabolic and disease processes. Therefore, plasma proteome screening might be particularly useful for identifying responsible genes when combined with analysis of variation in the genome. Here, we describe a proteomic quantitative trait locus (pQTL) study of plasma proteome screens in an F2 intercross of 455 mice mapped with 177 genetic markers across the genome. A total of 69 of 176 peptides revealed significant LOD scores (≥5.35) demonstrating strong genetic regulation of distinct components of the plasma proteome. Analyses were confirmed by mechanistic studies and MALDI-TOF/TOF, liquid chromatography-tandem mass spectrometry (LC-MS/MS) analyses of the two strongest pQTLs: A pQTL for mass-to-charge ratio (m/z) 3494 (LOD 24.9, D11Mit151) was identified as the N-terminal 35 amino acids of hemoglobin subunit A (Hba) and caused by genetic variation in Hba. Another pQTL for m/z 8713 (LOD 36.4; D1Mit111) was caused by variation in apolipoprotein A2 (Apoa2) and cosegregated with HDL cholesterol. Taken together, we show that genome-wide plasma proteome profiling in combination with genome-wide genetic screening aids in the identification of causal genetic variants affecting abundance of plasma proteins. PMID:23172855
Objectives | Office of Cancer Clinical Proteomics Research

Cancer.gov

The overall objective of CPTAC is to systematically identify proteins that derive from alterations in cancer genomes and related biological processes, in order to understand the molecular basis of cancer that is not fully elucidated or not possible through genomics and to accelerate the translation of molecular findings into the clinic. This is to be achieved through enhancing our understanding of cancer genome biology by adding a complementary functional layer of protein biology (a “proteogenome” approach) that refines/prioritizes driver genes, enhances understanding of pathogenesis
Proteogenomics produces comprehensive and highly accurate protein-coding gene annotation in a complete genome assembly of Malassezia sympodialis

PubMed Central

Tellgren-Roth, Christian; Baudo, Charles D.; Kennell, John C.; Sun, Sheng; Billmyre, R. Blake; Schröder, Markus S.; Andersson, Anna; Holm, Tina; Sigurgeirsson, Benjamin; Wu, Guangxi; Sankaranarayanan, Sundar Ram; Siddharthan, Rahul; Sanyal, Kaustuv; Lundeberg, Joakim; Nystedt, Björn; Boekhout, Teun; Dawson, Thomas L.; Heitman, Joseph

2017-01-01

Complete and accurate genome assembly and annotation is a crucial foundation for comparative and functional genomics. Despite this, few complete eukaryotic genomes are available, and genome annotation remains a major challenge. Here, we present a complete genome assembly of the skin commensal yeast Malassezia sympodialis and demonstrate how proteogenomics can substantially improve gene annotation. Through long-read DNA sequencing, we obtained a gap-free genome assembly for M. sympodialis (ATCC 42132), comprising eight nuclear and one mitochondrial chromosome. We also sequenced and assembled four M. sympodialis clinical isolates, and showed their value for understanding Malassezia reproduction by confirming four alternative allele combinations at the two mating-type loci. Importantly, we demonstrated how proteomics data could be readily integrated with transcriptomics data in standard annotation tools. This increased the number of annotated protein-coding genes by 14% (from 3612 to 4113), compared to using transcriptomics evidence alone. Manual curation further increased the number of protein-coding genes by 9% (to 4493). All of these genes have RNA-seq evidence and 87% were confirmed by proteomics. The M. sympodialis genome assembly and annotation presented here is at a quality yet achieved only for a few eukaryotic organisms, and constitutes an important reference for future host-microbe interaction studies. PMID:28100699
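The percentage gains quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the gene counts stated above; the variable and function names are illustrative.

```python
def percent_increase(before: int, after: int) -> float:
    """Relative increase from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


# Gene counts as reported in the abstract.
transcriptomics_only = 3612   # annotation from transcriptomics evidence alone
with_proteomics = 4113        # after integrating proteomics data
after_curation = 4493         # after manual curation

step1 = percent_increase(transcriptomics_only, with_proteomics)
step2 = percent_increase(with_proteomics, after_curation)

print(f"proteomics integration: +{step1:.1f}%")  # rounds to the reported 14%
print(f"manual curation:        +{step2:.1f}%")  # rounds to the reported 9%
```

Note that each percentage is taken relative to the preceding count, so the second gain is measured against 4113 genes rather than the original 3612.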
Proteome Characterization of Leaves in Common Bean

PubMed Central

Robison, Faith M.; Heuberger, Adam L.; Brick, Mark A.; Prenni, Jessica E.

2015-01-01

Dry edible bean (Phaseolus vulgaris L.) is a globally relevant food crop. The bean genome was recently sequenced and annotated, allowing for proteomics investigations aimed at characterization of leaf phenotypes important to agriculture. The objective of this study was to utilize a shotgun proteomics approach to characterize the leaf proteome and to identify protein abundance differences between two bean lines with known variation in their physiological resistance to biotic stresses. Overall, 640 proteins were confidently identified. Among these are proteins known to be involved in a variety of molecular functions including oxidoreductase activity, binding peroxidase activity, and hydrolase activity. Twenty-nine proteins were found to significantly vary in abundance (p-value < 0.05) between the two bean lines, including proteins associated with biotic stress. To our knowledge, this work represents the first large scale shotgun proteomic analysis of beans and our results lay the groundwork for future studies designed to investigate the molecular mechanisms involved in pathogen resistance. PMID:28248269
Automation, parallelism, and robotics for proteomics.

PubMed

Alterovitz, Gil; Liu, Jonathan; Chow, Jijun; Ramoni, Marco F

2006-07-01

The speed of the human genome project (Lander, E. S., Linton, L. M., Birren, B., Nusbaum, C. et al., Nature 2001, 409, 860-921) was made possible, in part, by developments in automation of sequencing technologies. Before these technologies, sequencing was a laborious, expensive, and personnel-intensive task. Similarly, automation and robotics are changing the field of proteomics today. Proteomics is defined as the effort to understand and characterize proteins in the categories of structure, function and interaction (Englbrecht, C. C., Facius, A., Comb. Chem. High Throughput Screen. 2005, 8, 705-715). As such, this field nicely lends itself to automation technologies since these methods often require large economies of scale in order to achieve cost and time-saving benefits. This article describes some of the technologies and methods being applied in proteomics in order to facilitate automation within the field as well as in linking proteomics-based information with other related research areas.
Enriching the annotation of Mycobacterium tuberculosis H37Rv proteome using remote homology detection approaches: insights into structure and function.

PubMed

Ramakrishnan, Gayatri; Ochoa-Montaño, Bernardo; Raghavender, Upadhyayula S; Mudgal, Richa; Joshi, Adwait G; Chandra, Nagasuma R; Sowdhamini, Ramanathan; Blundell, Tom L; Srinivasan, Narayanaswamy

2015-01-01

The availability of the genome sequence of Mycobacterium tuberculosis H37Rv has encouraged determination of large numbers of protein structures and detailed definition of the biological information encoded therein; yet, the functions of many proteins in M. tuberculosis remain unknown. The emergence of multidrug resistant strains makes it a priority to exploit recent advances in homology recognition and structure prediction to re-analyse its gene products. Here we report the structural and functional characterization of gene products encoded in the M. tuberculosis genome, with the help of sensitive profile-based remote homology search and fold recognition algorithms resulting in an enhanced annotation of the proteome where 95% of the M. tuberculosis proteins were identified wholly or partly with information on structure or function. New information includes association of 244 proteins with 205 domain families and a separate set of new association of folds to 64 proteins. Extending structural information across uncharacterized protein families represented in the M. tuberculosis proteome, by determining superfamily relationships between families of known and unknown structures, has contributed to an enhancement in the knowledge of structural content. In retrospect, such superfamily relationships have facilitated recognition of probable structure and/or function for several uncharacterized protein families, eventually aiding recognition of probable functions for homologous proteins corresponding to such families. There are 183 gene products unique to mycobacteria for which no functions could be identified; of these, 18 were determined to be M. tuberculosis specific. Such pathogen-specific proteins are speculated to harbour virulence factors required for pathogenesis. A re-annotated proteome of M. tuberculosis, with greater completeness of annotated proteins and domain assigned regions, provides a valuable basis for experimental endeavours designed to obtain a better understanding of pathogenesis and to accelerate the process of drug target discovery. Copyright © 2014 Elsevier Ltd. All rights reserved.
Draft Map of Human Proteome Published | Office of Cancer Clinical Proteomics Research

Cancer.gov

In a recently published article in the journal Nature, researchers have developed a draft map of the human proteome. Striving for the protein equivalent of the Human Genome Project, an international team of researchers has created an initial catalog of the human proteome. In total, using 30 different human tissues, the researchers identified proteins encoded by 17,294 genes, which is approximately 84 percent of all of the genes in the human genome predicted to encode proteins.
Proteomics research in India: an update.

PubMed

Reddy, Panga Jaipal; Atak, Apurva; Ghantasala, Saicharan; Kumar, Saurabh; Gupta, Shabarni; Prasad, T S Keshava; Zingde, Surekha M; Srivastava, Sanjeeva

2015-09-08

After a successful completion of the Human Genome Project, deciphering the mystery surrounding the human proteome posed a major challenge. Despite not being largely involved in the Human Genome Project, the Indian scientific community contributed towards proteomic research along with the global community. Currently, more than 76 research/academic institutes and nearly 145 research labs are involved in core proteomic research across India. The Indian researchers have been major contributors in drafting the "human proteome map" along with international efforts. In addition to this, virtual proteomics labs, proteomics courses and remote triggered proteomics labs have helped to overcome the limitations of proteomics education posed due to expensive lab infrastructure. The establishment of Proteomics Society, India (PSI) has created a platform for the Indian proteomic researchers to share ideas, research collaborations and conduct annual conferences and workshops. Indian proteomic research is really moving forward with the global proteomics community in a quest to solve the mysteries of proteomics. A draft map of the human proteome enhances the enthusiasm among intellectuals to promote proteomic research in India to the world. This article is part of a Special Issue entitled: Proteomics in India. Copyright © 2015 Elsevier B.V. All rights reserved.
Updated biological roles for matrix metalloproteinases and new "intracellular" substrates revealed by degradomics.

PubMed

Butler, Georgina S; Overall, Christopher M

2009-11-24

Shotgun proteomics techniques are conceptually unbiased, but data interpretation and follow-up experiments are often constrained by dogma, established beliefs that are accepted without question, that can dilute the power of proteomics and hinder scientific progress. Proteomics and degradomics, the characterization of all proteases, inhibitors, and protease substrates by genomic and proteomic techniques, have exponentially expanded the known substrate repertoire of the matrix metalloproteinases (MMPs), even to include intracellular proteins with newly recognized extracellular functions. Thus, the dogma that MMPs are dowdy degraders of extracellular matrix has been resolutely overturned, and the metamorphosis of MMPs into modulators of multiple signaling pathways has been facilitated. Here we review progress made in the field of degradomics and present a current view of the MMP degradome.
The Human Proteome Organization Chromosome 6 Consortium: integrating chromosome-centric and biology/disease driven strategies.

PubMed

Borchers, C H; Kast, J; Foster, L J; Siu, K W M; Overall, C M; Binkowski, T A; Hildebrand, W H; Scherer, A; Mansoor, M; Keown, P A

2014-04-04

The Human Proteome Project (HPP) is designed to generate a comprehensive map of the protein-based molecular architecture of the human body, to provide a resource to help elucidate biological and molecular function, and to advance diagnosis and treatment of diseases. Within this framework, the chromosome-based HPP (C-HPP) has allocated responsibility for mapping individual chromosomes by country or region, while the biology/disease HPP (B/D-HPP) coordinates these teams in cross-functional disease-based groups. Chromosome 6 (Ch6) provides an excellent model for integration of these two tasks. This metacentric chromosome has a complement of 1002-1034 genes that code for known, novel or putative proteins. Ch6 is functionally associated with more than 120 major human diseases, many with high population prevalence, devastating clinical impact and profound societal consequences. The unique combination of genomic, proteomic, metabolomic, phenomic and health services data being drawn together within the Ch6 program has enormous potential to advance personalized medicine by promoting robust biomarkers, subunit vaccines and new drug targets. The strong liaison between the clinical and laboratory teams, and the structured framework for technology transfer and health policy decisions within Canada will increase the speed and efficacy of this transition, and the value of this translational research. Canada has been selected to play a leading role in the international Human Proteome Project, the global counterpart of the Human Genome Project designed to understand the structure and function of the human proteome in health and disease. Canada will lead an international team focusing on chromosome 6, which is functionally associated with more than 120 major human diseases, including immune and inflammatory disorders affecting the brain, skeletal system, heart and blood vessels, lungs, kidney, liver, gastrointestinal tract and endocrine system. 
Many of these chronic and persistent diseases have a high population prevalence, devastating clinical impact and profound societal consequences. As a result, they impose a multi-billion dollar economic burden on Canada and on all advanced societies through direct costs of patient care, the loss of health and productivity, and extensive caregiver burden. There is no definitive treatment at the present time for any of these disorders. The manuscript outlines the research which will involve a systematic assessment of all chromosome 6 genes, development of a knowledge base, and development of assays and reagents for all chromosome 6 proteins. We feel that the informatic infrastructure and MRM assays developed will place the chromosome 6 consortium in an excellent position to be a leading player in this major international research initiative. This article is part of a Special Issue: Can Proteomics Fill the Gap Between Genomics and Phenotypes? © 2013.
Comparative Bacterial Proteomics: Analysis of the Core Genome Concept

PubMed Central

Callister, Stephen J.; McCue, Lee Ann; Turse, Joshua E.; Monroe, Matthew E.; Auberry, Kenneth J.; Smith, Richard D.; Adkins, Joshua N.; Lipton, Mary S.

2008-01-01

While comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry, experimental validation of the existence of this core genome requires extensive measurement and is typically not undertaken. Enabled by an extensive proteome database developed over six years, we have experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. Although genomic studies can establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits. PMID:18253490
GENOMIC AND PROTEOMIC TECHNIQUES APPLIED TO REPRODUCTIVE BIOLOGY

EPA Science Inventory

Genomic and proteomic techniques applied to reproductive biology
John C. Rockett
Reproductive Toxicology Division, National Health and Environmental Effects Research Laboratory, Office of Research and Development, United States Environmental Protection Agency, Research Tria...
Wheat proteomics: proteome modulation and abiotic stress acclimation

PubMed Central

Komatsu, Setsuko; Kamal, Abu H. M.; Hossain, Zahed

2014-01-01

Cellular mechanisms of stress sensing and signaling represent the initial plant responses to adverse conditions. The development of high-throughput “Omics” techniques has initiated a new era of the study of plant molecular strategies for adapting to environmental changes. However, the elucidation of stress adaptation mechanisms in plants requires the accurate isolation and characterization of stress-responsive proteins. Because the functional part of the genome, namely the proteins and their post-translational modifications, is critical for plant stress responses, proteomic studies provide comprehensive information about the fine-tuning of cellular pathways that are primarily involved in stress mitigation. This review summarizes the major proteomic findings related to alterations in the wheat proteomic profile in response to abiotic stresses. Moreover, the strengths and weaknesses of different sample preparation techniques, including subcellular protein extraction protocols, are discussed in detail. The continued development of proteomic approaches in combination with rapidly evolving bioinformatics tools and interactive databases will facilitate understanding of the plant mechanisms underlying stress tolerance. PMID:25538718
Directed Shotgun Proteomics Guided by Saturated RNA-seq Identifies a Complete Expressed Prokaryotic Proteome

DOE Office of Scientific and Technical Information (OSTI.GOV)

Omasits, U.; Quebatte, Maxime; Stekhoven, Daniel J.

2013-11-01

Prokaryotes, due to their moderate complexity, are particularly amenable to the comprehensive identification of the protein repertoire expressed under different conditions. We applied a generic strategy to identify a complete expressed prokaryotic proteome, which is based on the analysis of RNA and proteins extracted from matched samples. Saturated transcriptome profiling by RNA-seq provided an endpoint estimate of the protein-coding genes expressed under two conditions which mimic the interaction of Bartonella henselae with its mammalian host. Directed shotgun proteomics experiments were carried out on four subcellular fractions. By specifically targeting proteins which are short, basic, low abundant, and membrane localized, we could eliminate their initial underrepresentation compared to the estimated endpoint. A total of 1250 proteins were identified with an estimated false discovery rate below 1%. This represents 85% of all distinct annotated proteins and ~90% of the expressed protein-coding genes. Genes that were detected at the transcript but not protein level, were found to be highly enriched in several genomic islands. Furthermore, genes that lacked an ortholog and a functional annotation were not detected at the protein level; these may represent examples of overprediction in genome annotations. A dramatic membrane proteome reorganization was observed, including differential regulation of autotransporters, adhesins, and hemin binding proteins. Particularly noteworthy was the complete membrane proteome coverage, which included expression of all members of the VirB/D4 type IV secretion system, a key virulence factor.
Directed shotgun proteomics guided by saturated RNA-seq identifies a complete expressed prokaryotic proteome

PubMed Central

Omasits, Ulrich; Quebatte, Maxime; Stekhoven, Daniel J.; Fortes, Claudia; Roschitzki, Bernd; Robinson, Mark D.; Dehio, Christoph; Ahrens, Christian H.

2013-01-01

Prokaryotes, due to their moderate complexity, are particularly amenable to the comprehensive identification of the protein repertoire expressed under different conditions. We applied a generic strategy to identify a complete expressed prokaryotic proteome, which is based on the analysis of RNA and proteins extracted from matched samples. Saturated transcriptome profiling by RNA-seq provided an endpoint estimate of the protein-coding genes expressed under two conditions which mimic the interaction of Bartonella henselae with its mammalian host. Directed shotgun proteomics experiments were carried out on four subcellular fractions. By specifically targeting proteins which are short, basic, low abundant, and membrane localized, we could eliminate their initial underrepresentation compared to the estimated endpoint. A total of 1250 proteins were identified with an estimated false discovery rate below 1%. This represents 85% of all distinct annotated proteins and ∼90% of the expressed protein-coding genes. Genes that were detected at the transcript but not protein level, were found to be highly enriched in several genomic islands. Furthermore, genes that lacked an ortholog and a functional annotation were not detected at the protein level; these may represent examples of overprediction in genome annotations. A dramatic membrane proteome reorganization was observed, including differential regulation of autotransporters, adhesins, and hemin binding proteins. Particularly noteworthy was the complete membrane proteome coverage, which included expression of all members of the VirB/D4 type IV secretion system, a key virulence factor. PMID:23878158
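The "false discovery rate below 1%" quoted in this record is commonly estimated in shotgun proteomics with a target-decoy database search. The abstract does not say which estimator the authors used, so the following is only a minimal sketch under that assumption; the function name, the toy scores, and the threshold are all hypothetical.

```python
from typing import List, Tuple


def estimate_fdr(psms: List[Tuple[float, bool]], threshold: float) -> float:
    """Estimate FDR as (decoy hits) / (target hits) at or above a score threshold.

    Each peptide-spectrum match (PSM) is a (score, is_decoy) pair.
    This is the simple target-decoy ratio; other estimators exist.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0  # no accepted target hits, so nothing to be falsely discovered
    return decoys / targets


# Hypothetical toy data: (score, is_decoy)
psms = [(0.99, False), (0.97, False), (0.95, False),
        (0.93, True), (0.90, False), (0.50, True)]
fdr = estimate_fdr(psms, threshold=0.9)
print(f"Estimated FDR at threshold 0.9: {fdr:.2f}")
```

In practice the score threshold would be raised until the estimated ratio falls below the chosen cutoff (1% in the record above).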
Caenorhabditis elegans chemical biology: lessons from small molecules

USDA-ARS's Scientific Manuscript database

How can we complement Caenorhabditis elegans genomics and proteomics with a comprehensive structural and functional annotation of its metabolome? Several lines of evidence indicate that small molecules of largely undetermined structure play important roles in C. elegans biology, including key pathw...
Integrated proteomic and genomic analysis of colorectal cancer

Cancer.gov

Investigators who analyzed 95 human colorectal tumor samples have determined how gene alterations identified in previous analyses of the same samples are expressed at the protein level. The integration of proteomic and genomic data, or proteogenomics, pro
Integrated proteogenomic characterization of human high grade serous ovarian cancer

PubMed Central

Zhang, Bai; McDermott, Jason E; Zhou, Jian-Ying; Petyuk, Vladislav A; Chen, Li; Ray, Debjit; Sun, Shisheng; Yang, Feng; Chen, Lijun; Wang, Jing; Shah, Punit; Cha, Seong Won; Aiyetan, Paul; Woo, Sunghee; Tian, Yuan; Gritsenko, Marina A; Clauss, Therese R; Choi, Caitlin; Monroe, Matthew E; Thomas, Stefani; Nie, Song; Wu, Chaochao; Moore, Ronald J; Yu, Kun-Hsing; Tabb, David L; Fenyö, David; Bafna, Vineet; Wang, Yue; Rodriguez, Henry; Boja, Emily S; Hiltke, Tara; Rivers, Robert C; Sokoll, Lori; Zhu, Heng; Shih, Ie-Ming; Cope, Leslie; Pandey, Akhilesh; Zhang, Bing; Snyder, Michael P; Levine, Douglas A; Smith, Richard D

2016-01-01

SUMMARY To provide a detailed analysis of the molecular components and underlying mechanisms associated with ovarian cancer, we performed a comprehensive mass spectrometry-based proteomic characterization of 174 ovarian tumors previously analyzed by The Cancer Genome Atlas (TCGA), of which 169 were high-grade serous carcinomas (HGSC). Integrating our proteomic measurements with the genomic data yielded a number of insights into disease such as how different copy number alterations influence the proteome, the proteins associated with chromosomal instability, the sets of signaling pathways that diverse genome rearrangements converge on, as well as the ones most associated with short overall survival. Specific protein acetylations associated with homologous recombination deficiency suggest a potential means for stratifying patients for therapy. In addition to providing a valuable resource, these findings provide a view of how the somatic genome drives the cancer proteome and associations between protein and post-translational modification levels and clinical outcomes in HGSC. PMID:27372738
Comprehensive genome-wide proteomic analysis of human placental tissue for the Chromosome-Centric Human Proteome Project.

PubMed

Lee, Hyoung-Joo; Jeong, Seul-Ki; Na, Keun; Lee, Min Jung; Lee, Sun Hee; Lim, Jong-Sun; Cha, Hyun-Jeong; Cho, Jin-Young; Kwon, Ja-Young; Kim, Hoguen; Song, Si Young; Yoo, Jong Shin; Park, Young Mok; Kim, Hail; Hancock, William S; Paik, Young-Ki

2013-06-07

As a starting point of the Chromosome-Centric Human Proteome Project (C-HPP), we established strategies of genome-wide proteomic analysis, including protein identification, quantitation of disease-specific proteins, and assessment of post-translational modifications, using paired human placental tissues from healthy and preeclampsia patients. This analysis resulted in identification of 4239 unique proteins with high confidence (two or more unique peptides with a false discovery rate less than 1%), covering 21% of approximately 20,059 (Ensembl v69, Oct 2012) human proteins, among which 28 were differentially expressed, preeclampsia-specific proteins. When these proteins are assigned to all human chromosomes, the pattern of the newly identified placental protein population is proportional to that of the gene count distribution of each chromosome. We also identified 219 unique N-linked glycopeptides, 592 unique phosphopeptides, and 66 chromosome 13-specific proteins. In particular, protein evidence of 14 genes previously known to be specifically up-regulated in human placenta was verified by mass spectrometry. With respect to the functional implication of these proteins, 38 proteins were found to be involved in regulatory factor biosynthesis or the immune system in the placenta, but the molecular mechanism of these proteins during pregnancy warrants further investigation. As far as we know, this work produced the highest number of proteins identified in the placenta and will be useful for annotating and mapping all proteins encoded in the human genome.
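The coverage figure and the "two or more unique peptides" criterion in this record can be illustrated with a short calculation. This is a hedged sketch, not the authors' pipeline: the helper names and the toy peptide table are hypothetical, and the input is assumed to list only peptides that already map uniquely to one protein.

```python
from typing import Dict, List, Set


def proteome_coverage(identified: int, total: int) -> float:
    """Return the percentage of the reference proteome that was identified."""
    return 100.0 * identified / total


def confident_proteins(peptides: Dict[str, Set[str]], min_unique: int = 2) -> List[str]:
    """Keep proteins supported by at least `min_unique` distinct peptides.

    Assumes every peptide listed is unique to its protein (not checked here).
    """
    return sorted(p for p, peps in peptides.items() if len(peps) >= min_unique)


# Numbers taken from the abstract: 4239 proteins out of about 20,059.
coverage = proteome_coverage(4239, 20059)
print(f"Coverage: {coverage:.1f}%")

# Hypothetical toy peptide evidence.
evidence = {"P1": {"AAK", "GGR"}, "P2": {"LLK"}, "P3": {"VVR", "TTK", "SSR"}}
kept = confident_proteins(evidence)
print(kept)
```

The computed coverage rounds to the 21% stated in the abstract.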

Comparative proteomic study on Brassica hexaploid and its parents provides new insights into the effects of polyploidization.

PubMed

Shen, Yanyue; Zhang, Yu; Zou, Jun; Meng, Jinling; Wang, Jianbo

2015-01-01

Polyploidy has played an important role in promoting plant evolution through genomic merging and doubling. Although genomic and transcriptomic changes have been observed in polyploids, the effects of polyploidization on proteomic divergence are poorly understood. In this study, we reported quantitative analysis of proteomic changes in leaves of Brassica hexaploid and its parents using isobaric tags for relative and absolute quantitation (iTRAQ) coupled with mass spectrometry. A total of 2044 reproducible proteins were quantified by at least two unique peptides. We detected 452 proteins differentially expressed between Brassica hexaploid and its parents, and 100 proteins were non-additively expressed in Brassica hexaploid, which suggested a trend of non-additive protein regulation following genomic merger and doubling. Functional categories of cellular component biogenesis, immune system process, and response to stimulus, were significantly enriched in non-additive proteins, probably providing a driving force for variation and adaptation in allopolyploids. In particular, the majority of the total 452 differentially expressed proteins showed expression level dominance of one parental expression, and there was an expression level dominance bias toward the tetraploid progenitor. In addition, the percentage of differentially expressed proteins that matched previously reported differentially expressed genes was relatively low. This study aimed to get new insights into the effects of polyploidization on proteomic divergence. Using iTRAQ LC-MS/MS technology, we identified 452 differentially expressed proteins between allopolyploid and its parents which are involved in response to stimulus, multi-organism process, and immune system process, much more than previous studies using 2-DE coupled with mass spectrometry technology. 
Therefore, our manuscript represents the most comprehensive analysis of protein profiles in allopolyploid and its parents, which will lead to a better understanding of novelty and plasticity of the allopolyploid genomes. Copyright © 2014 Elsevier B.V. All rights reserved.
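"Non-additive" expression in this record refers to a polyploid protein level that departs from what the parental levels would predict. One simple way to picture this is to compare the polyploid value with the mid-parent value (the plain mean of the two parents). The abstract does not describe the authors' statistical test, so the sketch below is purely illustrative: the function name, the unweighted mid-parent value, and the 1.5-fold cutoff are assumptions.

```python
def classify_additivity(parent_a: float, parent_b: float,
                        polyploid: float, fold: float = 1.5) -> str:
    """Label a protein 'additive' or 'non-additive' relative to the mid-parent value.

    The mid-parent value (MPV) is the unweighted mean of the two parental
    abundances; the fold cutoff is a hypothetical choice, not from the paper.
    """
    mpv = (parent_a + parent_b) / 2.0
    if mpv <= 0:
        raise ValueError("parental abundances must give a positive mid-parent value")
    ratio = polyploid / mpv
    if ratio >= fold or ratio <= 1.0 / fold:
        return "non-additive"
    return "additive"


# Hypothetical relative abundances (e.g. normalized reporter-ion ratios).
print(classify_additivity(1.0, 2.0, 1.5))  # polyploid equals the mid-parent value
print(classify_additivity(1.0, 2.0, 3.0))  # polyploid is twice the mid-parent value
```

A real analysis would replace the fixed fold cutoff with a replicate-based statistical test.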
Proteome Exploration to Provide a Resource for the Investigation of Ganoderma lucidum

PubMed Central

Yu, Guo-Jun; Yin, Ya-Lin; Yu, Wen-Hui; Liu, Wei; Jin, Yan-Xia; Shrestha, Alok; Yang, Qing; Ye, Xiang-Dong; Sun, Hui

2015-01-01

Ganoderma lucidum is a basidiomycete white rot fungus that has been used for medicinal purposes worldwide. Although information concerning its genome and transcriptome has recently been reported, relatively little information is available for G. lucidum at the proteomic level. In this study, protein fractions from G. lucidum at three developmental stages (16-day mycelia, and fruiting bodies at 60 and 90 days) were prepared and subjected to LC-MS/MS analysis. A search against the G. lucidum genome database identified 803 proteins. Among these proteins, 61 lignocellulose degrading proteins were detected, most of which (49 proteins) were found in the 90-day fruiting bodies. Fourteen TCA-cycle related proteins, 17 peptidases, two argonaute-like proteins, and two immunomodulatory proteins were also detected. A majority (470) of the 803 proteins had GO annotations and were classified into 36 GO terms, with “binding”, “catalytic activity”, and “hydrolase activity” having high percentages. Additionally, 357 out of the 803 proteins were assigned to at least one COG functional category and grouped into 22 COG classifications. Based on the results from the proteomic and sequence alignment analyses, a potentially new immunomodulatory protein (GL18769) was expressed and shown to have high immunomodulatory activity. In this study, proteomic and biochemical analyses of G. lucidum were performed for the first time, revealing that proteins from this fungus can play significant bioactive roles and providing a new foundation for the further functional investigations that this fungus merits. PMID:25756518
The strategy, organization, and progress of the HUPO Human Proteome Project.

PubMed

Omenn, Gilbert S

2014-04-04

The Human Proteome Project is a major, comprehensive initiative of the Human Proteome Organization. This global collaborative effort aims to identify and characterize at least one protein product and many PTM, SAP, and splice variant isoforms from the 20,300 human protein-coding genes. The deliverables are an extensive parts list and an array of technology platforms, reagents, spectral libraries, and linked knowledge bases that advance the field and facilitate the use of proteomics by a much wider community of life scientists. Such enablement will help address the Grand Challenge of using proteomics to bridge major gaps between evidence of genomic variation and diverse phenotypes. The HUPO Human Proteome Project (HPP) has made an outstanding launch, including a special issue of the Journal of Proteome Research on the Chromosome-centric HPP with a total of 48 articles. This article is part of a Special Issue: Can Proteomics Fill the Gap Between Genomics and Phenotypes? © 2013.
Genomics, transcriptomics and proteomics to elucidate the pathogenesis of rheumatoid arthritis.

PubMed

Song, Xinqiang; Lin, Qingsong

2017-08-01

Rheumatoid arthritis is an autoimmune disease that affects several organs and tissues, predominantly the synovial joints. The pathogenesis of this disease is not completely understood, but it may involve genomic variation, gene expression, protein translation and post-translational modifications. These system variations in genomics, transcriptomics and proteomics are dynamic in nature and their crosstalk is overwhelmingly complex, thus analyzing them separately may not be very informative. However, various '-omics' techniques developed in recent years have opened up new possibilities for clarifying disease pathways and thereby facilitating early diagnosis and specific therapies. This review examines how recent advances in the fields of genomics, transcriptomics and proteomics have contributed to our understanding of rheumatoid arthritis.
A-to-I RNA Editing Contributes to Proteomic Diversity in Cancer. | Office of Cancer Genomics

Cancer.gov

Adenosine (A) to inosine (I) RNA editing introduces many nucleotide changes in cancer transcriptomes. However, due to the complexity of post-transcriptional regulation, the contribution of RNA editing to proteomic diversity in human cancers remains unclear. Here, we performed an integrated analysis of TCGA genomic data and CPTAC proteomic data. Despite limited site diversity, we demonstrate that A-to-I RNA editing contributes to proteomic diversity in breast cancer through changes in amino acid sequences. We validate the presence of editing events at both RNA and protein levels.
Molecular Basis of Essential Thrombocytosis

DTIC Science & Technology

2008-06-01

Membrane proteins,” below), and 64% were present in the cytoskeleton, endoplasmic reticulum, mitochondria, cytosol, or Golgi apparatus ... perspectives Future advances in proteomic technology that incorporate miniaturization,101 coupled with an ability to integrate functional genomics...14. Kralovics R, Passamonti F, Buser AS, et al. A gain-of-function mutation of JAK2 in myeloproliferative disorders. The New England Journal of
Proteomics reveals novel components of the Anopheles gambiae eggshell

PubMed Central

Amenya, Dolphine A.; Chou, Wayne; Li, Jianyong; Yan, Guiyun; Gershon, Paul D.; James, Anthony A.; Marinotti, Osvaldo

2010-01-01

While genome and transcriptome sequencing has revealed a large number and diversity of Anopheles gambiae predicted proteins, identifying their functions and biosynthetic pathways remains challenging. Mass spectrometry-based proteomics, applied in conjunction with mosquito genome and transcriptome databases, was used to identify 44 proteins as putative components of the eggshell. Among the identified molecules are two vitelline membrane proteins and a group of seven putative chorion proteins. Enzymes with peroxidase, laccase and phenoloxidase activities, likely involved in cross-linking reactions that stabilize the eggshell structure, also were identified. Seven odorant binding proteins were found in association with the mosquito eggshell, although their role has yet to be demonstrated. This analysis fills a considerable gap of knowledge about proteins that build the eggshell of anopheline mosquitoes. PMID:20433845
Selection on plant male function genes identifies candidates for reproductive isolation of yellow monkeyflowers.

PubMed

Aagaard, Jan E; George, Renee D; Fishman, Lila; Maccoss, Michael J; Swanson, Willie J

2013-01-01

Understanding the genetic basis of reproductive isolation promises insight into speciation and the origins of biological diversity. While progress has been made in identifying genes underlying barriers to reproduction that function after fertilization (post-zygotic isolation), we know much less about earlier acting pre-zygotic barriers. Of particular interest are barriers involved in mating and fertilization that can evolve extremely rapidly under sexual selection, suggesting they may play a prominent role in the initial stages of reproductive isolation. A significant challenge to the field of speciation genetics is developing new approaches for identification of candidate genes underlying these barriers, particularly among non-traditional model systems. We employ powerful proteomic and genomic strategies to study the genetic basis of conspecific pollen precedence, an important component of pre-zygotic reproductive isolation among yellow monkeyflowers (Mimulus spp.) resulting from male pollen competition. We use isotopic labeling in combination with shotgun proteomics to identify more than 2,000 male function (pollen tube) proteins within maternal reproductive structures (styles) of M. guttatus flowers where pollen competition occurs. We then sequence array-captured pollen tube exomes from a large outcrossing population of M. guttatus, and identify those genes with evidence of selective sweeps or balancing selection consistent with their role in pollen competition. We also test for evidence of positive selection on these genes more broadly across yellow monkeyflowers, because a signal of adaptive divergence is a common feature of genes causing reproductive isolation. Together the molecular evolution studies identify 159 pollen tube proteins that are candidate genes for conspecific pollen precedence. 
Our work demonstrates how powerful proteomic and genomic tools can be readily adapted to non-traditional model systems, allowing for genome-wide screens towards the goal of identifying the molecular basis of genetically complex traits.
Marine proteomics: a critical assessment of an emerging technology.

PubMed

Slattery, Marc; Ankisetty, Sridevi; Corrales, Jone; Marsh-Hunkin, K Erica; Gochfeld, Deborah J; Willett, Kristine L; Rimoldi, John M

2012-10-26

The application of proteomics to marine sciences has increased in recent years because the proteome represents the interface between genotypic and phenotypic variability and, thus, corresponds to the broadest possible biomarker for eco-physiological responses and adaptations. Likewise, proteomics can provide important functional information regarding biosynthetic pathways, as well as insights into mechanism of action, of novel marine natural products. The goal of this review is to (1) explore the application of proteomics methodologies to marine systems, (2) assess the technical approaches that have been used, and (3) evaluate the pros and cons of this proteomic research, with the intent of providing a critical analysis of its future roles in marine sciences. To date, proteomics techniques have been utilized to investigate marine microbe, plant, invertebrate, and vertebrate physiology, developmental biology, seafood safety, susceptibility to disease, and responses to environmental change. However, marine proteomics studies often suffer from poor experimental design, sample processing/optimization difficulties, and data analysis/interpretation issues. Moreover, a major limitation is the lack of available annotated genomes and proteomes for most marine organisms, including several "model species". Even with these challenges in mind, there is no doubt that marine proteomics is a rapidly expanding and powerful integrative molecular research tool from which our knowledge of the marine environment, and the natural products from this resource, will be significantly expanded.
The Multinational Arabidopsis Steering Subcommittee for Proteomics Assembles the Largest Proteome Database Resource for Plant Systems Biology

DOE Office of Scientific and Technical Information (OSTI.GOV)

Weckwerth, Wolfram; Baginsky, Sacha; Van Wijk, Klass

2009-12-01

In the past 10 years, we have witnessed remarkable advances in the field of plant molecular biology. The rapid development of proteomic technologies and the speed with which these techniques have been applied to the field have altered our perception of how we can analyze proteins in complex systems. At nearly the same time, the complete genome of the model plant Arabidopsis thaliana was released; this effort provides an unsurpassed resource for the identification of proteins when researchers use MS to analyze plant samples. Recognizing the growth in this area, the Multinational Arabidopsis Steering Committee (MASC) established a subcommittee for A. thaliana proteomics in 2006 with the objective of consolidating databases, technique standards, and experimentally validated candidate genes and functions. Since the establishment of the Multinational Arabidopsis Steering Subcommittee for Proteomics (MASCP), many new approaches and resources have become available. Recently, the subcommittee established a webpage to consolidate this information (www.masc-proteomics.org). It includes links to plant proteomic databases, general information about proteomic techniques, meeting information, a summary of proteomic standards, and other relevant resources. Altogether, this website provides a useful resource for the Arabidopsis proteomics community. In the future, the website will host discussions and investigate the cross-linking of databases. The subcommittee members have extensive experience in Arabidopsis proteomics and collectively have produced some of the most extensive proteomics data sets for this model plant (Table S1 in the Supporting Information has a list of resources). The largest collection of proteomics data from a single study in A. thaliana was assembled into an accessible database (AtProteome; http://fgcz-atproteome.unizh.ch/index.php) and was recently published by the Baginsky lab. The database provides links to major Arabidopsis online resources, and raw data have been deposited in PRIDE and PRIDE BioMart. Included in this database is an Arabidopsis proteome map that provides evidence for the expression of approximately 50% of all predicted gene models, including several alternative gene models that are not represented in The Arabidopsis Information Resource (TAIR) protein database. A set of organ-specific biomarkers is provided, as well as organ-specific proteotypic peptides for 4105 proteins that can be used to facilitate targeted quantitative proteomic surveys. In the future, the AtProteome database will be linked to additional existing resources developed by MASCP members, such as PPDB, ProMEX, and SUBA. The most comprehensive study on the Arabidopsis chloroplast proteome, which includes information on chloroplast sorting signals, posttranslational modifications (PTMs), and protein abundances (analyzed by high-accuracy MS [Orbitrap]), was recently published by the van Wijk lab. These and previous data are available via the plant proteome database (PPDB; http://ppdb.tc.cornell.edu) for A. thaliana and maize. PPDB provides genome-wide experimental and functional characterization of the A. thaliana and maize proteomes, including PTMs and subcellular localization information, with an emphasis on leaf and plastid proteins. Maize and Arabidopsis proteome entries are directly linked via internal BLAST alignments within PPDB. Direct links for each protein to TAIR, SUBA, ProMEX, and other resources are also provided.
Identification of Maturation-Specific Proteins by Single-Cell Proteomics of Human Oocytes

PubMed Central

Virant-Klun, Irma; Leicht, Stefan; Hughes, Christopher; Krijgsveld, Jeroen

2016-01-01

Oocytes undergo a range of complex processes via oogenesis, maturation, fertilization, and early embryonic development, eventually giving rise to a fully functioning organism. To understand proteome composition and diversity during maturation of human oocytes, here we have addressed crucial aspects of oocyte collection and proteome analysis, resulting in the first proteome and secretome maps of human oocytes. Starting from 100 oocytes collected via a novel serum-free hanging drop culture system, we identified 2,154 proteins, whose functions indicate that oocytes are largely resting cells with a proteome that is tailored for homeostasis, cellular attachment, and interaction with its environment via secretory factors. In addition, we have identified 158 oocyte-enriched proteins (such as ECAT1, PIWIL3, NLRP7)(1) not observed in high-coverage proteomics studies of other human cell lines or tissues. Exploiting SP3, a novel technology for proteomic sample preparation using magnetic beads, we scaled down proteome analysis to single cells. Despite the low protein content of only ∼100 ng per cell, we consistently identified ∼450 proteins from individual oocytes. When comparing individual oocytes at the germinal vesicle (GV) and metaphase II (MII) stage, we found that the Tudor and KH domain-containing protein (TDRKH) is preferentially expressed in immature oocytes, while Wee2, PCNA, and DNMT1 were enriched in mature cells, collectively indicating that maintenance of genome integrity is crucial during oocyte maturation. This study demonstrates that an innovative proteomics workflow facilitates analysis of single human oocytes to investigate human oocyte biology and preimplantation development. The approach presented here paves the way for quantitative proteomics in other quantity-limited tissues and cell types. Data associated with this study are available via ProteomeXchange with identifier PXD004142. PMID:27215607
Identification of Maturation-Specific Proteins by Single-Cell Proteomics of Human Oocytes.

PubMed

Virant-Klun, Irma; Leicht, Stefan; Hughes, Christopher; Krijgsveld, Jeroen

2016-08-01

Oocytes undergo a range of complex processes via oogenesis, maturation, fertilization, and early embryonic development, eventually giving rise to a fully functioning organism. To understand proteome composition and diversity during maturation of human oocytes, here we have addressed crucial aspects of oocyte collection and proteome analysis, resulting in the first proteome and secretome maps of human oocytes. Starting from 100 oocytes collected via a novel serum-free hanging drop culture system, we identified 2,154 proteins, whose functions indicate that oocytes are largely resting cells with a proteome that is tailored for homeostasis, cellular attachment, and interaction with its environment via secretory factors. In addition, we have identified 158 oocyte-enriched proteins (such as ECAT1, PIWIL3, NLRP7)(1) not observed in high-coverage proteomics studies of other human cell lines or tissues. Exploiting SP3, a novel technology for proteomic sample preparation using magnetic beads, we scaled down proteome analysis to single cells. Despite the low protein content of only ∼100 ng per cell, we consistently identified ∼450 proteins from individual oocytes. When comparing individual oocytes at the germinal vesicle (GV) and metaphase II (MII) stage, we found that the Tudor and KH domain-containing protein (TDRKH) is preferentially expressed in immature oocytes, while Wee2, PCNA, and DNMT1 were enriched in mature cells, collectively indicating that maintenance of genome integrity is crucial during oocyte maturation. This study demonstrates that an innovative proteomics workflow facilitates analysis of single human oocytes to investigate human oocyte biology and preimplantation development. The approach presented here paves the way for quantitative proteomics in other quantity-limited tissues and cell types. Data associated with this study are available via ProteomeXchange with identifier PXD004142. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
Significant expansion of exon-bordering protein domains during animal proteome evolution

PubMed Central

Liu, Mingyi; Walch, Heiko; Wu, Shaoping; Grigoriev, Andrei

2005-01-01

We present evidence of remarkable genome-wide mobility and evolutionary expansion for a class of protein domains whose borders locate close to the borders of their encoding exons. These exon-bordering domains are more numerous and widely distributed in the human genome than other domains. They also co-occur with more diverse domains to form a larger variety of domain architectures in human proteins. A systematic comparison of nine animal genomes from nematodes to mammals revealed that exon-bordering domains expanded faster than other protein domains in both abundance and distribution, as well as the diversity of co-occurring domains and the domain architectures of harboring proteins. Furthermore, exon-bordering domains exhibited a particularly strong preference for class 1-1 intron phase. Our findings suggest that exon-bordering domains were amplified and interchanged within a genome more often and/or more successfully than other domains during evolution, probably the result of extensive exon shuffling and gene duplication events. The diverse biological functions of these domains underscore the important role they play in the expansion and diversification of animal proteomes. PMID:15640447
The Role of Clinical Proteomics, Lipidomics, and Genomics in the Diagnosis of Alzheimer's Disease.

PubMed

Martins, Ian James

2016-03-31

The early diagnosis of Alzheimer's disease (AD) has become important to the reversal and treatment of neurodegeneration, which may be relevant to premature brain aging that is associated with chronic disease progression. Clinical proteomics allows the detection of various proteins in fluids such as the urine, plasma, and cerebrospinal fluid for the diagnosis of AD. Interest in lipidomics has accelerated with plasma testing for various lipid biomarkers that may with clinical proteomics provide a more reproducible diagnosis for early brain aging that is connected to other chronic diseases. The combination of proteomics with lipidomics may decrease the biological variability between studies and provide reproducible results that detect a community's susceptibility to AD. The diagnosis of chronic disease associated with AD that now involves genomics may provide increased sensitivity to avoid inadvertent errors related to plasma versus cerebrospinal fluid testing by proteomics and lipidomics that identify new disease biomarkers in body fluids, cells, and tissues. The diagnosis of AD by various plasma biomarkers with clinical proteomics may now require the involvement of lipidomics and genomics to provide interpretation of proteomic results from various laboratories around the world.
Functional Genomics of Lignocellulose Degradation in the Basidiomycete White Rot Schizophyllum commune

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ohm, Robin A.; Tegelaar, Martin; Henrissat, Bernard

2013-03-01

White and brown rot fungi are among the most important wood decayers in nature. Although more than 50 genomes of Basidiomycete white and brown rots have been sequenced by the Joint Genome Institute, there is still a lot to learn about how these fungi degrade the tough polymers present in wood. In particular, very little is known about how these fungi regulate the expression of genes involved in lignocellulose degradation. Here, we used transcriptomics, proteomics, and promoter analysis in an effort to gain insight into the process of lignocellulose degradation.
CRISPR/Cas9: From Genome Engineering to Cancer Drug Discovery

PubMed Central

Luo, Ji

2016-01-01

Advances in translational research are often driven by new technologies. The advent of microarrays, next-generation sequencing, proteomics and RNA interference (RNAi) have led to breakthroughs in our understanding of the mechanisms of cancer and the discovery of new cancer drug targets. The discovery of the bacterial clustered regularly interspaced palindromic repeat (CRISPR) system and its subsequent adaptation as a tool for mammalian genome engineering has opened up new avenues for functional genomics studies. This review will focus on the utility of CRISPR in the context of cancer drug target discovery. PMID:28603775
ROCK1 is a potential combinatorial drug target for BRAF mutant melanoma

PubMed Central

Smit, Marjon A; Maddalo, Gianluca; Greig, Kylie; Raaijmakers, Linsey M; Possik, Patricia A; van Breukelen, Bas; Cappadona, Salvatore; Heck, Albert JR; Altelaar, AF Maarten; Peeper, Daniel S

2014-01-01

Treatment of BRAF mutant melanomas with specific BRAF inhibitors leads to tumor remission. However, most patients eventually relapse due to drug resistance. Therefore, we designed an integrated strategy using (phospho)proteomic and functional genomic platforms to identify drug targets whose inhibition sensitizes melanoma cells to BRAF inhibition. We found many proteins to be induced upon PLX4720 (BRAF inhibitor) treatment that are known to be involved in BRAF inhibitor resistance, including FOXD3 and ErbB3. Several proteins were down-regulated, including Rnd3, a negative regulator of ROCK1 kinase. For our genomic approach, we performed two parallel shRNA screens using a kinome library to identify genes whose inhibition sensitizes to BRAF or ERK inhibitor treatment. By integrating our functional genomic and (phospho)proteomic data, we identified ROCK1 as a potential drug target for BRAF mutant melanoma. ROCK1 silencing increased melanoma cell elimination when combined with BRAF or ERK inhibitor treatment. Translating this to a preclinical setting, a ROCK inhibitor showed augmented melanoma cell death upon BRAF or ERK inhibition in vitro. These data merit exploration of ROCK1 as a target in combination with current BRAF mutant melanoma therapies. PMID:25538140
Global response of Acidithiobacillus ferrooxidans ATCC 53993 to high concentrations of copper: A quantitative proteomics approach.

PubMed

Martínez-Bussenius, Cristóbal; Navarro, Claudio A; Orellana, Luis; Paradela, Alberto; Jerez, Carlos A

2016-08-11

Acidithiobacillus ferrooxidans is used in industrial bioleaching of minerals to extract valuable metals. A. ferrooxidans strain ATCC 53993 is much more resistant to copper than other strains of this microorganism, and it has been proposed that genes present in an exclusive genomic island (GI) of this strain would contribute to its extreme copper tolerance. ICPL (isotope-coded protein labeling) quantitative proteomics was used to study in detail the response of this bacterium to copper. A high overexpression of RND efflux systems and CusF copper chaperones, both present in the genome and the GI of strain ATCC 53993, was found. Also, changes in the levels of respiratory system proteins, such as the AcoP and Rus copper-binding proteins, and of several proteins with other predicted functions suggest that numerous metabolic changes are apparently involved in controlling the effects of the toxic metal on this acidophile. Using quantitative proteomics, we overview the adaptation mechanisms that biomining acidophiles use to withstand their harsh environment. The overexpression of several genes present in an exclusive genomic island strongly suggests the importance of the proteins coded in this DNA region in the high tolerance of A. ferrooxidans ATCC 53993 to metals. Copyright © 2016 Elsevier B.V. All rights reserved.
The role of internal duplication in the evolution of multi-domain proteins.

PubMed

Nacher, J C; Hayashida, M; Akutsu, T

2010-08-01

Many proteins consist of several structural domains. These multi-domain proteins have likely been generated by selective genome growth dynamics during evolution to perform new functions as well as to create structures that fold on a biologically feasible time scale. Domain units frequently evolved through a variety of genetic shuffling mechanisms. Here we examine the protein domain statistics of more than 1000 organisms including eukaryotic, archaeal and bacterial species. The analysis extends earlier findings on asymmetric statistical laws for proteome to a wider variety of species. While proteins are composed of a wide range of domains, displaying a power-law decay, the computation of domain families for each protein reveals an exponential distribution, characterizing a protein universe composed of a thin number of unique families. Structural studies in proteomics have shown that domain repeats, or internal duplicated domains, represent a small but significant fraction of genome. In spite of its importance, this observation has been largely overlooked until recently. We model the evolutionary dynamics of proteome and demonstrate that these distinct distributions are in fact rooted in an internal duplication mechanism. This process generates the contemporary protein structural domain universe, determines its reduced thickness, and tames its growth. These findings have important implications, ranging from protein interaction network modeling to evolutionary studies based on fundamental mechanisms governing genome expansion.
Predicting protein-protein interactions on a proteome scale by matching evolutionary and structural similarities at interfaces using PRISM.

PubMed

Tuncbag, Nurcan; Gursoy, Attila; Nussinov, Ruth; Keskin, Ozlem

2011-08-11

Prediction of protein-protein interactions at the structural level on the proteome scale is important because it allows prediction of protein function, helps drug discovery and takes steps toward genome-wide structural systems biology. We provide a protocol (termed PRISM, protein interactions by structural matching) for large-scale prediction of protein-protein interactions and assembly of protein complex structures. The method consists of two components: rigid-body structural comparisons of target proteins to known template protein-protein interfaces and flexible refinement using a docking energy function. The PRISM rationale follows our observation that globally different protein structures can interact via similar architectural motifs. PRISM predicts binding residues by using structural similarity and evolutionary conservation of putative binding residue 'hot spots'. Ultimately, PRISM could help to construct cellular pathways and functional, proteome-scale annotation. PRISM is implemented in Python and runs in a UNIX environment. The program accepts Protein Data Bank-formatted protein structures and is available at http://prism.ccbb.ku.edu.tr/prism_protocol/.

Encapsulated in silica: genome, proteome and physiology of the thermophilic bacterium Anoxybacillus flavithermus WK1

PubMed Central

Saw, Jimmy H; Mountain, Bruce W; Feng, Lu; Omelchenko, Marina V; Hou, Shaobin; Saito, Jennifer A; Stott, Matthew B; Li, Dan; Zhao, Guang; Wu, Junli; Galperin, Michael Y; Koonin, Eugene V; Makarova, Kira S; Wolf, Yuri I; Rigden, Daniel J; Dunfield, Peter F; Wang, Lei; Alam, Maqsudul

2008-01-01

Background Gram-positive bacteria of the genus Anoxybacillus have been found in diverse thermophilic habitats, such as geothermal hot springs and manure, and in processed foods such as gelatin and milk powder. Anoxybacillus flavithermus is a facultatively anaerobic bacterium found in super-saturated silica solutions and in opaline silica sinter. The ability of A. flavithermus to grow in super-saturated silica solutions makes it an ideal subject to study the processes of sinter formation, which might be similar to the biomineralization processes that occurred at the dawn of life. Results We report here the complete genome sequence of A. flavithermus strain WK1, isolated from the waste water drain at the Wairakei geothermal power station in New Zealand. It consists of a single chromosome of 2,846,746 base pairs and is predicted to encode 2,863 proteins. In silico genome analysis identified several enzymes that could be involved in silica adaptation and biofilm formation, and their predicted functions were experimentally validated in vitro. Proteomic analysis confirmed the regulation of biofilm-related proteins and crucial enzymes for the synthesis of long-chain polyamines as constituents of silica nanospheres. Conclusions Microbial fossils preserved in silica and silica sinters are excellent objects for studying ancient life, a new paleobiological frontier. An integrated analysis of the A. flavithermus genome and proteome provides the first glimpse of metabolic adaptation during silicification and sinter formation. Comparative genome analysis suggests an extensive gene loss in the Anoxybacillus/Geobacillus branch after its divergence from other bacilli. PMID:19014707
Encapsulated in silica: genome, proteome and physiology of the thermophilic bacterium Anoxybacillus flavithermus

DOE Office of Scientific and Technical Information (OSTI.GOV)

Saw, Jimmy H; Mountain, Bruce W; Feng, Lu

Gram-positive bacteria of the genus Anoxybacillus have been found in diverse thermophilic habitats, such as geothermal hot springs and manure, and in processed foods such as gelatin and milk powder. Anoxybacillus flavithermus is a facultatively anaerobic bacterium found in super-saturated silica solutions and in opaline silica sinter. The ability of A. flavithermus to grow in super-saturated silica solutions makes it an ideal subject to study the processes of sinter formation, which might be similar to the biomineralization processes that occurred at the dawn of life. We report here the complete genome sequence of A. flavithermus strain WK1, isolated from the waste water drain at the Wairakei geothermal power station in New Zealand. It consists of a single chromosome of 2,846,746 base pairs and is predicted to encode 2,863 proteins. In silico genome analysis identified several enzymes that could be involved in silica adaptation and biofilm formation, and their predicted functions were experimentally validated in vitro. Proteomic analysis confirmed the regulation of biofilm-related proteins and crucial enzymes for the synthesis of long-chain polyamines as constituents of silica nanospheres. Microbial fossils preserved in silica and silica sinters are excellent objects for studying ancient life, a new paleobiological frontier. An integrated analysis of the A. flavithermus genome and proteome provides the first glimpse of metabolic adaptation during silicification and sinter formation. Comparative genome analysis suggests an extensive gene loss in the Anoxybacillus/Geobacillus branch after its divergence from other bacilli.
Transcriptome and proteomic analysis of mango (Mangifera indica Linn) fruits.

PubMed

Wu, Hong-xia; Jia, Hui-min; Ma, Xiao-wei; Wang, Song-biao; Yao, Quan-sheng; Xu, Wen-tian; Zhou, Yi-gang; Gao, Zhong-shan; Zhan, Ru-lin

2014-06-13

Here we used Illumina RNA-seq technology for transcriptome sequencing of a mixed fruit sample from 'Zill' mango (Mangifera indica Linn) fruit pericarp and pulp during the development and ripening stages. RNA-seq generated 68,419,722 sequence reads that were assembled into 54,207 transcripts with a mean length of 858 bp, including 26,413 clusters and 27,794 singletons. A total of 42,515 (78.43%) transcripts were annotated using public protein databases, with a cut-off E-value above 10^-5, of which 35,198 and 14,619 transcripts were assigned to gene ontology terms and clusters of orthologous groups, respectively. Functional annotation against the Kyoto Encyclopedia of Genes and Genomes database identified 23,741 (43.79%) transcripts, which were mapped to 128 pathways. These pathways revealed many previously unknown transcripts. We also applied mass spectrometry-based transcriptome data to characterize the proteome of ripe fruit. LC-MS/MS analysis of the mango fruit proteome was performed using tandem mass spectrometry (MS/MS) in an LTQ Orbitrap Velos (Thermo) coupled online to the HPLC. This approach enabled the identification of 7536 peptides that matched 2754 proteins. Our study provides a comprehensive sequence for a systemic view of the transcriptome during mango fruit development and the most comprehensive fruit proteome to date, which are useful for further genomics research and proteomic studies. Our study provides a comprehensive sequence for a systemic view of both the transcriptome and proteome of mango fruit, and a valuable reference for further research on gene expression and protein identification. This article is part of a Special Issue entitled: Proteomics of non-model organisms. Copyright © 2014 Elsevier B.V. All rights reserved.
Prioritization of potential drug targets against P. aeruginosa by core proteomic analysis using computational subtractive genomics and Protein-Protein interaction network.

PubMed

Uddin, Reaz; Jamil, Faiza

2018-06-01

Pseudomonas aeruginosa is an opportunistic gram-negative bacterium that has the capability to acquire resistance under hostile conditions and has become a threat worldwide. It is involved in nosocomial infections. In the current study, potential novel drug targets against P. aeruginosa have been identified using core proteomic analysis and Protein-Protein Interaction (PPI) studies. The non-redundant reference proteomes of 68 P. aeruginosa strains with complete genomes and the latest assembly versions were downloaded from the NCBI RefSeq FTP server in October 2016. The standalone CD-HIT tool was used to cluster orthologous proteins (having ≥80% amino acid identity) present in all strains. The pan-proteome was clustered into 12,380 Clusters of Orthologous Proteins (COPs). By using in-house shell scripts, 3252 common COPs were extracted and designated as clusters of the core proteome. The core proteome of the PAO1 strain was selected by fetching PAO1's proteins from the common COPs. As a result, 1212 proteins were shortlisted that are non-homologous to human proteins but essential for the survival of the pathogen. Among these 1212 proteins, 321 are conserved hypothetical proteins. Considering their potential as drug targets, those 321 hypothetical proteins were selected and their probable functions were characterized. Based on the druggability criteria, 18 proteins were shortlisted. Their interacting partners were identified by investigating the PPI network using the STRING v10 database. Subsequently, 8 proteins were shortlisted as 'hub proteins' and proposed as potential novel drug targets against P. aeruginosa. The study should interest the scientific community working to identify novel drug targets against MDR pathogens, particularly P. aeruginosa. Copyright © 2018 Elsevier Ltd. All rights reserved.
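The core-proteome step described in the record above (keep only the clusters that contain proteins from every strain, then pull out the reference strain's members) can be illustrated with a minimal Python sketch. This is not the authors' in-house shell scripts: the function name, the toy cluster table, and the protein identifiers are hypothetical, and real input would come from parsing CD-HIT cluster output.

```python
def core_clusters(cluster_members, all_strains):
    """Return IDs of clusters containing at least one protein from every strain.

    cluster_members maps a cluster ID to a list of (strain, protein_id) pairs.
    """
    required = set(all_strains)
    core = []
    for cluster_id, members in cluster_members.items():
        strains_present = {strain for strain, _protein in members}
        if required <= strains_present:  # every strain is represented
            core.append(cluster_id)
    return sorted(core)


# Hypothetical toy data: three strains, three clusters.
clusters = {
    "COP1": [("PAO1", "p1"), ("PA14", "q1"), ("LESB58", "r1")],
    "COP2": [("PAO1", "p2"), ("PA14", "q2")],  # missing LESB58, so not core
    "COP3": [("PAO1", "p3"), ("PAO1", "p4"), ("PA14", "q3"), ("LESB58", "r3")],
}
strains = ["PAO1", "PA14", "LESB58"]

core = core_clusters(clusters, strains)
# Core proteome of the reference strain: its proteins inside the core clusters.
pao1_core = [p for c in core for s, p in clusters[c] if s == "PAO1"]
print(core, pao1_core)
```

With 68 strains the same subset test applies unchanged; only the size of the required strain set grows.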
Proteogenomics produces comprehensive and highly accurate protein-coding gene annotation in a complete genome assembly of Malassezia sympodialis.

PubMed

Zhu, Yafeng; Engström, Pär G; Tellgren-Roth, Christian; Baudo, Charles D; Kennell, John C; Sun, Sheng; Billmyre, R Blake; Schröder, Markus S; Andersson, Anna; Holm, Tina; Sigurgeirsson, Benjamin; Wu, Guangxi; Sankaranarayanan, Sundar Ram; Siddharthan, Rahul; Sanyal, Kaustuv; Lundeberg, Joakim; Nystedt, Björn; Boekhout, Teun; Dawson, Thomas L; Heitman, Joseph; Scheynius, Annika; Lehtiö, Janne

2017-03-17

Complete and accurate genome assembly and annotation is a crucial foundation for comparative and functional genomics. Despite this, few complete eukaryotic genomes are available, and genome annotation remains a major challenge. Here, we present a complete genome assembly of the skin commensal yeast Malassezia sympodialis and demonstrate how proteogenomics can substantially improve gene annotation. Through long-read DNA sequencing, we obtained a gap-free genome assembly for M. sympodialis (ATCC 42132), comprising eight nuclear and one mitochondrial chromosome. We also sequenced and assembled four M. sympodialis clinical isolates, and showed their value for understanding Malassezia reproduction by confirming four alternative allele combinations at the two mating-type loci. Importantly, we demonstrated how proteomics data could be readily integrated with transcriptomics data in standard annotation tools. This increased the number of annotated protein-coding genes by 14% (from 3612 to 4113), compared to using transcriptomics evidence alone. Manual curation further increased the number of protein-coding genes by 9% (to 4493). All of these genes have RNA-seq evidence and 87% were confirmed by proteomics. The M. sympodialis genome assembly and annotation presented here is at a quality yet achieved only for a few eukaryotic organisms, and constitutes an important reference for future host-microbe interaction studies. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Farm animal genomics and informatics: an update

PubMed Central

Fadiel, Ahmed; Anidi, Ifeanyi; Eichenbaum, Kenneth D.

2005-01-01

Farm animal genomics is of interest to a wide audience of researchers because of the utility derived from understanding how genomics and proteomics function in various organisms. Applications such as xenotransplantation, increased livestock productivity, bioengineering new materials, products and even fabrics are several reasons for thriving farm animal genome activity. Currently mined in rapidly growing data warehouses, completed genomes of chicken, fish and cows are available but are largely stored in decentralized data repositories. In this paper, we provide an informatics primer on farm animal bioinformatics and genome project resources which drive attention to the most recent advances in the field. We hope to provide individuals in biotechnology and in the farming industry with information on resources and updates concerning farm animal genome projects. PMID:16275782
Genome-scale prediction of proteins with long intrinsically disordered regions.

PubMed

Peng, Zhenling; Mizianty, Marcin J; Kurgan, Lukasz

2014-01-01

Proteins with long disordered regions (LDRs), defined as having 30 or more consecutive disordered residues, are abundant in eukaryotes, and these regions are recognized as a distinct class of biologically functional domains. LDRs facilitate various cellular functions and are important for target selection in structural genomics. Motivated by the lack of methods that directly predict proteins with LDRs, we designed Super-fast predictor of proteins with Long Intrinsically DisordERed regions (SLIDER). SLIDER utilizes logistic regression that takes an empirically chosen set of numerical features, which consider selected physicochemical properties of amino acids, sequence complexity, and amino acid composition, as its inputs. Empirical tests show that SLIDER offers competitive predictive performance combined with low computational cost. It outperforms, by at least a modest margin, a comprehensive set of modern disorder predictors (that can indirectly predict LDRs) and is 16 times faster compared to the best currently available disorder predictor. Utilizing our time-efficient predictor, we characterized abundance and functional roles of proteins with LDRs over 110 eukaryotic proteomes. Similar to related studies, we found that eukaryotes have many (on average 30.3%) proteins with LDRs with majority of proteomes having between 25 and 40%, where higher abundance is characteristic to proteomes that have larger proteins. Our first-of-its-kind large-scale functional analysis shows that these proteins are enriched in a number of cellular functions and processes including certain binding events, regulation of catalytic activities, cellular component organization, biogenesis, biological regulation, and some metabolic and developmental processes. A webserver that implements SLIDER is available at http://biomine.ece.ualberta.ca/SLIDER/. Copyright © 2013 Wiley Periodicals, Inc.
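The LDR definition used in the record above (30 or more consecutive disordered residues) is easy to state in code. The Python sketch below is only an illustration of that definition applied to per-residue disorder annotations; it is not SLIDER, which predicts LDR-containing proteins directly from sequence-derived features with logistic regression. The function names and the toy masks are hypothetical.

```python
def has_long_disordered_region(disorder_mask, min_length=30):
    """True if the mask contains a run of at least min_length disordered residues.

    disorder_mask is a sequence of booleans, one per residue (True = disordered).
    """
    run = 0
    for is_disordered in disorder_mask:
        run = run + 1 if is_disordered else 0  # extend or reset the current run
        if run >= min_length:
            return True
    return False


def fraction_with_ldr(proteome_masks, min_length=30):
    """Fraction of proteins in a proteome that carry at least one LDR."""
    if not proteome_masks:
        return 0.0
    hits = sum(has_long_disordered_region(m, min_length) for m in proteome_masks)
    return hits / len(proteome_masks)


# Hypothetical toy proteome of four proteins.
masks = [
    [True] * 30,                          # exactly 30 in a row: LDR
    [True] * 29 + [False] + [True] * 10,  # longest run is 29: no LDR
    [False] * 50,                         # fully ordered
    [False] * 5 + [True] * 40,            # run of 40: LDR
]
print(fraction_with_ldr(masks))
```

Per-proteome fractions computed this way correspond to the kind of abundance figure the abstract reports (on average 30.3% of proteins), although the paper's values come from SLIDER predictions rather than from annotated masks.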
Exploiting proteomic data for genome annotation and gene model validation in Aspergillus niger.

PubMed

Wright, James C; Sugden, Deana; Francis-McIntyre, Sue; Riba-Garcia, Isabel; Gaskell, Simon J; Grigoriev, Igor V; Baker, Scott E; Beynon, Robert J; Hubbard, Simon J

2009-02-04

Proteomic data is a potentially rich, but arguably unexploited, data source for genome annotation. Peptide identifications from tandem mass spectrometry provide prima facie evidence for gene predictions and can discriminate over a set of candidate gene models. Here we apply this to the recently sequenced Aspergillus niger fungal genome from the Joint Genome Institute (JGI) and another predicted protein set from another A. niger sequence. Tandem mass spectra (MS/MS) were acquired from 1D gel electrophoresis bands and searched against all available gene models using Average Peptide Scoring (APS) and reverse database searching to produce confident identifications at an acceptable false discovery rate (FDR). 405 identified peptide sequences were mapped to 214 different A. niger genomic loci to which 4093 predicted gene models clustered, 2872 of which contained the mapped peptides. Interestingly, 13 (6%) of these loci either had no preferred predicted gene model or the genome annotators' chosen "best" model for that genomic locus was not found to be the most parsimonious match to the identified peptides. The peptides identified also boosted confidence in predicted gene structures spanning 54 introns from different gene models. This work highlights the potential of integrating experimental proteomics data into genomic annotation pipelines much as expressed sequence tag (EST) data has been. A comparison with the published genome from another strain of A. niger sequenced by DSM showed that a number of the gene models or proteins with proteomics evidence did not occur in both genomes, further highlighting the utility of the method.
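The record above mentions reverse database searching to control the false discovery rate but does not give a formula. As a hedged illustration, the Python sketch below uses one common target-decoy convention, FDR ≈ decoy hits / target hits above a score threshold; this convention, the function name, and the toy scores are assumptions, and the Average Peptide Scoring step is not reproduced.

```python
def estimate_fdr(psms, threshold):
    """Estimate FDR among matches scoring at or above threshold.

    psms is a list of (score, is_decoy) pairs, where is_decoy marks a match
    to a reversed (decoy) sequence. Uses the decoys/targets convention.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0  # nothing accepted, so no false discoveries to estimate
    return decoys / targets


# Hypothetical peptide-spectrum matches: (score, matched a decoy sequence?)
psms = [
    (9.1, False), (8.7, False), (8.2, True), (7.9, False),
    (7.5, False), (6.0, True), (5.5, False),
]
print(estimate_fdr(psms, 7.0))  # 1 decoy / 4 targets
print(estimate_fdr(psms, 8.5))  # 0 decoys / 2 targets
```

Raising the threshold trades fewer accepted identifications for a lower estimated error rate, which is the balance implied by "an acceptable false discovery rate".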
Multi-Omics Driven Assembly and Annotation of the Sandalwood (Santalum album) Genome.

PubMed

Mahesh, Hirehally Basavarajegowda; Subba, Pratigya; Advani, Jayshree; Shirke, Meghana Deepak; Loganathan, Ramya Malarini; Chandana, Shankara Lingu; Shilpa, Siddappa; Chatterjee, Oishi; Pinto, Sneha Maria; Prasad, Thottethodi Subrahmanya Keshava; Gowda, Malali

2018-04-01

Indian sandalwood (Santalum album) is an important tropical evergreen tree known for its fragrant heartwood-derived essential oil and its valuable carving wood. Here, we applied an integrated genomic, transcriptomic, and proteomic approach to assemble and annotate the Indian sandalwood genome. Our genome sequencing resulted in the establishment of a draft map of the smallest genome for any woody tree species to date (221 Mb). The genome annotation predicted 38,119 protein-coding genes and 27.42% repetitive DNA elements. In-depth proteome analysis revealed the identities of 72,325 unique peptides, which confirmed 10,076 of the predicted genes. The addition of transcriptomic and proteogenomic approaches resulted in the identification of 53 novel proteins and 34 gene-correction events that were missed by genomic approaches. Proteogenomic analysis also helped in reassigning 1,348 potential noncoding RNAs as bona fide protein-coding messenger RNAs. Gene expression patterns at the RNA and protein levels indicated that peptide sequencing was useful in capturing proteins encoded by nuclear and organellar genomes alike. Mass spectrometry-based proteomic evidence provided an unbiased approach toward the identification of proteins encoded by organellar genomes. Such proteins are often missed in transcriptome data sets due to the enrichment of only messenger RNAs that contain poly(A) tails. Overall, the use of integrated omic approaches enhanced the quality of the assembly and annotation of this nonmodel plant genome. The availability of genomic, transcriptomic, and proteomic data will enhance genomics-assisted breeding, germplasm characterization, and conservation of sandalwood trees. © 2018 American Society of Plant Biologists. All Rights Reserved.
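The idea that identified peptides "confirm" predicted genes can be pictured as a substring search of peptides against predicted protein sequences, with peptides that match nothing becoming candidates for novel or corrected gene models. The following is a minimal sketch under that assumption. The gene identifiers, sequences, and function names are invented for illustration, and real proteogenomic pipelines add steps (isoleucine/leucine ambiguity, six-frame translation, FDR control) that are omitted here.

```python
from typing import Dict, Iterable, Set


def confirmed_genes(peptides: Iterable[str], predicted_proteins: Dict[str, str]) -> Set[str]:
    """Return IDs of predicted proteins that contain at least one identified peptide."""
    peptide_list = list(peptides)
    return {
        gene_id
        for gene_id, sequence in predicted_proteins.items()
        if any(peptide in sequence for peptide in peptide_list)
    }


def unmatched_peptides(peptides: Iterable[str], predicted_proteins: Dict[str, str]) -> Set[str]:
    """Return peptides found in no predicted protein (candidates for novel gene models)."""
    sequences = list(predicted_proteins.values())
    return {p for p in peptides if not any(p in seq for seq in sequences)}


# Hypothetical toy data, not from the sandalwood study.
toy_proteins = {
    "gene_A": "MKTAYIAKQRQISFVK",
    "gene_B": "MSDNELLKRGGPW",
    "gene_C": "MAAPLVTR",
}
toy_peptides = ["TAYIAK", "ELLK", "QQQWWW"]

print(sorted(confirmed_genes(toy_peptides, toy_proteins)))
print(sorted(unmatched_peptides(toy_peptides, toy_proteins)))
```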
Sex-Specific Biology of the Human Malaria Parasite Revealed from the Proteomes of Mature Male and Female Gametocytes.

PubMed

Miao, Jun; Chen, Zhao; Wang, Zenglei; Shrestha, Sony; Li, Xiaolian; Li, Runze; Cui, Liwang

2017-04-01

The gametocytes of the malaria parasites are obligate for perpetuating the parasite's life cycle through mosquitoes, but the sex-specific biology of gametocytes is poorly understood. We generated a transgenic line in the human malaria parasite Plasmodium falciparum, which allowed us to accurately separate male and female gametocytes by flow cytometry. In-depth analysis of the proteomes by liquid chromatography-tandem mass spectrometry identified 1244 and 1387 proteins in mature male and female gametocytes, respectively. GFP-tagging of nine selected proteins confirmed their sex-partitions to be agreeable with the results from the proteomic analysis. The sex-specific proteomes showed significant differences that are consistent with the divergent functions of the two sexes. Although the male-specific proteome (119 proteins) is enriched in proteins associated with the flagella and genome replication, the female-specific proteome (262 proteins) is more abundant in proteins involved in metabolism, translation and organellar functions. Compared with the Plasmodium berghei sex-specific proteomes, this study revealed both extensive conservation and considerable divergence between these two species, which reflect the disparities between the two species in proteins involved in cytoskeleton, lipid metabolism and protein degradation. Comparison with three sex-specific proteomes allowed us to obtain high-confidence lists of 73 and 89 core male- and female-specific/biased proteins conserved in Plasmodium. The identification of sex-specific/biased proteomes in Plasmodium lays a solid foundation for understanding the molecular mechanisms underlying the unique sex-specific biology in this early-branching eukaryote. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
Sex-Specific Biology of the Human Malaria Parasite Revealed from the Proteomes of Mature Male and Female Gametocytes

PubMed Central

Miao, Jun; Chen, Zhao; Wang, Zenglei; Shrestha, Sony; Li, Xiaolian; Li, Runze; Cui, Liwang

2017-01-01

The gametocytes of the malaria parasites are obligate for perpetuating the parasite's life cycle through mosquitoes, but the sex-specific biology of gametocytes is poorly understood. We generated a transgenic line in the human malaria parasite Plasmodium falciparum, which allowed us to accurately separate male and female gametocytes by flow cytometry. In-depth analysis of the proteomes by liquid chromatography-tandem mass spectrometry identified 1244 and 1387 proteins in mature male and female gametocytes, respectively. GFP-tagging of nine selected proteins confirmed their sex-partitions to be agreeable with the results from the proteomic analysis. The sex-specific proteomes showed significant differences that are consistent with the divergent functions of the two sexes. Although the male-specific proteome (119 proteins) is enriched in proteins associated with the flagella and genome replication, the female-specific proteome (262 proteins) is more abundant in proteins involved in metabolism, translation and organellar functions. Compared with the Plasmodium berghei sex-specific proteomes, this study revealed both extensive conservation and considerable divergence between these two species, which reflect the disparities between the two species in proteins involved in cytoskeleton, lipid metabolism and protein degradation. Comparison with three sex-specific proteomes allowed us to obtain high-confidence lists of 73 and 89 core male- and female-specific/biased proteins conserved in Plasmodium. The identification of sex-specific/biased proteomes in Plasmodium lays a solid foundation for understanding the molecular mechanisms underlying the unique sex-specific biology in this early-branching eukaryote. PMID:28126901
Highlights of recent articles on data mining in genomics & proteomics

USDA-ARS's Scientific Manuscript database

This editorial elaborates on investigations consisting of different “OMICS” technologies and their application to biological sciences. In addition, advantages and recent development of the proteomic, genomic and data mining technologies are discussed. This information will be useful to scientists ...
77 FR 67381 - Government-Owned Inventions; Availability for Licensing

Federal Register 2010, 2011, 2012, 2013, 2014

2012-11-09

.... ``Computational and Experimental RNA Nanoparticle Design,'' in Automation in Genomics and Proteomics: An... and Experimental RNA Nanoparticle Design,'' in Automation in Genomics and Proteomics: An Engineering... Development Stage: Prototype Pre-clinical In vitro data available Inventors: Robert J. Crouch and Yutaka...
Background | Office of Cancer Clinical Proteomics Research

Cancer.gov

The term "proteomics" refers to a large-scale comprehensive study of a specific proteome resulting from its genome, including abundances of proteins, their variations and modifications, and interacting partners and networks in order to understand cellular processes involved. Similarly, “Cancer proteomics” refers to comprehensive analyses of proteins and their derivatives translated from a specific cancer genome using a human biospecimen or a preclinical model (e.g., cultured cell or animal model).
Genome-wide proteomics analysis on longissimus muscles in Qinchuan beef cattle.

PubMed

He, Hua; Chen, Si; Liang, Wei; Liu, Xiaolin

2017-04-01

To gain further insight into the molecular mechanism of bovine muscle development, we combined mass spectrometry characterization of proteins with Illumina deep sequencing of RNAs obtained from bovine longissimus muscle (LD) at prenatal and postnatal stages. For the proteomic study, each group of LD proteins was extracted and labeled using isobaric tags for relative and absolute quantitation (iTRAQ) method. Among the 1321 proteins identified from six samples, 390 proteins were differentially expressed in embryos at day 135 post-fertilization (Emb135d) vs. 30-month-old adult cattle (Emb135d vs. 30M) samples. Gene Ontology, Cluster of Orthologous Groups and Kyoto Encyclopedia of Genes and Genomes analyses were further conducted to better understand the different functions. Furthermore, we analyzed the relationship between transcript and protein regulation between samples by direct comparison of expression levels from transcriptomic and iTRAQ-based proteomics. Association results indicated that 1295 of 1321 proteins could be mapped to transcriptome sequencing data. This study provides the most comprehensive, targeted survey of bovine LD proteins to date and has shown the power of combining transcriptomic and proteomic approaches to provide molecular insights for understanding the developmental characteristics in bovine muscle, and even in other mammals. © 2016 Stichting International Foundation for Animal Genetics.
Global analyses of Ceratocystis cacaofunesta mitochondria: from genome to proteome.

PubMed

Ambrosio, Alinne Batista; do Nascimento, Leandro Costa; Oliveira, Bruno V; Teixeira, Paulo José P L; Tiburcio, Ricardo A; Toledo Thomazella, Daniela P; Leme, Adriana F P; Carazzolle, Marcelo F; Vidal, Ramon O; Mieczkowski, Piotr; Meinhardt, Lyndel W; Pereira, Gonçalo A G; Cabrera, Odalys G

2013-02-11

The ascomycete fungus Ceratocystis cacaofunesta is the causal agent of wilt disease in cacao, which results in significant economic losses in the affected producing areas. Despite the economic importance of the Ceratocystis complex of species, no genomic data are available for any of its members. Given that mitochondria play important roles in fungal virulence and the susceptibility/resistance of fungi to fungicides, we performed the first functional analysis of this organelle in Ceratocystis using integrated "omics" approaches. The C. cacaofunesta mitochondrial genome (mtDNA) consists of a single, 103,147-bp circular molecule, making this the second largest mtDNA among the Sordariomycetes. Bioinformatics analysis revealed the presence of 15 conserved genes and 37 intronic open reading frames in C. cacaofunesta mtDNA. Here, we predicted the mitochondrial proteome (mtProt) of C. cacaofunesta, which is comprised of 1,124 polypeptides - 52 proteins that are mitochondrially encoded and 1,072 that are nuclearly encoded. Transcriptome analysis revealed 33 probable novel genes. Comparisons among the Gene Ontology results of the predicted mtProt of C. cacaofunesta, Neurospora crassa and Saccharomyces cerevisiae revealed no significant differences. Moreover, C. cacaofunesta mitochondria were isolated, and the mtProt was subjected to mass spectrometric analysis. The experimental proteome validated 27% of the predicted mtProt. Our results confirmed the existence of 110 hypothetical proteins and 7 novel proteins of which 83 and 1, respectively, had putative mitochondrial localization. The present study provides the first partial genomic analysis of a species of the Ceratocystis genus and the first predicted mitochondrial protein inventory of a phytopathogenic fungus. 
In addition to the known mitochondrial role in pathogenicity, our results demonstrated that the global function analysis of this organelle is similar in pathogenic and non-pathogenic fungi, suggesting that its relevance in the lifestyle of these organisms should be based on a small number of specific proteins and/or with respect to differential gene regulation. In this regard, particular interest should be directed towards mitochondrial proteins with unknown function and the novel protein that might be specific to this species. Further functional characterization of these proteins could enhance our understanding of the role of mitochondria in phytopathogenicity.
Global analyses of Ceratocystis cacaofunesta mitochondria: from genome to proteome

PubMed Central

2013-01-01

Background: The ascomycete fungus Ceratocystis cacaofunesta is the causal agent of wilt disease in cacao, which results in significant economic losses in the affected producing areas. Despite the economic importance of the Ceratocystis complex of species, no genomic data are available for any of its members. Given that mitochondria play important roles in fungal virulence and the susceptibility/resistance of fungi to fungicides, we performed the first functional analysis of this organelle in Ceratocystis using integrated “omics” approaches. Results: The C. cacaofunesta mitochondrial genome (mtDNA) consists of a single, 103,147-bp circular molecule, making this the second largest mtDNA among the Sordariomycetes. Bioinformatics analysis revealed the presence of 15 conserved genes and 37 intronic open reading frames in C. cacaofunesta mtDNA. Here, we predicted the mitochondrial proteome (mtProt) of C. cacaofunesta, which is comprised of 1,124 polypeptides - 52 proteins that are mitochondrially encoded and 1,072 that are nuclearly encoded. Transcriptome analysis revealed 33 probable novel genes. Comparisons among the Gene Ontology results of the predicted mtProt of C. cacaofunesta, Neurospora crassa and Saccharomyces cerevisiae revealed no significant differences. Moreover, C. cacaofunesta mitochondria were isolated, and the mtProt was subjected to mass spectrometric analysis. The experimental proteome validated 27% of the predicted mtProt. Our results confirmed the existence of 110 hypothetical proteins and 7 novel proteins of which 83 and 1, respectively, had putative mitochondrial localization. Conclusions: The present study provides the first partial genomic analysis of a species of the Ceratocystis genus and the first predicted mitochondrial protein inventory of a phytopathogenic fungus.
In addition to the known mitochondrial role in pathogenicity, our results demonstrated that the global function analysis of this organelle is similar in pathogenic and non-pathogenic fungi, suggesting that its relevance in the lifestyle of these organisms should be based on a small number of specific proteins and/or with respect to differential gene regulation. In this regard, particular interest should be directed towards mitochondrial proteins with unknown function and the novel protein that might be specific to this species. Further functional characterization of these proteins could enhance our understanding of the role of mitochondria in phytopathogenicity. PMID:23394930
GENOMIC AND PROTEOMIC ANALYSIS OF SURROGATE TISSUES FOR ASSESSING TOXIC EXPOSURES AND DISEASE STATES

EPA Science Inventory

Genomic and Proteomic Analysis of Surrogate Tissues for Assessing Toxic Exposures and Disease States
David J. Dix and John C. Rockett
Reproductive Toxicology Division, National Health and Environmental Effects Research Laboratory, Office of Research and Development, USEPA, ...
A 2-D guinea pig lung proteome map

USDA-ARS's Scientific Manuscript database

Guinea pigs represent an important model for a number of infectious and non-infectious pulmonary diseases. The guinea pig genome has recently been sequenced to full coverage, opening up new research avenues using genomics, transcriptomics and proteomics techniques in this species. In order to furth...
N-terminal Proteomics Assisted Profiling of the Unexplored Translation Initiation Landscape in Arabidopsis thaliana

PubMed Central

Ndah, Elvis; Jonckheere, Veronique

2017-01-01

Proteogenomics is an emerging research field yet lacking a uniform method of analysis. Proteogenomic studies in which N-terminal proteomics and ribosome profiling are combined, suggest that a high number of protein start sites are currently missing in genome annotations. We constructed a proteogenomic pipeline specific for the analysis of N-terminal proteomics data, with the aim of discovering novel translational start sites outside annotated protein coding regions. In summary, unidentified MS/MS spectra were matched to a specific N-terminal peptide library encompassing protein N termini encoded in the Arabidopsis thaliana genome. After a stringent false discovery rate filtering, 117 protein N termini compliant with N-terminal methionine excision specificity and indicative of translation initiation were found. These include N-terminal protein extensions and translation from transposable elements and pseudogenes. Gene prediction provided supporting protein-coding models for approximately half of the protein N termini. Besides the prediction of functional domains (partially) contained within the newly predicted ORFs, further supporting evidence of translation was found in the recently released Araport11 genome re-annotation of Arabidopsis and computational translations of sequences stored in public repositories. Most interestingly, complementary evidence by ribosome profiling was found for 23 protein N termini. Finally, by analyzing protein N-terminal peptides, an in silico analysis demonstrates the applicability of our N-terminal proteogenomics strategy in revealing protein-coding potential in species with well- and poorly-annotated genomes. PMID:28432195

N-terminal Proteomics Assisted Profiling of the Unexplored Translation Initiation Landscape in Arabidopsis thaliana.

PubMed

Willems, Patrick; Ndah, Elvis; Jonckheere, Veronique; Stael, Simon; Sticker, Adriaan; Martens, Lennart; Van Breusegem, Frank; Gevaert, Kris; Van Damme, Petra

2017-06-01

Proteogenomics is an emerging research field yet lacking a uniform method of analysis. Proteogenomic studies in which N-terminal proteomics and ribosome profiling are combined, suggest that a high number of protein start sites are currently missing in genome annotations. We constructed a proteogenomic pipeline specific for the analysis of N-terminal proteomics data, with the aim of discovering novel translational start sites outside annotated protein coding regions. In summary, unidentified MS/MS spectra were matched to a specific N-terminal peptide library encompassing protein N termini encoded in the Arabidopsis thaliana genome. After a stringent false discovery rate filtering, 117 protein N termini compliant with N-terminal methionine excision specificity and indicative of translation initiation were found. These include N-terminal protein extensions and translation from transposable elements and pseudogenes. Gene prediction provided supporting protein-coding models for approximately half of the protein N termini. Besides the prediction of functional domains (partially) contained within the newly predicted ORFs, further supporting evidence of translation was found in the recently released Araport11 genome re-annotation of Arabidopsis and computational translations of sequences stored in public repositories. Most interestingly, complementary evidence by ribosome profiling was found for 23 protein N termini. Finally, by analyzing protein N-terminal peptides, an in silico analysis demonstrates the applicability of our N-terminal proteogenomics strategy in revealing protein-coding potential in species with well- and poorly-annotated genomes. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
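The filter for "N-terminal methionine excision specificity" can be illustrated with the commonly cited rule that the initiator methionine is removed when the second residue is small (A, C, G, P, S, T or V). The sketch below encodes that textbook rule. Whether the pipeline described above uses exactly this residue set is an assumption, and the function names and example sequences are hypothetical.

```python
# Residues after which the initiator Met is typically removed (simplified textbook rule;
# assumed here, not taken from the abstract).
SMALL_RESIDUES = set("ACGPSTV")


def mature_n_terminus(protein: str) -> str:
    """Return the sequence after applying the simplified methionine-excision rule."""
    if len(protein) >= 2 and protein[0] == "M" and protein[1] in SMALL_RESIDUES:
        return protein[1:]
    return protein


def is_nme_compliant(peptide: str, preceding_residue: str) -> bool:
    """Check whether a peptide start is compatible with translation initiation.

    Compliant cases under the simplified rule:
    - the peptide starts with a retained Met (second residue absent or not small), or
    - the peptide starts with a small residue directly preceded by Met in the
      translated sequence (Met was excised).
    """
    if not peptide:
        return False
    if peptide[0] == "M":
        return len(peptide) < 2 or peptide[1] not in SMALL_RESIDUES
    return preceding_residue == "M" and peptide[0] in SMALL_RESIDUES


print(mature_n_terminus("MASKT"))       # Met followed by Ala: Met is removed
print(is_nme_compliant("ASKT", "M"))    # small residue preceded by Met
print(is_nme_compliant("KLV", "M"))     # Lys is not small, so Met would be retained
```

A candidate N-terminal peptide that fails this check is more likely to come from proteolytic processing than from translation initiation, which is why such a rule is useful as a filter.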
Integrated Proteogenomic Characterization of Human High-Grade Serous Ovarian Cancer.

PubMed

Zhang, Hui; Liu, Tao; Zhang, Zhen; Payne, Samuel H; Zhang, Bai; McDermott, Jason E; Zhou, Jian-Ying; Petyuk, Vladislav A; Chen, Li; Ray, Debjit; Sun, Shisheng; Yang, Feng; Chen, Lijun; Wang, Jing; Shah, Punit; Cha, Seong Won; Aiyetan, Paul; Woo, Sunghee; Tian, Yuan; Gritsenko, Marina A; Clauss, Therese R; Choi, Caitlin; Monroe, Matthew E; Thomas, Stefani; Nie, Song; Wu, Chaochao; Moore, Ronald J; Yu, Kun-Hsing; Tabb, David L; Fenyö, David; Bafna, Vineet; Wang, Yue; Rodriguez, Henry; Boja, Emily S; Hiltke, Tara; Rivers, Robert C; Sokoll, Lori; Zhu, Heng; Shih, Ie-Ming; Cope, Leslie; Pandey, Akhilesh; Zhang, Bing; Snyder, Michael P; Levine, Douglas A; Smith, Richard D; Chan, Daniel W; Rodland, Karin D

2016-07-28

To provide a detailed analysis of the molecular components and underlying mechanisms associated with ovarian cancer, we performed a comprehensive mass-spectrometry-based proteomic characterization of 174 ovarian tumors previously analyzed by The Cancer Genome Atlas (TCGA), of which 169 were high-grade serous carcinomas (HGSCs). Integrating our proteomic measurements with the genomic data yielded a number of insights into disease, such as how different copy-number alterations influence the proteome, the proteins associated with chromosomal instability, the sets of signaling pathways that diverse genome rearrangements converge on, and the ones most associated with short overall survival. Specific protein acetylations associated with homologous recombination deficiency suggest a potential means for stratifying patients for therapy. In addition to providing a valuable resource, these findings provide a view of how the somatic genome drives the cancer proteome and associations between protein and post-translational modification levels and clinical outcomes in HGSC. Copyright © 2016 Elsevier Inc. All rights reserved.
VESPA: Software to Facilitate Genomic Annotation of Prokaryotic Organisms Through Integration of Proteomic and Transcriptomic Data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Peterson, Elena S.; McCue, Lee Ann; Rutledge, Alexandra C.

2012-04-25

Visual Exploration and Statistics to Promote Annotation (VESPA) is an interactive visual analysis software tool that facilitates the discovery of structural mis-annotations in prokaryotic genomes. VESPA integrates high-throughput peptide-centric proteomics data and oligo-centric or RNA-Seq transcriptomics data into a genomic context. The data may be interrogated via visual analysis across multiple levels of genomic resolution, linked searches, exports and interaction with BLAST to rapidly identify locations of interest within the genome and evaluate potential mis-annotations.
Application of metagenomics technologies for antimicrobial resistance and food safety research and beyond

USDA-ARS's Scientific Manuscript database

Current developments in the field of metagenomics in biological sciences have demonstrated the need and potential usefulness of taxonomical and functional analyses of meta-omics data generated by genomics, transcriptomics, proteomics, and metabolomics. This review will provide a general overview of...
Mass spectrometry-based proteomics: from cancer biology to protein biomarkers, drug targets, and clinical applications.

PubMed

Jimenez, Connie R; Verheul, Henk M W

2014-01-01

Proteomics is optimally suited to bridge the gap between genomic information on the one hand and biologic functions and disease phenotypes on the other, since it studies the expression and/or post-translational modification (especially phosphorylation) of proteins (the major cellular players bringing about cellular functions) at a global level in biologic specimens. Mass spectrometry technology and (bio)informatic tools have matured to the extent that they can provide high-throughput, comprehensive, and quantitative protein inventories of cells, tissues, and biofluids in clinical samples at low level. In this article, we focus on next-generation proteomics employing nanoliquid chromatography coupled to high-resolution tandem mass spectrometry for in-depth (phospho)protein profiling of tumor tissues and (proximal) biofluids, with a focus on studies employing clinical material. In addition, we highlight emerging proteogenomic approaches for the identification of tumor-specific protein variants, and targeted multiplex mass spectrometry strategies for large-scale biomarker validation. Below we provide a discussion of recent progress, some research highlights, and challenges that remain for clinical translation of proteomic discoveries.
Functional Genomics Approaches to Studying Symbioses between Legumes and Nitrogen-Fixing Rhizobia.

PubMed

Lardi, Martina; Pessi, Gabriella

2018-05-18

Biological nitrogen fixation gives legumes a pronounced growth advantage in nitrogen-deprived soils and is of considerable ecological and economic interest. In exchange for reduced atmospheric nitrogen, typically given to the plant in the form of amides or ureides, the legume provides nitrogen-fixing rhizobia with nutrients and highly specialised root structures called nodules. To elucidate the molecular basis underlying physiological adaptations on a genome-wide scale, functional genomics approaches, such as transcriptomics, proteomics, and metabolomics, have been used. This review presents an overview of the different functional genomics approaches that have been performed on rhizobial symbiosis, with a focus on studies investigating the molecular mechanisms used by the bacterial partner to interact with the legume. While rhizobia belonging to the alpha-proteobacterial group (alpha-rhizobia) have been well studied, few studies to date have investigated this process in beta-proteobacteria (beta-rhizobia).
Highly Efficient Proteolysis Accelerated by Electromagnetic Waves for Peptide Mapping

PubMed Central

Chen, Qiwen; Liu, Ting; Chen, Gang

2011-01-01

Proteomics will contribute greatly to the understanding of gene functions in the post-genomic era. In proteome research, protein digestion is a key procedure prior to mass spectrometry identification. During the past decade, a variety of electromagnetic waves have been employed to accelerate proteolysis. This review focuses on the recent advances and the key strategies of these novel proteolysis approaches for digesting and identifying proteins. The subjects covered include microwave-accelerated protein digestion, infrared-assisted proteolysis, ultraviolet-enhanced protein digestion, laser-assisted proteolysis, and future prospects. It is expected that these novel proteolysis strategies accelerated by various electromagnetic waves will become powerful tools in proteome research and will find wide applications in high throughput protein digestion and identification. PMID:22379392
[Advances in the study of the nucleolus].

PubMed

Feng, Jin-Mei; Sun, Jun; Wen, Jian-Fan

2012-12-01

As the most prominent sub-nuclear compartment in the interphase nucleus and the site of ribosome biogenesis, the nucleolus synthesizes and processes rRNA and also assembles ribosomal subunits. Though several lines of research in recent years have indicated that the nucleolus might have additional functions (such as the assembly of signal recognition particles, the processing of mRNA, tRNA and telomerase activities, and the regulation of the cell cycle), proteomic analyses of the nucleolus in three representative eukaryotic species have shown that a plethora of proteins either have no association with ribosome biogenesis or are of presently unknown function. This phenomenon further indicates that the composition and function of the nucleolus are far more complicated than previously thought. Meanwhile, the available nucleolar proteome databases have provided new approaches and led to remarkable progress in understanding the nucleolus. Here, we have summarized recent advances in the study of the nucleolus, including new discoveries of its structure, function, genomics/proteomics as well as its origin and evolution. Moreover, we highlight several of the important unresolved issues in this field.
LIFEdb: a database for functional genomics experiments integrating information from external sources, and serving as a sample tracking system

PubMed Central

Bannasch, Detlev; Mehrle, Alexander; Glatting, Karl-Heinz; Pepperkok, Rainer; Poustka, Annemarie; Wiemann, Stefan

2004-01-01

We have implemented LIFEdb (http://www.dkfz.de/LIFEdb) to link information regarding novel human full-length cDNAs generated and sequenced by the German cDNA Consortium with functional information on the encoded proteins produced in functional genomics and proteomics approaches. The database also serves as a sample-tracking system to manage the process from cDNA to experimental read-out and data interpretation. A web interface enables the scientific community to explore and visualize features of the annotated cDNAs and ORFs combined with experimental results, and thus helps to unravel new features of proteins with as yet unknown functions. PMID:14681468
Functional insights from proteome-wide structural modeling of Treponema pallidum subspecies pallidum, the causative agent of syphilis.

PubMed

Houston, Simon; Lithgow, Karen Vivien; Osbak, Kara Krista; Kenyon, Chris Richard; Cameron, Caroline E

2018-05-16

Syphilis continues to be a major global health threat with 11 million new infections each year, and a global burden of 36 million cases. The causative agent of syphilis, Treponema pallidum subspecies pallidum, is a highly virulent bacterium; however, the molecular mechanisms underlying T. pallidum pathogenesis remain to be definitively identified. This is due to the fact that T. pallidum is currently uncultivatable, inherently fragile and thus difficult to work with, and phylogenetically distinct with no conventional virulence factor homologs found in other pathogens. In fact, approximately 30% of its predicted protein-coding genes have no known orthologs or assigned functions. Here we employed a structural bioinformatics approach using Phyre2-based tertiary structure modeling to improve our understanding of T. pallidum protein function on a proteome-wide scale. Phyre2-based tertiary structure modeling generated high-confidence predictions for 80% of the T. pallidum proteome (780/978 predicted proteins). Tertiary structure modeling also inferred the same function as primary structure-based annotations from genome sequencing pipelines for 525/605 proteins (87%), which represents 54% (525/978) of all T. pallidum proteins. Of the 175 T. pallidum proteins modeled with high confidence that were not assigned functions in the previously annotated published proteome, 167 (95%) were able to be assigned predicted functions. Twenty-one of the 175 hypothetical proteins modeled with high confidence were also predicted to exhibit significant structural similarity with proteins experimentally confirmed to be required for virulence in other pathogens. Phyre2-based structural modeling is a powerful bioinformatics tool that has provided insight into the potential structure and function of the majority of T. pallidum proteins and helped validate the primary structure-based annotation of more than 50% of all T. pallidum proteins with high confidence. This work represents the first T. pallidum proteome-wide structural modeling study and is one of few studies to apply this approach for the functional annotation of a whole proteome.
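The percentages quoted in this abstract follow directly from the reported counts, and a short check makes the different denominators explicit. The snippet below is only an arithmetic illustration. The helper name `pct` and the dictionary labels are introduced here and do not come from the study.

```python
def pct(part: int, whole: int) -> int:
    """Percentage of `part` in `whole`, rounded to the nearest integer."""
    return round(100 * part / whole)


# Counts as reported in the abstract; labels are descriptive names chosen for this sketch.
reported = {
    "high-confidence models / predicted proteome": (780, 978),
    "function agrees with sequence annotation / compared proteins": (525, 605),
    "function agrees with sequence annotation / predicted proteome": (525, 978),
    "newly assigned function / unannotated high-confidence models": (167, 175),
}

for label, (part, whole) in reported.items():
    print(f"{label}: {part}/{whole} = {pct(part, whole)}%")
```

Note that 525 appears with two denominators (605 and 978), which is why the same count is reported as both 87% and 54%.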
A proteome-scale map of the human interactome network

PubMed Central

Rolland, Thomas; Taşan, Murat; Charloteaux, Benoit; Pevzner, Samuel J.; Zhong, Quan; Sahni, Nidhi; Yi, Song; Lemmens, Irma; Fontanillo, Celia; Mosca, Roberto; Kamburov, Atanas; Ghiassian, Susan D.; Yang, Xinping; Ghamsari, Lila; Balcha, Dawit; Begg, Bridget E.; Braun, Pascal; Brehme, Marc; Broly, Martin P.; Carvunis, Anne-Ruxandra; Convery-Zupan, Dan; Corominas, Roser; Coulombe-Huntington, Jasmin; Dann, Elizabeth; Dreze, Matija; Dricot, Amélie; Fan, Changyu; Franzosa, Eric; Gebreab, Fana; Gutierrez, Bryan J.; Hardy, Madeleine F.; Jin, Mike; Kang, Shuli; Kiros, Ruth; Lin, Guan Ning; Luck, Katja; MacWilliams, Andrew; Menche, Jörg; Murray, Ryan R.; Palagi, Alexandre; Poulin, Matthew M.; Rambout, Xavier; Rasla, John; Reichert, Patrick; Romero, Viviana; Ruyssinck, Elien; Sahalie, Julie M.; Scholz, Annemarie; Shah, Akash A.; Sharma, Amitabh; Shen, Yun; Spirohn, Kerstin; Tam, Stanley; Tejeda, Alexander O.; Trigg, Shelly A.; Twizere, Jean-Claude; Vega, Kerwin; Walsh, Jennifer; Cusick, Michael E.; Xia, Yu; Barabási, Albert-László; Iakoucheva, Lilia M.; Aloy, Patrick; De Las Rivas, Javier; Tavernier, Jan; Calderwood, Michael A.; Hill, David E.; Hao, Tong; Roth, Frederick P.; Vidal, Marc

2014-01-01

SUMMARY Just as reference genome sequences revolutionized human genetics, reference maps of interactome networks will be critical to fully understand genotype-phenotype relationships. Here, we describe a systematic map of ~14,000 high-quality human binary protein-protein interactions. At equal quality, this map is ~30% larger than what is available from small-scale studies published in the literature in the last few decades. While currently available information is highly biased and only covers a relatively small portion of the proteome, our systematic map appears strikingly more homogeneous, revealing a “broader” human interactome network than currently appreciated. The map also uncovers significant inter-connectivity between known and candidate cancer gene products, providing unbiased evidence for an expanded functional cancer landscape, while demonstrating how high quality interactome models will help “connect the dots” of the genomic revolution. PMID:25416956
Genomes, Proteomes and the Central Dogma

PubMed Central

Franklin, Sarah; Vondriska, Thomas M.

2011-01-01

Systems biology, with its associated technologies of proteomics, genomics and metabolomics, is driving the evolution of our understanding of cardiovascular physiology. Rather than studying individual molecules or even single reactions, a systems approach allows integration of orthogonal datasets from distinct tiers of biological data, including gene, RNA, protein, metabolite and other component networks. Together these networks give rise to emergent properties of cellular function and it is their reprogramming that causes disease. We present five observations regarding how systems biology is guiding a revisiting of the central dogma: (i) de-emphasizing the unidirectional flow of information from genes to proteins; (ii) revealing the role of modules of molecules as opposed to individual proteins acting in isolation; (iii) enabling discovery of novel emergent properties; (iv) demonstrating the importance of networks in biology; and (v) adding new dimensionality to the study of biological systems. PMID:22010165
Differential proteome analysis of diabetes mellitus type 2 and its pathophysiological complications.

PubMed

Sohail, Waleed; Majeed, Fatimah; Afroz, Amber

2018-06-11

The prevalence of Diabetes Mellitus Type 2 (DM 2) is increasing every year owing to global changes in lifestyle. The exact mechanisms underlying the progression of this disease are not yet known. However, recent advances in combined omics, particularly proteomics and genomics, have opened a gateway towards understanding the predetermined genetic factors, progression, complications and treatment of this disease. Here we review the recent advances in proteomics that have led to earlier and better diagnostic approaches for controlling DM 2, most importantly the comparison of structural and functional protein biomarkers that are modified in the diseased state. By applying these advanced and promising proteomic strategies together with bioinformatics applications and biostatistical tools, the prevalence of DM 2 and its associated disorders, i.e. nephropathy and retinopathy, is expected to be controlled. Copyright © 2018 Diabetes India. Published by Elsevier Ltd. All rights reserved.
Linkage of exposure and effects using genomics, proteomics and metabolomics in small fish models (presentation)

EPA Science Inventory

This research project combines the use of whole organism endpoints, genomic, proteomic and metabolomic approaches, and computational modeling in a systems biology approach to 1) identify molecular indicators of exposure and biomarkers of effect to EDCs representing several modes/...
GAPP: A Proteogenomic Software for Genome Annotation and Global Profiling of Post-translational Modifications in Prokaryotes.

PubMed

Zhang, Jia; Yang, Ming-Kun; Zeng, Honghui; Ge, Feng

2016-11-01

Although the number of sequenced prokaryotic genomes is growing rapidly, experimentally verified annotation of prokaryotic genome remains patchy and challenging. To facilitate genome annotation efforts for prokaryotes, we developed an open source software called GAPP for genome annotation and global profiling of post-translational modifications (PTMs) in prokaryotes. With a single command, it provides a standard workflow to validate and refine predicted genetic models and discover diverse PTM events. We demonstrated the utility of GAPP using proteomic data from Helicobacter pylori, one of the major human pathogens that is responsible for many gastric diseases. Our results confirmed 84.9% of the existing predicted H. pylori proteins, identified 20 novel protein coding genes, and corrected four existing gene models with regard to translation initiation sites. In particular, GAPP revealed a large repertoire of PTMs using the same proteomic data and provided a rich resource that can be used to examine the functions of reversible modifications in this human pathogen. This software is a powerful tool for genome annotation and global discovery of PTMs and is applicable to any sequenced prokaryotic organism; we expect that it will become an integral part of ongoing genome annotation efforts for prokaryotes. GAPP is freely available at https://sourceforge.net/projects/gappproteogenomic/. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Yang, Laurence; Yurkovich, James T.; Lloyd, Colton J.

Integrating omics data to refine or make context-specific models is an active field of constraint-based modeling. Proteomics now cover over 95% of the Escherichia coli proteome by mass. Genome-scale models of Metabolism and macromolecular Expression (ME) compute proteome allocation linked to metabolism and fitness. Using proteomics data, we formulated allocation constraints for key proteome sectors in the ME model. The resulting calibrated model effectively computed the “generalist” (wild-type) E. coli proteome and phenotype across diverse growth environments. Across 15 growth conditions, prediction errors for growth rate and metabolic fluxes were 69% and 14% lower, respectively. The sector-constrained ME model thus represents a generalist ME model reflecting both growth rate maximization and “hedging” against uncertain environments and stresses, as indicated by significant enrichment of these sectors for the general stress response sigma factor σS. Finally, the sector constraints represent a general formalism for integrating omics data from any experimental condition into constraint-based ME models. The constraints can be fine-grained (individual proteins) or coarse-grained (functionally-related protein groups) as demonstrated here. Furthermore, this flexible formalism provides an accessible approach for narrowing the gap between the complexity captured by omics data and governing principles of proteome allocation described by systems-level models.
Proteomic analysis of isolated chlamydomonas centrioles reveals orthologs of ciliary-disease genes.

PubMed

Keller, Lani C; Romijn, Edwin P; Zamora, Ivan; Yates, John R; Marshall, Wallace F

2005-06-21

The centriole is one of the most enigmatic organelles in the cell. Centrioles are cylindrical, microtubule-based barrels found in the core of the centrosome. Centrioles also act as basal bodies during interphase to nucleate the assembly of cilia and flagella. There are currently only a handful of known centriole proteins. We used mass-spectrometry-based MudPIT (multidimensional protein identification technology) to identify the protein composition of basal bodies (centrioles) isolated from the green alga Chlamydomonas reinhardtii. This analysis detected the majority of known centriole proteins, including centrin, epsilon tubulin, and the cartwheel protein BLD10p. By combining proteomic data with information about gene expression and comparative genomics, we identified 45 cross-validated centriole candidate proteins in two classes. Members of the first class of proteins (BUG1-BUG27) are encoded by genes whose expression correlates with flagellar assembly and which therefore may play a role in ciliogenesis-related functions of basal bodies. Members of the second class (POC1-POC18) are implicated by comparative-genomics and -proteomics studies to be conserved components of the centriole. We confirmed centriolar localization for the human homologs of four candidate proteins. Three of the cross-validated centriole candidate proteins are encoded by orthologs of genes (OFD1, NPHP-4, and PACRG) implicated in mammalian ciliary function and disease, suggesting that oral-facial-digital syndrome and nephronophthisis may involve a dysfunction of centrioles and/or basal bodies. By analyzing isolated Chlamydomonas basal bodies, we have been able to obtain the first reported proteomic analysis of the centriole.
Effects of Space Environment on Genome, Transcriptome, and Proteome of Klebsiella pneumoniae.

PubMed

Guo, Yinghua; Li, Jia; Liu, Jinwen; Wang, Tong; Li, Yinhu; Yuan, Yanting; Zhao, Jiao; Chang, De; Fang, Xiangqun; Li, Tianzhi; Wang, Junfeng; Dai, Wenkui; Fang, Chengxiang; Liu, Changting

2015-11-01

The aim of this study was to explore the effects of space flight on Klebsiella pneumoniae. A strain of K. pneumoniae was sent to space for 398 h aboard the ShenZhou VIII spacecraft during November 1, 2011-November 17, 2011. At the same time, a ground simulation with similar temperature conditions during the space flight was performed as a control. After the space mission, the flight and control strains were analyzed using phenotypic, genomic, transcriptomic and proteomic techniques. The flight strain LCT-KP289 exhibited a higher cotrimoxazole resistance level and changes in metabolism relative to the ground control strain LCT-KP214. After the space flight, 73 SNPs and a plasmid copy number variation were identified in the flight strain. Based on the transcriptomic analysis, there were 232 upregulated and 1879 downregulated genes, almost all of which were involved in metabolism. Proteomic analysis revealed that there were 57 upregulated and 125 downregulated proteins. These differentially expressed proteins had several functions that included energy production and conversion, carbohydrate transport and metabolism, translation, ribosomal structure and biogenesis, posttranslational modification, protein turnover, and chaperone functions. At a systems biology level, the ytfG gene had a synonymous mutation that resulted in significantly downregulated expression at both transcriptomic and proteomic levels. The mutation of the ytfG gene may influence fructose and mannose metabolic processes of K. pneumoniae during space flight, which may be beneficial to the field of space microbiology, providing potential therapeutic strategies to combat or prevent infection in astronauts. Copyright © 2015 IMSS. Published by Elsevier Inc. All rights reserved.
Exploiting proteomic data for genome annotation and gene model validation in Aspergillus niger

PubMed Central

Wright, James C; Sugden, Deana; Francis-McIntyre, Sue; Riba-Garcia, Isabel; Gaskell, Simon J; Grigoriev, Igor V; Baker, Scott E; Beynon, Robert J; Hubbard, Simon J

2009-01-01

Background Proteomic data is a potentially rich, but arguably unexploited, data source for genome annotation. Peptide identifications from tandem mass spectrometry provide prima facie evidence for gene predictions and can discriminate over a set of candidate gene models. Here we apply this to the recently sequenced Aspergillus niger fungal genome from the Joint Genome Institutes (JGI) and another predicted protein set from another A.niger sequence. Tandem mass spectra (MS/MS) were acquired from 1d gel electrophoresis bands and searched against all available gene models using Average Peptide Scoring (APS) and reverse database searching to produce confident identifications at an acceptable false discovery rate (FDR). Results 405 identified peptide sequences were mapped to 214 different A.niger genomic loci to which 4093 predicted gene models clustered, 2872 of which contained the mapped peptides. Interestingly, 13 (6%) of these loci either had no preferred predicted gene model or the genome annotators' chosen "best" model for that genomic locus was not found to be the most parsimonious match to the identified peptides. The peptides identified also boosted confidence in predicted gene structures spanning 54 introns from different gene models. Conclusion This work highlights the potential of integrating experimental proteomics data into genomic annotation pipelines much as expressed sequence tag (EST) data has been. A comparison of the published genome from another strain of A.niger sequenced by DSM showed that a number of the gene models or proteins with proteomics evidence did not occur in both genomes, further highlighting the utility of the method. PMID:19193216
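The reverse (decoy) database searching mentioned in this abstract is commonly used to estimate a false discovery rate. The following is a minimal sketch of that general idea, not the authors' Average Peptide Scoring pipeline: it assumes each peptide-spectrum match is a hypothetical `(score, is_decoy)` pair and estimates the FDR above a score threshold as decoy hits divided by target hits. The function name, the toy scores, and the simple ratio estimator are all illustrative assumptions.

```python
from typing import List, Tuple

def decoy_fdr(psms: List[Tuple[float, bool]], threshold: float) -> float:
    """Estimate the FDR among target hits scoring at or above `threshold`.

    Each PSM is a (score, is_decoy) pair. The estimate is the number of
    decoy (reversed-database) hits divided by the number of target hits.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0  # no accepted target hits, so nothing to be wrong about
    return decoys / targets

# Toy data (hypothetical scores, for illustration only).
psms = [
    (9.1, False), (8.7, False), (8.2, False),
    (7.9, True), (7.5, False), (6.0, True), (5.5, False),
]

for cutoff in (8.0, 7.0, 5.0):
    print(cutoff, decoy_fdr(psms, cutoff))
```

Lowering the cutoff admits more identifications but also more decoy hits, which is the trade-off behind choosing "an acceptable false discovery rate".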
HTAPP: High-Throughput Autonomous Proteomic Pipeline

PubMed Central

Yu, Kebing; Salomon, Arthur R.

2011-01-01

Recent advances in the speed and sensitivity of mass spectrometers and in analytical methods, the exponential acceleration of computer processing speeds, and the availability of genomic databases from an array of species and protein information databases have led to a deluge of proteomic data. The development of a lab-based automated proteomic software platform for the automated collection, processing, storage, and visualization of expansive proteomic datasets is critically important. The high-throughput autonomous proteomic pipeline (HTAPP) described here is designed from the ground up to provide critically important flexibility for diverse proteomic workflows and to streamline the total analysis of a complex proteomic sample. This tool is comprised of software that controls the acquisition of mass spectral data along with automation of post-acquisition tasks such as peptide quantification, clustered MS/MS spectral database searching, statistical validation, and data exploration within a user-configurable lab-based relational database. The software design of HTAPP focuses on accommodating diverse workflows and providing missing software functionality to a wide range of proteomic researchers to accelerate the extraction of biological meaning from immense proteomic data sets. Although individual software modules in our integrated technology platform may have some similarities to existing tools, the true novelty of the approach described here is in the synergistic and flexible combination of these tools to provide an integrated and efficient analysis of proteomic samples. PMID:20336676

GENOMIC AND PROTEOMIC BASIS FOR INTERSPECIES EXTRAPOLATIONS BASED UPON ESTROGEN AND ANDROGEN RECEPTOR STRUCTURE AND FUNCTION AMONG ANIMALS

EPA Science Inventory

Most in vitro hazard assessments for the screening and identification of endocrine disrupting compounds (EDCs), including those outlined in the EDSP Tier 1 Screening (T1S) protocols, use mammalian steroid hormone receptors. There is uncertainty, however, concerning differences th...
Genome Microscale Heterogeneity among Wild Potatoes Revealed by Diversity Arrays Technology Marker Sequences.

PubMed

Traini, Alessandra; Iorizzo, Massimo; Mann, Harpartap; Bradeen, James M; Carputo, Domenico; Frusciante, Luigi; Chiusano, Maria Luisa

2013-01-01

Tuber-bearing potato species possess several genes that can be exploited to improve the genetic background of the cultivated potato Solanum tuberosum. Among them, S. bulbocastanum and S. commersonii are well known for their strong resistance to environmental stresses. However, scant information is available for these species in terms of genome organization, gene function, and regulatory networks. Consequently, genomic tools to assist breeding are meager, and efficient exploitation of these species has been limited so far. In this paper, we employed the reference genome sequences from cultivated potato and tomato and a collection of sequences of 1,423 potato Diversity Arrays Technology (DArT) markers that show polymorphic representation across the genomes of S. bulbocastanum and/or S. commersonii genotypes. Our results highlighted microscale genome sequence heterogeneity that may play a significant role in functional and structural divergence between related species. Our analytical approach provides knowledge of genome structural and sequence variability that could not be detected by transcriptome and proteome approaches.
The porcine translational research database: A manually curated, genomics and proteomics-based research resource

USDA-ARS's Scientific Manuscript database

The use of swine in biomedical research has increased dramatically in the last decade. Diverse genomic- and proteomic databases have been developed to facilitate research using human and rodent models. Current porcine gene databases, however, lack the robust annotation to study pig models that are...
USING GENOMICS AND PROTEOMICS TO DIAGNOSE EXPOSURE OF AQUATIC ORGANISMS TO ENVIRONMENTAL CONTAMINANTS

EPA Science Inventory

Advances in molecular biology allow the use of cutting-edge genomic and proteomic tools to assess the effects of environmental contaminants on aquatic organisms. Techniques are available to measure changes in expression of single genes (quantitative real-time PCR) or to measure g...
[The human variome project and its progress].

PubMed

Gao, Shan; Zhang, Ning; Zhang, Lei; Duan, Guang-You; Zhang, Tao

2010-11-01

The main goal of post-genomics is to explain how the genome, the map of which was constructed in the Human Genome Project, affects the activities of life. This has given rise to multiple "omics": structural genomics, functional genomics, proteomics, metabonomics, etc. In June 2006 in Melbourne, Australia, the Human Genome Variation Society (HGVS) initiated the Human Variome Project (HVP) to collect all sequence variation and polymorphism data worldwide. The HVP aims to search for and determine the mutations related to human diseases through genome-scale association studies between genotype and phenotype and other methods. The results will be translated into clinical application. Considering the potential effects of this project on human health, this paper introduces its origin and main content in detail and discusses its significance and prospects.
A predicted physicochemically distinct sub-proteome associated with the intracellular organelle of the anammox bacterium Kuenenia stuttgartiensis.

PubMed

Medema, Marnix H; Zhou, Miaomiao; van Hijum, Sacha A F T; Gloerich, Jolein; Wessels, Hans J C T; Siezen, Roland J; Strous, Marc

2010-05-12

Anaerobic ammonium-oxidizing (anammox) bacteria perform a key step in global nitrogen cycling. These bacteria make use of an organelle to oxidize ammonia anaerobically to nitrogen (N2) and so contribute approximately 50% of the nitrogen in the atmosphere. It is currently unknown which proteins constitute the organellar proteome and how anammox bacteria are able to specifically target organellar and cell-envelope proteins to their correct final destinations. Experimental approaches are complicated by the absence of pure cultures and genetic accessibility. However, the genome of the anammox bacterium Candidatus "Kuenenia stuttgartiensis" has recently been sequenced. Here, we make use of these genome data to predict the organellar sub-proteome and address the molecular basis of protein sorting in anammox bacteria. Two training sets representing organellar (30 proteins) and cell envelope (59 proteins) proteins were constructed based on previous experimental evidence and comparative genomics. Random forest (RF) classifiers trained on these two sets could differentiate between organellar and cell envelope proteins with ~89% accuracy using 400 features consisting of frequencies of two adjacent amino acid combinations. A physicochemically distinct organellar sub-proteome containing 562 proteins was predicted with the best RF classifier. This set included almost all catabolic and respiratory factors encoded in the genome. Apparently, the cytoplasmic membrane performs no catabolic functions. We predict that the Tat-translocation system is located exclusively in the organellar membrane, whereas the Sec-translocation system is located on both the organellar and cytoplasmic membranes. Canonical signal peptides were predicted and validated experimentally, but a specific (N- or C-terminal) signal that could be used for protein targeting to the organelle remained elusive. A physicochemically distinct organellar sub-proteome was predicted from the genome of the anammox bacterium K. stuttgartiensis. This result provides strong in silico support for the existing experimental evidence for the existence of an organelle in this bacterium, and is an important step forward in unravelling a geochemically relevant case of cytoplasmic differentiation in bacteria. The predicted dual location of the Sec-translocation system and the apparent absence of a specific N- or C-terminal signal in the organellar proteins suggests that additional chaperones may be necessary that act on an as-yet unknown property of the targeted proteins.
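The "400 features consisting of frequencies of two adjacent amino acid combinations" in this abstract correspond to the 20 x 20 possible dipeptides. The sketch below shows one plausible way to compute such a feature vector. The function name, the normalization by the number of valid pairs, and the handling of non-standard residues are assumptions, since the abstract does not state them.

```python
from itertools import product
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]  # 400 pairs
INDEX = {dipeptide: i for i, dipeptide in enumerate(DIPEPTIDES)}

def dipeptide_frequencies(sequence: str) -> List[float]:
    """Return a 400-element vector of adjacent-residue pair frequencies.

    Pairs containing a non-standard residue (e.g. 'X') are skipped, and
    counts are normalized by the number of valid pairs (an assumption).
    """
    counts = [0] * len(DIPEPTIDES)
    seq = sequence.upper()
    total = 0
    for first, second in zip(seq, seq[1:]):
        idx = INDEX.get(first + second)
        if idx is not None:
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [count / total for count in counts]

features = dipeptide_frequencies("MKKLA")  # pairs: MK, KK, KL, LA
print(len(features), features[INDEX["KK"]])
```

Vectors like this, one per protein, could then be passed with organellar and cell-envelope labels to any random-forest implementation, for example scikit-learn's `RandomForestClassifier`.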
Why proteomics is not the new genomics and the future of mass spectrometry in cell biology.

PubMed

Sidoli, Simone; Kulej, Katarzyna; Garcia, Benjamin A

2017-01-02

Mass spectrometry (MS) is an essential part of the cell biologist's proteomics toolkit, allowing analyses at molecular and system-wide scales. However, proteomics still lags behind genomics in popularity and ease of use. We discuss key differences between MS-based -omics and other booming -omics technologies and highlight what we view as the future of MS and its role in our increasingly deep understanding of cell biology. © 2017 Sidoli et al.
Quantitative Clinical Chemistry Proteomics (qCCP) using mass spectrometry: general characteristics and application.

PubMed

Lehmann, Sylvain; Hoofnagle, Andrew; Hochstrasser, Denis; Brede, Cato; Glueckmann, Matthias; Cocho, José A; Ceglarek, Uta; Lenz, Christof; Vialaret, Jérôme; Scherl, Alexander; Hirtz, Christophe

2013-05-01

Proteomics studies typically aim to exhaustively detect peptides/proteins in a given biological sample. Over the past decade, the number of publications using proteomics methodologies has exploded. This was made possible due to the availability of high-quality genomic data and many technological advances in the fields of microfluidics and mass spectrometry. Proteomics in biomedical research was initially used in 'functional' studies for the identification of proteins involved in pathophysiological processes, complexes and networks. Improved sensitivity of instrumentation facilitated the analysis of even more complex sample types, including human biological fluids. It is at that point the field of clinical proteomics was born, and its fundamental aim was the discovery and (ideally) validation of biomarkers for the diagnosis, prognosis, or therapeutic monitoring of disease. Eventually, it was recognized that the technologies used in clinical proteomics studies [particularly liquid chromatography-tandem mass spectrometry (LC-MS/MS)] could represent an alternative to classical immunochemical assays. Prior to deploying MS in the measurement of peptides/proteins in the clinical laboratory, it seems likely that traditional proteomics workflows and data management systems will need to adapt to the clinical environment and meet in vitro diagnostic (IVD) regulatory constraints. This defines a new field, as reviewed in this article, that we have termed quantitative Clinical Chemistry Proteomics (qCCP).
Resources for Functional Genomics Studies in Drosophila melanogaster

PubMed Central

Mohr, Stephanie E.; Hu, Yanhui; Kim, Kevin; Housden, Benjamin E.; Perrimon, Norbert

2014-01-01

Drosophila melanogaster has become a system of choice for functional genomic studies. Many resources, including online databases and software tools, are now available to support design or identification of relevant fly stocks and reagents or analysis and mining of existing functional genomic, transcriptomic, proteomic, etc. datasets. These include large community collections of fly stocks and plasmid clones, “meta” information sites like FlyBase and FlyMine, and an increasing number of more specialized reagents, databases, and online tools. Here, we introduce key resources useful to plan large-scale functional genomics studies in Drosophila and to analyze, integrate, and mine the results of those studies in ways that facilitate identification of highest-confidence results and generation of new hypotheses. We also discuss ways in which existing resources can be used and might be improved and suggest a few areas of future development that would further support large- and small-scale studies in Drosophila and facilitate use of Drosophila information by the research community more generally. PMID:24653003
Genome sequence diversity and clues to the evolution of variola (smallpox) virus.

PubMed

Esposito, Joseph J; Sammons, Scott A; Frace, A Michael; Osborne, John D; Olsen-Rasmussen, Melissa; Zhang, Ming; Govil, Dhwani; Damon, Inger K; Kline, Richard; Laker, Miriam; Li, Yu; Smith, Geoffrey L; Meyer, Hermann; Leduc, James W; Wohlhueter, Robert M

2006-08-11

Comparative genomics of 45 epidemiologically varied variola virus isolates from the past 30 years of the smallpox era indicate low sequence diversity, suggesting that there is probably little difference in the isolates' functional gene content. Phylogenetic clustering inferred three clades coincident with their geographical origin and case-fatality rate; the latter implicated putative proteins that mediate viral virulence differences. Analysis of the viral linear DNA genome suggests that its evolution involved direct descent and DNA end-region recombination events. Knowing the sequences will help understand the viral proteome and improve diagnostic test precision, therapeutics, and systems for their assessment.
CPTAC | Office of Cancer Clinical Proteomics Research

Cancer.gov

The National Cancer Institute’s Clinical Proteomic Tumor Analysis Consortium (CPTAC) is a national effort to accelerate the understanding of the molecular basis of cancer through the application of large-scale proteome and genome analysis, or proteogenomics.
Comparative proteomic analysis reveals that T3SS, Tfp, and xanthan gum are key factors in initial stages of Citrus sinensis infection by Xanthomonas citri subsp. citri.

PubMed

Facincani, Agda P; Moreira, Leandro M; Soares, Márcia R; Ferreira, Cristiano B; Ferreira, Rafael M; Ferro, Maria I T; Ferro, Jesus A; Gozzo, Fabio C; de Oliveira, Julio C F

2014-03-01

The bacterium Xanthomonas citri subsp. citri (Xac) is the causal agent of citrus canker. The disease symptoms are characterized by localized host cell hyperplasia followed by tissue necrosis at the infected area. An arsenal of bacterial pathogenicity- and virulence-related proteins is expressed to ensure a successful infection process. At the post-genomic stage of Xac, we used a proteomic approach to analyze the proteins that are displayed differentially over time when the pathogen attacks the host plant. Protein extracts were prepared from infectious Xac grown in inducing medium (XAM1) for 24 h or from host citrus plants for 3 or 5 days after infection, detached times to evaluate the adaptation and virulence of the pathogen. The protein extracts were proteolyzed, and the peptides derived from tryptic digestion were investigated using liquid chromatography and tandem mass spectrometry. Changes in the protein expression profile were compared with the Xac genome and the proteome recently described under non-infectious conditions. An analysis of the proteome of Xac under infectious conditions revealed proteins directly involved in virulence such as the type III secretion system (T3SS) and effector proteins (T3SS-e), the type IV pilus (Tfp), and xanthan gum biosynthesis. Moreover, four new mutants related to proteins detected in the proteome and with different functions exhibited reduced virulence relative to the wild-type proteins. The results of the proteome analysis of infectious Xac define the processes of adaptation to the host and demonstrate the induction of the virulence factors of Xac involved in plant-pathogen interactions.
Search for sarcoidosis candidate genes by integration of data from genomic, transcriptomic and proteomic studies.

PubMed

Maver, Ales; Medica, Igor; Peterlin, Borut

2009-12-01

The search for gene candidates in multifactorial diseases such as sarcoidosis can be based on the integration of linkage association data, gene expression data, and protein profile data from genomic, transcriptomic and proteomic studies, respectively. In this study we performed a literature-based search for studies reporting such data, followed by integration of collected information. Different databases were examined: Medline, HuGE Navigator, ArrayExpress and Gene Expression Omnibus (GEO). Candidate genes were defined as genes which were reported in at least 2 different types of omics studies. Genes previously investigated in sarcoidosis were excluded from further analyses. We identified 177 genes associated with sarcoidosis as potential new candidate genes. Subsequently, 9 gene candidates identified to overlap in 2 different types of studies (genomic, transcriptomic and/or proteomic) were consistently reported in at least 3 studies: SERPINB1, FABP4, S100A8, HBEGF, IL7R, LRIG1, PTPN23, DPM2 and NUP214. These genes are involved in regulation of immune response, cellular proliferation, apoptosis, inhibition of protease activity, lipid metabolism. Exact biological functions of HBEGF, LRIG1, PTPN23, DPM2 and NUP214 remain to be completely elucidated. We propose 9 candidate genes: SERPINB1, FABP4, S100A8, HBEGF, IL7R, LRIG1, PTPN23, DPM2 and NUP214, as genes with high potential for association with sarcoidosis.
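The selection rule stated in this abstract, that a candidate gene must be reported in at least 2 different types of omics studies, can be illustrated with a short sketch. The function name, the `(gene, study_type)` record layout, and the placeholder gene labels are hypothetical; the sketch shows only the counting rule, not the authors' literature search or exclusion steps.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

def candidate_genes(reports: Iterable[Tuple[str, str]], min_types: int = 2) -> Set[str]:
    """Return genes reported in at least `min_types` distinct study types.

    Each report is a (gene, study_type) pair, where study_type is e.g.
    'genomic', 'transcriptomic' or 'proteomic'. Repeated reports of the
    same type count only once.
    """
    types_by_gene: Dict[str, Set[str]] = defaultdict(set)
    for gene, study_type in reports:
        types_by_gene[gene].add(study_type)
    return {gene for gene, types in types_by_gene.items() if len(types) >= min_types}

# Placeholder records (hypothetical gene labels, for illustration only).
reports = [
    ("GENE_A", "genomic"), ("GENE_A", "transcriptomic"),
    ("GENE_B", "proteomic"), ("GENE_B", "proteomic"),
    ("GENE_C", "transcriptomic"), ("GENE_C", "proteomic"), ("GENE_C", "genomic"),
]

print(sorted(candidate_genes(reports)))
```

Here `GENE_B` is excluded because both of its reports come from the same study type, which is the distinction the "different types" criterion draws.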
Analysis of secreted proteins from Aspergillus flavus.

PubMed

Medina, Martha L; Haynes, Paul A; Breci, Linda; Francisco, Wilson A

2005-08-01

MS/MS techniques in proteomics make possible the identification of proteins from organisms with little or no genome sequence information available. Peptide sequences are obtained from tandem mass spectra by matching peptide mass and fragmentation information to protein sequence information from related organisms, including unannotated genome sequence data. This peptide identification data can then be grouped and reconstructed into protein data. In this study, we have used this approach to study protein secretion by Aspergillus flavus, a filamentous fungus for which very little genome sequence information is available. A. flavus is capable of degrading the flavonoid rutin (quercetin 3-O-glycoside), as the only source of carbon via an extracellular enzyme system. In this continuing study, a proteomic analysis was used to identify secreted proteins from A. flavus when grown on rutin. The growth media glucose and potato dextrose were used to identify differentially expressed secreted proteins. The secreted proteins were analyzed by 1- and 2-DE and MS/MS. A total of 51 unique A. flavus secreted proteins were identified from the three growth conditions. Ten proteins were unique to rutin-, five to glucose- and one to potato dextrose-grown A. flavus. Sixteen secreted proteins were common to all three media. Fourteen identifications were of hypothetical proteins or proteins of unknown functions. To our knowledge, this is the first extensive proteomic study conducted to identify the secreted proteins from a filamentous fungus.
Activity-based proteomics of enzyme superfamilies: serine hydrolases as a case study.

PubMed

Simon, Gabriel M; Cravatt, Benjamin F

2010-04-09

Genome sequencing projects have uncovered thousands of uncharacterized enzymes in eukaryotic and prokaryotic organisms. Deciphering the physiological functions of enzymes requires tools to profile and perturb their activities in native biological systems. Activity-based protein profiling has emerged as a powerful chemoproteomic strategy to achieve these objectives through the use of chemical probes that target large swaths of enzymes that share active-site features. Here, we review activity-based protein profiling and its implementation to annotate the enzymatic proteome, with particular attention given to probes that target serine hydrolases, a diverse superfamily of enzymes replete with many uncharacterized members.
A DATABASE FOR TRACKING TOXICOGENOMIC SAMPLES AND PROCEDURES WITH GENOMIC, PROTEOMIC AND METABONOMIC COMPONENTS

EPA Science Inventory

A Database for Tracking Toxicogenomic Samples and Procedures with Genomic, Proteomic and Metabonomic Components
Wenjun Bao1, Jennifer Fostel2, Michael D. Waters2, B. Alex Merrick2, Drew Ekman3, Mitchell Kostich4, Judith Schmid1, David Dix1
Office of Research and Developmen...
Technological advances and genomics in metazoan parasites.

PubMed

Knox, D P

2004-02-01

Molecular biology has provided the means to identify parasite proteins, to define their function, patterns of expression and the means to produce them in quantity for subsequent functional analyses. Whole genome and expressed sequence tag programmes, and the parallel development of powerful bioinformatics tools, allow the execution of genome-wide between stage or species comparisons and meaningful gene-expression profiling. The latter can be undertaken with several new technologies such as DNA microarray and serial analysis of gene expression. Proteome analysis has come to the fore in recent years providing a crucial link between the gene and its protein product. RNA interference and ballistic gene transfer are exciting developments which can provide the means to precisely define the function of individual genes and, of importance in devising novel parasite control strategies, the effect that gene knockdown will have on parasite survival.
Recent developments in structural proteomics for protein structure determination.

PubMed

Liu, Hsuan-Liang; Hsu, Jyh-Ping

2005-05-01

The major challenges in structural proteomics include identifying all the proteins on the genome-wide scale, determining their structure-function relationships, and outlining the precise three-dimensional structures of the proteins. Protein structures are typically determined by experimental approaches such as X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy. However, the coverage of three-dimensional structure space achieved by these techniques is still limited. Thus, computational methods such as comparative and de novo approaches and molecular dynamics simulations are intensively used as alternative tools to predict the three-dimensional structures and dynamic behavior of proteins. This review summarizes recent developments in structural proteomics for protein structure determination, including instrumental methods such as X-ray crystallography and NMR spectroscopy, and computational methods such as comparative and de novo structure prediction and molecular dynamics simulations.
Evolution of complete proteomes: guanine-cytosine pressure, phylogeny and environmental influences blend the proteomic architecture

PubMed Central

2013-01-01

Background: Guanine-cytosine (GC) composition is an important feature of genomes. Likewise, amino acid composition is a distinct, but less valued, feature of proteomes. A major concern is that it is not clear what valuable information can be acquired from amino acid composition data. To address this concern, in-depth analyses of the amino acid composition of the complete proteomes from 63 archaea, 270 bacteria, and 128 eukaryotes were performed. Results: Principal component analysis of the amino acid matrices showed that the main contributors to proteomic architecture were genomic GC variation, phylogeny, and environmental influences. GC pressure drove positive selection on Ala, Arg, Gly, Pro, Trp, and Val, and adverse selection on Asn, Lys, Ile, Phe, and Tyr. The physico-chemical framework of the complete proteomes withstood GC pressure by frequency complementation of GC-dependent amino acid pairs with similar physico-chemical properties. Gln, His, Ser, and Val were responsible for phylogeny and their constituted components could differentiate archaea, bacteria, and eukaryotes. Environmental niche was also a significant factor in determining proteomic architecture, especially for archaea for which the main amino acids were Cys, Leu, and Thr. In archaea, hyperthermophiles, acidophiles, mesophiles, psychrophiles, and halophiles gathered successively along the environment-based principal component. Concordance between proteomic architecture and the genetic code was also related closely to genomic GC content, phylogeny, and lifestyles. Conclusions: Large-scale analyses of the complete proteomes of a wide range of organisms suggested that amino acid composition retained the trace of GC variation, phylogeny, and environmental influences during evolution. The findings from this study will help in the development of a global understanding of proteome evolution, and even biological evolution. PMID:24088322
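The two quantities compared in the abstract above, proteome-wide amino acid composition and genomic GC content, can be illustrated with a minimal Python sketch. This is not the authors' pipeline; the function names `amino_acid_composition` and `gc_content` are hypothetical, and the principal component analysis step is omitted. Each proteome would contribute one 20-value composition vector to the matrix that is then analyzed.

```python
from collections import Counter

# The 20 standard amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(proteins):
    """Return the fraction of each standard amino acid over a whole proteome.

    `proteins` is an iterable of protein sequences (one-letter code).
    Non-standard symbols (X, *, gaps) are ignored. Hypothetical helper.
    """
    counts = Counter()
    for seq in proteins:
        counts.update(res for res in seq.upper() if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def gc_content(dna):
    """Return the G+C fraction of a DNA sequence, counting only A, C, G, T."""
    dna = dna.upper()
    gc = dna.count("G") + dna.count("C")
    acgt = gc + dna.count("A") + dna.count("T")
    if acgt == 0:
        raise ValueError("no A/C/G/T bases found")
    return gc / acgt
```

Stacking the composition vectors of many proteomes into a matrix (rows are organisms, columns are amino acids) gives the kind of input a principal component analysis would take; correlating each column with `gc_content` is one simple way to look for GC-dependent amino acids.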
Comparison of Normal and Breast Cancer Cell lines using Proteome, Genome and Interactome data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Patwardhan, Anil J.; Strittmatter, Eric F.; Camp, David G.

2005-12-01

Normal and cancer cell line proteomes were profiled using high throughput mass spectrometry techniques. Application of both protein-level and peptide-level sample fractionation combined with LC-MS/MS analysis enabled the confident identification of 2,235 unmodified proteins representing a broad range of functional and compartmental classes. An iterative multi-step search strategy was used to identify post-translational modifications and detected several proteins that are preferentially modified in cancer cells. Information regarding both unmodified and modified protein forms was combined with publicly available gene expression and protein-protein interaction data. The resulting integrated dataset revealed several functionally related proteins that are differentially regulated between normal and cancer cell lines.

Manifestations and mechanisms of stem cell aging

PubMed Central

Liu, Ling

2011-01-01

Adult stem cells exist in most mammalian organs and tissues and are indispensable for normal tissue homeostasis and repair. In most tissues, there is an age-related decline in stem cell functionality but not a depletion of stem cells. Such functional changes reflect deleterious effects of age on the genome, epigenome, and proteome, some of which arise cell autonomously and others of which are imposed by an age-related change in the local milieu or systemic environment. Notably, some of the changes, particularly epigenomic and proteomic, are potentially reversible, and both environmental and genetic interventions can result in the rejuvenation of aged stem cells. Such findings have profound implications for the stem cell–based therapy of age-related diseases. PMID:21502357
An Improved, Bias-Reduced Probabilistic Functional Gene Network of Baker's Yeast, Saccharomyces cerevisiae

PubMed Central

Lee, Insuk; Li, Zhihua; Marcotte, Edward M.

2007-01-01

Background: Probabilistic functional gene networks are powerful theoretical frameworks for integrating heterogeneous functional genomics and proteomics data into objective models of cellular systems. Such networks provide syntheses of millions of discrete experimental observations, spanning DNA microarray experiments, physical protein interactions, genetic interactions, and comparative genomics; the resulting networks can then be easily applied to generate testable hypotheses regarding specific gene functions and associations. Methodology/Principal Findings: We report a significantly improved version (v. 2) of a probabilistic functional gene network [1] of the baker's yeast, Saccharomyces cerevisiae. We describe our optimization methods and illustrate their effects in three major areas: the reduction of functional bias in network training reference sets, the application of a probabilistic model for calculating confidences in pair-wise protein physical or genetic interactions, and the introduction of simple thresholds that eliminate many false positive mRNA co-expression relationships. Using the network, we predict and experimentally verify the function of the yeast RNA binding protein Puf6 in 60S ribosomal subunit biogenesis. Conclusions/Significance: YeastNet v. 2, constructed using these optimizations together with additional data, shows significant reduction in bias and improvements in precision and recall, in total covering 102,803 linkages among 5,483 yeast proteins (95% of the validated proteome). YeastNet is available from http://www.yeastnet.org. PMID:17912365
DOE Office of Scientific and Technical Information (OSTI.GOV)

Denef, Vincent; Shah, Manesh B; Verberkmoes, Nathan C

The recent surge in microbial genomic sequencing, combined with the development of high-throughput liquid chromatography-mass-spectrometry-based (LC/LC-MS/MS) proteomics, has raised the question of the extent to which genomic information of one strain or environmental sample can be used to profile proteomes of related strains or samples. Even with decreasing sequencing costs, it remains impractical to obtain genomic sequence for every strain or sample analyzed. Here, we evaluate how shotgun proteomics is affected by amino acid divergence between the sample and the genomic database using a probability-based model and a random mutation simulation model constrained by experimental data. To assess the effects of nonrandom distribution of mutations, we also evaluated identification levels using in silico peptide data from sequenced isolates with average amino acid identities (AAI) varying between 76 and 98%. We compared the predictions to experimental protein identification levels for a sample that was evaluated using a database that included genomic information for the dominant organism and for a closely related variant (95% AAI). The range of models set the boundaries at which half of the proteins in a proteomic experiment can be identified to be 77-92% AAI between orthologs in the sample and database. Consistent with this prediction, experimental data indicated loss of half the identifiable proteins at 90% AAI. Additional analysis indicated a 6.4% reduction of the initial protein coverage per 1% amino acid divergence and total identification loss at 86% AAI. Consequently, shotgun proteomics is capable of cross-strain identifications but avoids most cross-species false positives.
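To give a feel for why identification rates fall so quickly with amino acid divergence, the following Python sketch shows one simple probability-based view. It is not the model used in the record above: it assumes that substitutions are independent and uniformly distributed, that a peptide is matched only if every residue is unchanged, and that a protein counts as identified when at least `min_peptides` of its peptides match. The function names and the `min_peptides` threshold are hypothetical.

```python
def peptide_match_probability(aai, peptide_length):
    """Probability that a peptide has no substitution relative to the database.

    Assumes independent per-residue identity equal to `aai` (a fraction in
    [0, 1]); this is an illustrative simplification.
    """
    if not 0.0 <= aai <= 1.0:
        raise ValueError("aai must be a fraction between 0 and 1")
    if peptide_length < 1:
        raise ValueError("peptide_length must be at least 1")
    return aai ** peptide_length


def protein_identification_probability(aai, peptide_lengths, min_peptides=2):
    """Probability that at least `min_peptides` peptides remain unmutated.

    `peptide_lengths` lists the lengths of the detectable peptides of one
    protein. The distribution over the number of intact peptides is built
    up one peptide at a time (a Poisson binomial distribution).
    """
    dist = [1.0]  # dist[j] = P(exactly j peptides intact so far)
    for length in peptide_lengths:
        p = peptide_match_probability(aai, length)
        new = [0.0] * (len(dist) + 1)
        for j, q in enumerate(dist):
            new[j] += q * (1.0 - p)   # this peptide carries a substitution
            new[j + 1] += q * p       # this peptide is intact
        dist = new
    return sum(dist[min_peptides:])
```

Under these assumptions a 12-residue peptide survives with probability 0.9 ** 12 at 90% AAI, which illustrates how a modest drop in identity removes most matchable peptides; real data deviate from this because mutations are not uniformly distributed, as the abstract notes.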
Omics on bioleaching: current and future impacts.

PubMed

Martinez, Patricio; Vera, Mario; Bobadilla-Fazzini, Roberto A

2015-10-01

Bioleaching corresponds to the microbial-catalyzed process of conversion of insoluble metals into soluble forms. As an applied biotechnology globally used, it represents an extremely interesting field of research where omics techniques can be applied in terms of knowledge development, but moreover in terms of process design, control, and optimization. In this mini-review, the current state of genomics, proteomics, and metabolomics of bioleaching and the major impacts of these analytical methods at industrial scale are highlighted. In summary, genomics has been essential in the determination of the biodiversity of leaching processes and for development of conceptual and functional metabolic models. Proteomic impacts are mostly related to microbe-mineral interaction analysis, including copper resistance and biofilm formation. Early steps of metabolomics in the field of bioleaching have shown a significant potential for the use of metabolites as industrial biomarkers. Development directions are given in order to enhance the future impacts of the omics in biohydrometallurgy.
Post-genomics of microsporidia, with emphasis on a model of minimal eukaryotic proteome: a review.

PubMed

Texier, Catherine; Brosson, Damien; El Alaoui, Hicham; Méténier, Guy; Vivarès, Christian P

2005-05-01

The genome sequence of the microsporidian parasite Encephalitozoon cuniculi Levaditi, Nicolau et Schoen, 1923 contains about 2,000 genes that are representative of a non-redundant potential proteome composed of 1,909 protein chains. The purpose of this review is to relate some advances in the characterisation of this proteome through bioinformatics and experimental approaches. The reduced diversity of the set of E. cuniculi proteins is perceptible in all the compilations of predicted domains, orthologs, families and superfamilies, available in several public databases. The phyletic patterns of orthologs for seven eukaryotic organisms support an extensive gene loss in the fungal clade, with additional deletions in E. cuniculi. Most microsporidial orthologs are the smallest ones among eukaryotes, justifying an interest in the use of these compacted proteins to better discriminate between essential and non-essential regions. The three components of the E. cuniculi mRNA capping apparatus have been especially well characterized and the three-dimensional structure of the cap methyltransferase has been elucidated following the crystallisation of the microsporidial enzyme Ecm1. So far, our mass spectrometry-based analyses of the E. cuniculi spore proteome have led to the identification of about 170 proteins, one-quarter of these having no clearly predicted function. Immunocytochemical studies are in progress to determine the subcellular localisation of microsporidia-specific proteins. Post-translational modifications such as phosphorylation and glycosylation are expected to be explored soon.
An Integrated Proteomics/Transcriptomics Approach Points to Oxygen as the Main Electron Sink for Methanol Metabolism in Methylotenera mobilis

PubMed Central

Beck, David A. C.; Hendrickson, Erik L.; Vorobev, Alexey; Wang, Tiansong; Lim, Sujung; Kalyuzhnaya, Marina G.; Lidstrom, Mary E.; Hackett, Murray; Chistoserdova, Ludmila

2011-01-01

Methylotenera species, unlike their close relatives in the genera Methylophilus, Methylobacillus, and Methylovorus, neither exhibit the activity of methanol dehydrogenase nor possess mxaFI genes encoding this enzyme, yet they are able to grow on methanol. In this work, we integrated a genome-wide proteomics approach, shotgun proteomics, and a genome-wide transcriptomics approach, shotgun transcriptome sequencing (RNA-seq), of Methylotenera mobilis JLW8 to identify genes and enzymes potentially involved in methanol oxidation, with special attention to alternative nitrogen sources, to address the question of whether nitrate could play a role as an electron acceptor in place of oxygen. Both proteomics and transcriptomics identified a limited number of genes and enzymes specifically responding to methanol. This set includes genes involved in oxidative stress response systems, a number of oxidoreductases, including XoxF-type alcohol dehydrogenases, a type II secretion system, and proteins without a predicted function. Nitrate stimulated expression of some genes in assimilatory nitrate reduction and denitrification pathways, while ammonium downregulated some of the nitrogen metabolism genes. However, none of these genes appeared to respond to methanol, which suggests that oxygen may be the main electron sink during growth on methanol. This study identifies initial targets for future focused physiological studies, including mutant analysis, which will provide further details into this novel process. PMID:21764938
Proteomic Approaches and Identification of Novel Therapeutic Targets for Alcoholism

PubMed Central

Gorini, Giorgio; Adron Harris, R; Dayne Mayfield, R

2014-01-01

Recent studies have shown that gene regulation is far more complex than previously believed and does not completely explain changes at the protein level. Therefore, the direct study of the proteome, considerably different in both complexity and dynamicity to the genome/transcriptome, has provided unique insights to an increasing number of researchers. During the past decade, extraordinary advances in proteomic techniques have changed the way we can analyze the composition, regulation, and function of protein complexes and pathways underlying altered neurobiological conditions. When combined with complementary approaches, these advances provide the contextual information for decoding large data sets into meaningful biologically adaptive processes. Neuroproteomics offers potential breakthroughs in the field of alcohol research by leading to a deeper understanding of how alcohol globally affects protein structure, function, interactions, and networks. The wealth of information gained from these advances can help pinpoint relevant biomarkers for early diagnosis and improved prognosis of alcoholism and identify future pharmacological targets for the treatment of this addiction. PMID:23900301
Proteomics and comparative genomics of Nitrososphaera viennensis reveal the core genome and adaptations of archaeal ammonia oxidizers

PubMed Central

Kerou, Melina; Offre, Pierre; Valledor, Luis; Abby, Sophie S.; Melcher, Michael; Nagler, Matthias; Weckwerth, Wolfram; Schleper, Christa

2016-01-01

Ammonia-oxidizing archaea (AOA) are among the most abundant microorganisms and key players in the global nitrogen and carbon cycles. They share a common energy metabolism but represent a heterogeneous group with respect to their environmental distribution and adaptions, growth requirements, and genome contents. We report here the genome and proteome of Nitrososphaera viennensis EN76, the type species of the archaeal class Nitrososphaeria of the phylum Thaumarchaeota encompassing all known AOA. N. viennensis is a soil organism with a 2.52-Mb genome and 3,123 predicted protein-coding genes. Proteomic analysis revealed that nearly 50% of the predicted genes were translated under standard laboratory growth conditions. Comparison with genomes of closely related species of the predominantly terrestrial Nitrososphaerales as well as the more streamlined marine Nitrosopumilales [Candidatus (Ca.) order] and the acidophile “Ca. Nitrosotalea devanaterra” revealed a core genome of AOA comprising 860 genes, which allowed for the reconstruction of central metabolic pathways common to all known AOA and expressed in the N. viennensis and “Ca. Nitrosopelagicus brevis” proteomes. Concomitantly, we were able to identify candidate proteins for as yet unidentified crucial steps in central metabolisms. In addition to unraveling aspects of core AOA metabolism, we identified specific metabolic innovations associated with the Nitrososphaerales mediating growth and survival in the soil milieu, including the capacity for biofilm formation, cell surface modifications and cell adhesion, and carbohydrate conversions as well as detoxification of aromatic compounds and drugs. PMID:27864514
Focused Metabolite Profiling for Dissecting Cellular and Molecular Processes of Living Organisms in Space Environments

NASA Technical Reports Server (NTRS)

2008-01-01

Regulatory control in biological systems is exerted at all levels within the central dogma of biology. Metabolites are the end products of all cellular regulatory processes and reflect the ultimate outcome of potential changes suggested by genomics and proteomics caused by an environmental stimulus or genetic modification. Following on the heels of genomics, transcriptomics, and proteomics, metabolomics has become an inevitable part of complete-system biology because none of the lower "-omics" alone provide direct information about how changes in mRNA or protein are coupled to changes in biological function. The challenges are much greater than those encountered in genomics because of the greater number of metabolites and the greater diversity of their chemical structures and properties. To meet these challenges, much developmental work is needed, including (1) methodologies for unbiased extraction of metabolites and subsequent quantification, (2) algorithms for systematic identification of metabolites, (3) expertise and competency in handling a large amount of information (data set), and (4) integration of metabolomics with other "omics" and data mining (implication of the information). This article reviews the project accomplishments.
PROTEOMICS OF THE AMNIOTIC FLUID IN ASSESSMENT OF THE PLACENTA – RELEVANCE FOR PRETERM BIRTH

PubMed Central

Buhimschi, Irina A.; Buhimschi, Catalin S.

2008-01-01

Proteomics is the study of expressed proteins and has emerged as a complement to genomic research. The major advantage of proteomics over DNA-RNA based technologies is that it more closely relates to phenotype and not the source code. Proteomics thus holds the promise of providing direct insight into the true mechanisms of human disease. Historically, examination of the placenta was the first modality to subclassify pathogenetic entities responsible for preterm birth. Because the placenta is a key pathophysiological participant in several major obstetrical syndromes (preterm birth, preeclampsia, intrauterine growth restriction), identification of relevant biomarkers of placental function can profoundly affect the prediction of fetal outcome and treatment efficacy. Proteomics is a young science and studies that associate proteomic patterns with long-term outcome require follow-up of children up to school age. In the interim, placental pathological footprints of cellular injury can be useful as intermediate outcomes. Furthermore, knowledge of the identity of the dysregulated proteins may provide the necessary insight into novel pathophysiological pathways and unravel possible targets for therapeutic intervention that could not have been envisioned through hypothesis-driven approaches. PMID:18191197
Application of an Improved Proteomics Method for Abundant Protein Cleanup: Molecular and Genomic Mechanisms Study in Plant Defense

PubMed Central

Zhang, Yixiang; Gao, Peng; Xing, Zhuo; Jin, Shumei; Chen, Zhide; Liu, Lantao; Constantino, Nasie; Wang, Xinwang; Shi, Weibing; Yuan, Joshua S.; Dai, Susie Y.

2013-01-01

High abundance proteins like ribulose-1,5-bisphosphate carboxylase oxygenase (Rubisco) pose a persistent challenge for whole-proteome characterization using shot-gun proteomics. To address this challenge, we developed and evaluated Polyethyleneimine Assisted Rubisco Cleanup (PARC) as a new method by combining both abundant protein removal and fractionation. The new approach was applied to a plant-insect interaction study to validate the platform and investigate mechanisms for plant defense against herbivorous insects. Our results indicated that PARC can effectively remove Rubisco, improve protein identification, and discover almost three times more differentially regulated proteins. The significantly enhanced shot-gun proteomics performance was translated into in-depth proteomic and molecular mechanisms for plant-insect interaction, in which carbon redistribution played an essential role. Moreover, the transcriptomic validation also confirmed the reliability of PARC analysis. Finally, functional studies were carried out for two differentially regulated genes as revealed by PARC analysis. Insect resistance was induced by over-expressing either jacalin-like or cupin-like genes in rice. The results further highlighted that PARC can serve as an effective strategy for proteomics analysis and gene discovery. PMID:23943779
Mining for Microbial Gems: Integrating Proteomics in the Postgenomic Natural Product Discovery Pipeline.

PubMed

Du, Chao; van Wezel, Gilles P

2018-04-30

Natural products (NPs) are a major source of compounds for medical, agricultural, and biotechnological industries. Many of these compounds are of microbial origin, and, in particular, from Actinobacteria or filamentous fungi. To successfully identify novel compounds that correlate to a bioactivity of interest, or discover new enzymes with desired functions, systematic multiomics approaches have been developed over the years. Bioinformatics tools harness the rapidly expanding wealth of genome sequence information, revealing previously unsuspected biosynthetic diversity. Varied growth conditions or elicitors are applied to activate cryptic biosynthetic gene clusters, and metabolomics provides detailed insights into the NPs they specify. Combining these technologies with proteomics-based approaches to profile the biosynthetic enzymes provides scientists with insights into the full biosynthetic potential of microorganisms. The proteomics approaches include enrichment strategies such as employing activity-based probes designed by chemical biology, as well as unbiased (quantitative) proteomics methods. In this review, the opportunities and challenges in microbial NP research are discussed, and, in particular, the application of proteomics to link biosynthetic enzymes to the molecules they produce, and vice versa.
Identifying Bacterial Immune Evasion Proteins Using Phage Display.

PubMed

Fevre, Cindy; Scheepmaker, Lisette; Haas, Pieter-Jan

2017-01-01

Methods aimed at identification of immune evasion proteins mainly rely on in silico prediction of sequence or structural homology to known evasion proteins, or use a proteomics-driven approach. Although proven successful, these methods are limited by low efficiency and/or a lack of functional identification. Here we describe a high-throughput genomic strategy to functionally identify bacterial immune evasion proteins using phage display technology. Genomic bacterial DNA is randomly fragmented and ligated into a phage display vector that is used to create a phage display library expressing bacterial secreted and membrane-bound proteins. This library is used to select displayed bacterial secretome proteins that interact with host immune components.
Rare Disease Mechanisms Identified by Genealogical Proteomics of Copper Homeostasis Mutant Pedigrees.

PubMed

Zlatic, Stephanie A; Vrailas-Mortimer, Alysia; Gokhale, Avanti; Carey, Lucas J; Scott, Elizabeth; Burch, Reid; McCall, Morgan M; Rudin-Rush, Samantha; Davis, John Bowen; Hartwig, Cortnie; Werner, Erica; Li, Lian; Petris, Michael; Faundez, Victor

2018-03-28

Rare neurological diseases shed light onto universal neurobiological processes. However, molecular mechanisms connecting genetic defects to their disease phenotypes are elusive. Here, we obtain mechanistic information by comparing proteomes of cells from individuals with rare disorders with proteomes from their disease-free consanguineous relatives. We use triple-SILAC mass spectrometry to quantify proteomes from human pedigrees affected by mutations in ATP7A, which cause Menkes disease, a rare neurodegenerative and neurodevelopmental disorder stemming from systemic copper depletion. We identified 214 proteins whose expression was altered in ATP7A -/y fibroblasts. Bioinformatic analysis of ATP7A-mutant proteomes identified known phenotypes and processes affected in rare genetic diseases causing copper dyshomeostasis, including altered mitochondrial function. We found connections between copper dyshomeostasis and the UCHL1/PARK5 pathway of Parkinson disease, which we validated with mitochondrial respiration and Drosophila genetics assays. We propose that our genealogical "omics" strategy can be broadly applied to identify mechanisms linking a genomic locus to its phenotypes.
High-throughput and targeted in-depth mass spectrometry-based approaches for biofluid profiling and biomarker discovery.

PubMed

Jimenez, Connie R; Piersma, Sander; Pham, Thang V

2007-12-01

Proteomics aims to create a link between genomic information, biological function and disease through global studies of protein expression, modification and protein-protein interactions. Recent advances in key proteomics tools, such as mass spectrometry (MS) and (bio)informatics, provide tremendous opportunities for biomarker-related clinical applications. In this review, we focus on two complementary MS-based approaches with high potential for the discovery of biomarker patterns and low-abundant candidate biomarkers in biofluids: high-throughput matrix-assisted laser desorption/ionization time-of-flight mass spectroscopy-based methods for peptidome profiling and label-free liquid chromatography-based methods coupled to MS for in-depth profiling of biofluids with a focus on subproteomes, including the low-molecular-weight proteome, carrier-bound proteome and N-linked glycoproteome. The two approaches differ in their aims, throughput and sensitivity. We discuss recent progress and challenges in the analysis of plasma/serum and proximal fluids using these strategies and highlight the potential of liquid chromatography-MS-based proteomics of cancer cell and tumor secretomes for the discovery of candidate blood-based biomarkers. Strategies for candidate validation are also described.
Purification and fractionation of membranes for proteomic analyses.

PubMed

Marmagne, Anne; Salvi, Daniel; Rolland, Norbert; Ephritikhine, Geneviève; Joyard, Jacques; Barbier-Brygoo, Hélène

2006-01-01

Proteomics is a very powerful approach to link the information contained in sequenced genomes, such as Arabidopsis, to the functional knowledge provided by studies of plant cell compartments. However, membrane proteomics remains a challenge. One way to bring into view the complex mixture of proteins present in a membrane is to develop proteomic analyses based on (1) the use of highly purified membrane fractions and (2) fractionation of membrane proteins to retrieve as many proteins as possible (from the most to the least hydrophobic ones). To illustrate such strategies, we chose two types of membranes, the plasma membrane and the chloroplast envelope membranes. Both types of membranes can be prepared with a reasonable degree of purity from different types of tissues: the plasma membrane from cultured cells and the chloroplast envelope membrane from whole plants. This article is restricted to the description of methods for the preparation of highly purified and characterized plant membrane fractions and the subsequent fractionation of these membrane proteins according to simple physicochemical criteria (i.e., chloroform/methanol extraction, alkaline or saline treatments) for further analyses using modern proteomic methodologies.
Post-genomics nanotechnology is gaining momentum: nanoproteomics and applications in life sciences.

PubMed

Kobeissy, Firas H; Gulbakan, Basri; Alawieh, Ali; Karam, Pierre; Zhang, Zhiqun; Guingab-Cagmat, Joy D; Mondello, Stefania; Tan, Weihong; Anagli, John; Wang, Kevin

2014-02-01

The post-genomics era has brought about new Omics biotechnologies, such as proteomics and metabolomics, as well as their novel applications to personal genomics and the quantified self. These advances are now also catalyzing other and newer post-genomics innovations, leading to convergences between Omics and nanotechnology. In this work, we systematically contextualize and exemplify an emerging strand of post-genomics life sciences, namely, nanoproteomics and its applications in health and integrative biological systems. Nanotechnology has been utilized as a complementary component to revolutionize proteomics through different kinds of nanotechnology applications, including nanoporous structures, functionalized nanoparticles, quantum dots, and polymeric nanostructures. Those applications, though still in their infancy, have led to several highly sensitive diagnostics and new methods of drug delivery and targeted therapy for clinical use. The present article differs from previous analyses of nanoproteomics in that it offers an in-depth and comparative evaluation of the attendant biotechnology portfolio and their applications as seen through the lens of post-genomics life sciences and biomedicine. These include: (1) immunosensors for inflammatory, pathogenic, and autoimmune markers for infectious and autoimmune diseases, (2) amplified immunoassays for detection of cancer biomarkers, and (3) methods for targeted therapy and automatically adjusted drug delivery such as in experimental stroke and brain injury studies. As nanoproteomics becomes available both to the clinician at the bedside and the citizens who are increasingly interested in access to novel post-genomics diagnostics through initiatives such as the quantified self, we anticipate further breakthroughs in personalized and targeted medicine.
Genome-wide identification, functional and evolutionary analysis of terpene synthases in pineapple.

PubMed

Chen, Xiaoe; Yang, Wei; Zhang, Liqin; Wu, Xianmiao; Cheng, Tian; Li, Guanglin

2017-10-01

Terpene synthases (TPSs) are vital for the biosynthesis of active terpenoids, which have important physiological, ecological and medicinal value. Although terpenoids have been reported in pineapple (Ananas comosus), genome-wide investigations of the TPS genes responsible for pineapple terpenoid synthesis are still lacking. By integrating pineapple genome and proteome data, twenty-one putative terpene synthase genes were found in pineapple and divided into five subfamilies. Tandem duplication accounts for the expansion of the TPS gene family. Furthermore, functional differentiation between the TPS subfamilies may have occurred for several reasons. Sixty-two key amino acid sites were identified as type-II functionally divergent between the TPS-a and TPS-c subfamilies. Finally, coevolution analysis indicated that multiple amino acid residues are involved in coevolutionary processes. In addition, the enzyme activity of two TPSs was tested. This genome-wide identification, functional and evolutionary analysis of pineapple TPS genes provides new insight into the roles of the TPS family and lays the basis for further characterizing the function and evolution of the TPS gene family. Copyright © 2017 Elsevier Ltd. All rights reserved.
Post-translational processing targets functionally diverse proteins in Mycoplasma hyopneumoniae

PubMed Central

Tacchi, Jessica L.; Raymond, Benjamin B. A.; Haynes, Paul A.; Berry, Iain J.; Widjaja, Michael; Bogema, Daniel R.; Woolley, Lauren K.; Jenkins, Cheryl; Minion, F. Chris; Padula, Matthew P.; Djordjevic, Steven P.

2016-01-01

Mycoplasma hyopneumoniae is a genome-reduced, cell wall-less, bacterial pathogen with a predicted coding capacity of less than 700 proteins and is one of the smallest self-replicating pathogens. The cell surface of M. hyopneumoniae is extensively modified by processing events that target the P97 and P102 adhesin families. Here, we present analyses of the proteome of M. hyopneumoniae-type strain J using protein-centric approaches (one- and two-dimensional GeLC–MS/MS) that enabled us to focus on global processing events in this species. While these approaches only identified 52% of the predicted proteome (347 proteins), our analyses identified 35 surface-associated proteins with widely divergent functions that were targets of unusual endoproteolytic processing events, including cell adhesins, lipoproteins and proteins with canonical functions in the cytosol that moonlight on the cell surface. Affinity chromatography assays that separately used heparin, fibronectin, actin and host epithelial cell surface proteins as bait recovered cleavage products derived from these processed proteins, suggesting these fragments interact directly with the bait proteins and display previously unrecognized adhesive functions. We hypothesize that protein processing is underestimated as a post-translational modification in genome-reduced bacteria and prokaryotes more broadly, and represents an important mechanism for creating cell surface protein diversity. PMID:26865024
A proteome view of structural, functional, and taxonomic characteristics of major protein domain clusters.

PubMed

Sun, Chia-Tsen; Chiang, Austin W T; Hwang, Ming-Jing

2017-10-27

Proteome-scale bioinformatics research is increasingly conducted as the number of completely sequenced genomes increases, but analysis of protein domains (PDs) usually relies on similarity in their amino acid sequences and/or three-dimensional structures. Here, we present results from a bi-clustering analysis on presence/absence data for 6,580 unique PDs in 2,134 species with a sequenced genome, thus covering a complete set of proteins, for the three superkingdoms of life, Bacteria, Archaea, and Eukarya. Our analysis revealed eight distinctive PD clusters, which, following an analysis of enrichment of Gene Ontology functions and CATH classification of protein structures, were shown to exhibit structural and functional properties that are taxa-characteristic. For example, the largest cluster is ubiquitous in all three superkingdoms, constituting a set of 1,472 persistent domains created early in evolution and retained in living organisms and characterized by basic cellular functions and ancient structural architectures, while an Archaea and Eukarya bi-superkingdom cluster suggests its PDs may have existed in the ancestor of the two superkingdoms, and others are specific to a single superkingdom or taxon (e.g. Fungi). These results increase our appreciation of PD diversity and our knowledge of how PDs are used in species, with implications for species evolution.

The Changing Face of Scientific Discourse: Analysis of Genomic and Proteomic Database Usage and Acceptance.

ERIC Educational Resources Information Center

Brown, Cecelia

2003-01-01

Discusses the growth in use and acceptance of Web-based genomic and proteomic databases (GPD) in scholarly communication. Confirms the role of GPD in the scientific literature cycle, suggests GPD are a storage and retrieval mechanism for molecular biology information, and recommends that existing models of scientific communication be updated to…
Comparative genomic and proteomic analyses of Clostridium acetobutylicum Rh8 and its parent strain DSM 1731 revealed new understandings on butanol tolerance

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bao, Guanhui; University of Chinese Academy of Sciences, Beijing; Dong, Hongjun

Highlights: • Genomes of a butanol-tolerant strain and its parent strain were deciphered. • Comparative genomics and proteomics were applied to understand butanol tolerance. • No differentially expressed proteins have mutations in their corresponding genes. • Mutations in ribosomal genes might be responsible for the global difference in the proteome. Abstract: Clostridium acetobutylicum strain Rh8 is a butanol-tolerant mutant which can tolerate up to 19 g/L butanol, 46% higher than that of its parent strain DSM 1731. We previously performed comparative cytoplasm- and membrane-proteomic analyses to understand the mechanism underlying the improved butanol tolerance of strain Rh8. In this work, we further extended this comparison to the genomic level. Compared with the genome of the parent strain DSM 1731, two insertion sites, four deletion sites, and 67 single nucleotide variations (SNVs) are distributed throughout the genome of strain Rh8. Among the 67 SNVs, 16 SNVs are located in the predicted promoters and intergenic regions, while 29 SNVs are located in the coding sequence, affecting a total of 21 proteins involved in transport, cell structure, DNA replication, and protein translation. The remaining 22 SNVs are located in the ribosomal genes, affecting a total of 12 rRNA genes in different operons. Analysis of previous comparative proteomic data indicated that none of the differentially expressed proteins have mutations in their corresponding genes. Rchange algorithm analysis indicated that the mutations that occurred in the ribosomal genes might change the thermodynamic characteristics of the ribosomal RNA and thus affect the translation strength of these proteins. Taken together, the improved butanol tolerance of C. acetobutylicum strain Rh8 might be acquired through regulating the translational process to achieve different expression strength of genes involved in butanol tolerance.
Single Amino Acid Repeats in the Proteome World: Structural, Functional, and Evolutionary Insights

PubMed Central

Kumar, Amitha Sampath; Sowpati, Divya Tej; Mishra, Rakesh K.

2016-01-01

Microsatellites or simple sequence repeats (SSR) are abundant, highly diverse stretches of short DNA repeats present in all genomes. Tandem mono/tri/hexanucleotide repeats in the coding regions contribute to single amino acid repeats (SAARs) in the proteome. While SSRs in the coding region always result in amino acid repeats, a majority of SAARs arise due to a combination of various codons representing the same amino acid and not as a consequence of SSR events. Certain amino acids are abundant in repeat regions, indicating a positive selection pressure behind the accumulation of SAARs. By analysing 22 proteomes including the human proteome, we explored the functional and structural relationship of amino acid repeats in an evolutionary context. Only ~15% of repeats are present in any known functional domain, while ~74% of repeats are present in the disordered regions, suggesting that SAARs add to the functionality of proteins by providing flexibility and stability and by acting as linker elements between domains. Comparison of SAAR-containing proteins across species reveals that while shorter repeats are conserved among orthologs, proteins with longer repeats, >15 amino acids, are unique to the respective organism. Lysine repeats are well conserved among orthologs with respect to their length and number of occurrences in a protein. Repeats of other amino acids such as glutamic acid, proline, serine and alanine are generally conserved among the orthologs with varying repeat lengths. These findings suggest that SAARs have accumulated in the proteome under positive selection pressure and that they provide flexibility for optimal folding of functional/structural domains of proteins. The insights gained from our observations can help in the effective design and engineering of proteins with novel features. PMID:27893794
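
The notion of a single amino acid repeat described above can be illustrated with a minimal sketch. The minimum run length of five residues and the toy sequence are assumptions made here for illustration; the abstract does not state the threshold the authors used, and `find_saars` is a hypothetical helper, not part of their pipeline.

```python
import re


def find_saars(protein_seq, min_len=5):
    """Return (residue, start, length) for every run of one amino acid.

    min_len is an assumed threshold, not a value taken from the abstract.
    """
    # One captured residue followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_len - 1))
    return [
        (m.group(1), m.start(), len(m.group(0)))
        for m in pattern.finditer(protein_seq.upper())
    ]


# Toy sequence (not a real protein): a run of six Q and a run of five A.
toy = "MKTAQQQQQQLLSPAAAAAGEEK"
print(find_saars(toy))  # [('Q', 4, 6), ('A', 14, 5)]
```

Start positions are 0-based. Raising `min_len` above 15 would restrict the output to the longer repeats that the abstract reports as organism-specific.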
New Funding Opportunity Announcements (FOAs): Reissuance of Clinical Proteomic Tumor Analysis Consortium (CPTAC) | Office of Cancer Clinical Proteomics Research

Cancer.gov

The National Cancer Institute is soliciting applications for the reissuance of its Clinical Proteomic Tumor Analysis Consortium (CPTAC) program. CPTAC will support broad efforts focused on several cancer types to explore further the complexities of cancer proteomes and their connections to abnormalities in cancer genomes.
Transcriptome-Assisted Label-Free Quantitative Proteomics Analysis Reveals Novel Insights into Piper nigrum-Phytophthora capsici Phytopathosystem

PubMed Central

Mahadevan, Chidambareswaren; Krishnan, Anu; Saraswathy, Gayathri G.; Surendran, Arun; Jaleel, Abdul; Sakuntala, Manjula

2016-01-01

Black pepper (Piper nigrum L.), a tropical spice crop of global acclaim, is susceptible to Phytophthora capsici, an oomycete pathogen which causes the highly destructive foot rot disease. A systematic understanding of this phytopathosystem has not been possible owing to lack of genome or proteome information. In this study, we describe an integrated transcriptome-assisted label-free quantitative proteomics pipeline to study the basal immune components of black pepper when challenged with P. capsici. We report a global identification of 532 novel leaf proteins from black pepper, of which 518 proteins were functionally annotated using the BLAST2GO tool. A label-free quantitation of the protein datasets revealed 194 proteins common to the diseased and control protein datasets, of which 22 proteins showed significant up-regulation and 134 showed significant down-regulation. Ninety-three proteins were identified exclusively in P. capsici-infected leaf tissues and 245 were expressed only in mock-infected (control) samples. In-depth analysis of our data gives novel insights into the regulatory pathways of black pepper which are compromised during the infection. Differential down-regulation was observed in a number of critical pathways such as carbon fixation in photosynthetic organisms, cyano-amino acid metabolism, fructose and mannose metabolism, glutathione metabolism, and phenylpropanoid biosynthesis. The proteomics results were validated with real-time qRT-PCR analysis. We were also able to identify the complete coding sequences for all the proteins, of which a few selected genes were cloned and sequence characterized for further confirmation. Our study is the first report of a quantitative proteomics dataset in black pepper which provides convincing evidence on the effectiveness of a transcriptome-based label-free proteomics approach for elucidating the host response to biotic stress in a non-model spice crop like P. nigrum, for which genome information is unavailable. Our dataset will serve as a useful resource for future studies in this plant. Data are available via ProteomeXchange with identifier PXD003887. PMID:27379110
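
The split into common, infected-only and control-only proteins reported above is a plain set partition, which the following sketch illustrates. The accession strings are invented placeholders and `partition_identifications` is a hypothetical helper. Only the three counts in `reported` come from the abstract, and they add up to the 532 proteins it reports.

```python
def partition_identifications(infected_ids, control_ids):
    """Split protein identifiers into common and condition-exclusive groups."""
    infected, control = set(infected_ids), set(control_ids)
    return {
        "common": infected & control,
        "infected_only": infected - control,
        "control_only": control - infected,
    }


# Placeholder accessions, for illustration only.
groups = partition_identifications({"P1", "P2", "P3"}, {"P2", "P3", "P4", "P5"})
print({name: sorted(ids) for name, ids in groups.items()})

# Counts taken from the abstract: the three groups sum to the 532 proteins reported.
reported = {"common": 194, "infected_only": 93, "control_only": 245}
print(sum(reported.values()))  # 532
```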
Novel phage group infecting Lactobacillus delbrueckii subsp. lactis, as revealed by genomic and proteomic analysis of bacteriophage Ldl1.

PubMed

Casey, Eoghan; Mahony, Jennifer; Neve, Horst; Noben, Jean-Paul; Dal Bello, Fabio; van Sinderen, Douwe

2015-02-01

Ldl1 is a virulent phage infecting the dairy starter Lactobacillus delbrueckii subsp. lactis LdlS. Electron microscopy analysis revealed that this phage exhibits a large head and a long tail and bears little resemblance to other characterized phages infecting Lactobacillus delbrueckii. In vitro propagation of this phage revealed a latent period of 30 to 40 min and a burst size of 59.9 +/- 1.9 phage particles. Comparative genomic and proteomic analyses showed remarkable similarity between the genome of Ldl1 and that of Lactobacillus plantarum phage ATCC 8014-B2. The genomic and proteomic characteristics of Ldl1 demonstrate that this phage does not belong to any of the four previously recognized L. delbrueckii phage groups, necessitating the creation of a new group, called group e, thus adding to the knowledge on the diversity of phages targeting strains of this industrially important lactic acid bacterial species.
Comprehensive Analysis of Cancer-Proteogenome to Identify Biomarkers for the Early Diagnosis and Prognosis of Cancer.

PubMed

Shukla, Hem D

2017-10-25

During the past century, our understanding of cancer diagnosis and treatment has been based on a monogenic approach, and as a consequence our knowledge of the clinical genetic underpinnings of cancer is incomplete. The completion of the human genome in 2003 has steered us toward therapeutic target discovery, enabling us to mine the genome using cutting-edge proteogenomics tools. A number of novel and promising cancer targets have emerged from the genome project for diagnostics, therapeutics, and prognostic markers, which are being used to monitor response to cancer treatment. The heterogeneous nature of cancer has hindered progress in understanding the underlying mechanisms that lead to abnormal cellular growth. Since the start of The Cancer Genome Atlas (TCGA) and the International Genome consortium projects, there has been tremendous progress in genome sequencing and immense numbers of cancer genomes have been completed, and this approach has transformed our understanding of the diagnosis and treatment of different types of cancers. By employing genomics and proteomics technologies, an immense amount of genomic data is being generated on clinical tumors, which has transformed the cancer landscape and has the potential to transform cancer diagnosis and prognosis. A complete molecular view of the cancer landscape is necessary for understanding the underlying mechanisms of cancer initiation to improve diagnosis and prognosis, which ultimately will lead to personalized treatment. Interestingly, cancer proteome analysis has also allowed us to identify biomarkers to monitor drug and radiation resistance in patients undergoing cancer treatment. Further, TCGA-funded studies have allowed for the genomic and transcriptomic characterization of targeted cancers, and this analysis aids the development of targeted therapies for highly lethal malignancies. High-throughput data, such as complete proteome, epigenome, protein-protein interaction, and pharmacogenomics data, are indispensable for probing the cancer genome and proteome, and these approaches have generated multidimensional data from universal studies of genes and proteins (OMICS), which have the potential to facilitate precision medicine. However, due to slow progress in computational technologies, the translation of big omics data into clinical practice has been slow. In this review, attempts have been made to describe the role of high-throughput genomic and proteomic technologies in identifying a panel of biomarkers which could be used for the early diagnosis and prognosis of cancer.
Proteome of Caulobacter crescentus cell cycle publicly accessible on SWICZ server.

PubMed

Vohradsky, Jiri; Janda, Ivan; Grünenfelder, Björn; Berndt, Peter; Röder, Daniel; Langen, Hanno; Weiser, Jaroslav; Jenal, Urs

2003-10-01

Here we present the Swiss-Czech Proteomics Server (SWICZ), which hosts the proteomic database summarizing information about the cell cycle of the aquatic bacterium Caulobacter crescentus. The database provides a searchable tool for easy access to global protein synthesis and protein stability data as examined during the C. crescentus cell cycle. Protein synthesis data collected from five different cell cycle stages were determined for each protein spot as a relative value of the total amount of [(35)S]methionine incorporation. Protein stability in pulse-labeled extracts was measured during a chase period equivalent to one cell cycle unit. Quantitative information for individual proteins together with descriptive data such as protein identities, apparent molecular masses and isoelectric points, were combined with information on protein function, genomic context, and the cell cycle stage, and were then assembled in a relational database with a world wide web interface (http://proteom.biomed.cas.cz), which allows the database records to be searched and displays the recovered information. A total of 1250 protein spots were reproducibly detected on two-dimensional gel electropherograms, 295 of which were identified by mass spectrometry. The database is accessible either through clickable two-dimensional gel electrophoretic maps or by means of a set of dedicated search engines. Basic characterization of the experimental procedures, data processing, and a comprehensive description of the web site are presented. In its current state, the SWICZ proteome database provides a platform for the incorporation of new data emerging from extended functional studies on the C. crescentus proteome.
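
As a rough illustration of how per-spot descriptive data and stage-resolved synthesis values could sit in a relational database, the sketch below builds a two-table layout in an in-memory SQLite database. The table names, column names and all row values are assumptions invented for this example; the actual SWICZ schema is not described in the abstract.

```python
import sqlite3


def build_demo_db():
    """Create a hypothetical two-table layout with placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE spot (
            spot_id           INTEGER PRIMARY KEY,
            protein_name      TEXT,
            mol_mass_kda      REAL,
            isoelectric_point REAL
        );
        CREATE TABLE synthesis (
            spot_id                INTEGER REFERENCES spot(spot_id),
            stage                  INTEGER CHECK (stage BETWEEN 1 AND 5),
            relative_incorporation REAL,
            PRIMARY KEY (spot_id, stage)
        );
    """)
    # Placeholder values, not measurements from the database.
    conn.executemany(
        "INSERT INTO spot VALUES (?, ?, ?, ?)",
        [(1, "protein_A", 25.0, 5.1), (2, "protein_B", 48.0, 6.3)],
    )
    conn.executemany(
        "INSERT INTO synthesis VALUES (?, ?, ?)",
        [(1, 1, 0.10), (1, 2, 0.20), (1, 3, 0.40), (1, 4, 0.20), (1, 5, 0.10),
         (2, 1, 0.50), (2, 2, 0.20), (2, 3, 0.10), (2, 4, 0.10), (2, 5, 0.10)],
    )
    return conn


def peak_synthesis_stage(conn):
    """Return (protein_name, stage) for the stage with the highest synthesis."""
    query = """
        SELECT s.protein_name, y.stage
        FROM spot AS s
        JOIN synthesis AS y ON y.spot_id = s.spot_id
        WHERE y.relative_incorporation = (
            SELECT MAX(relative_incorporation)
            FROM synthesis
            WHERE spot_id = s.spot_id
        )
        ORDER BY s.spot_id
    """
    return conn.execute(query).fetchall()


print(peak_synthesis_stage(build_demo_db()))
```

Keeping one row per spot and stage, rather than five stage columns per spot, is a design choice made for this sketch so that further stages or conditions could be added without altering the table definition.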
Comparative Omics-Driven Genome Annotation Refinement: Application across Yersiniae

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rutledge, Alexandra C.; Jones, Marcus B.; Chauhan, Sadhana

2012-03-27

Genome sequencing continues to be a rapidly evolving technology, yet most downstream aspects of genome annotation pipelines remain relatively stable or are even being abandoned. To date, the perceived value of manual curation for genome annotations is not offset by the real cost and time associated with the process. In order to balance the large number of sequences generated, the annotation process is now performed almost exclusively in an automated fashion for most genome sequencing projects. One possible way to reduce errors inherent to automated computational annotations is to apply data from 'omics' measurements (i.e. transcriptional and proteomic) to the un-annotated genome with a proteogenomic-based approach. This approach does require additional experimental and bioinformatics methods to include omics technologies; however, the approach is readily automatable and can benefit from rapid developments occurring in those research domains as well. The annotation process can be improved by experimental validation of transcription and translation and aid in the discovery of annotation errors. Here the concept of annotation refinement has been extended to include a comparative assessment of genomes across closely related species, as is becoming common in sequencing efforts. Transcriptomic and proteomic data derived from three highly similar pathogenic Yersiniae (Y. pestis CO92, Y. pestis pestoides F, and Y. pseudotuberculosis PB1/+) was used to demonstrate a comprehensive comparative omic-based annotation methodology. Peptide and oligo measurements experimentally validated the expression of nearly 40% of each strain's predicted proteome and revealed the identification of 28 novel and 68 previously incorrect protein-coding sequences (e.g., observed frameshifts, extended start sites, and translated pseudogenes) within the three current Yersinia genome annotations. Gene loss is presumed to play a major role in Y. pestis acquiring its niche as a virulent pathogen, thus the discovery of many translated pseudogenes underscores a need for functional analyses to investigate hypotheses related to divergence. Refinements included the discovery of a seemingly essential ribosomal protein, several virulence-associated factors, and a transcriptional regulator, among other proteins, most of which are annotated as hypothetical, that were missed during annotation.
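
One common way to relate an observed peptide to an un-annotated genome, in the spirit of the proteogenomic approach described above, is to search for it in all six reading-frame translations of the DNA. The sketch below shows that idea only. It is not the authors' pipeline, the function names are hypothetical, the DNA string is a toy example, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame(dna):
    """Return the three forward and three reverse-complement translations."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


def locate_peptide(peptide, dna):
    """List the reading frames whose translation contains the peptide."""
    return [name for name, protein in six_frame(dna).items() if peptide in protein]


# Toy DNA: one leading base, then codons for M-K-G-L followed by a stop.
print(locate_peptide("MKGL", "CATGAAAGGTCTGTAA"))  # ['+2']
```

A peptide that maps only to a frame with no annotated gene, or that spans an annotated frameshift, is the kind of evidence that would flag a missed or incorrect coding sequence.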
Functional genomic Landscape of Human Breast Cancer drivers, vulnerabilities, and resistance

PubMed Central

Marcotte, Richard; Sayad, Azin; Brown, Kevin R.; Sanchez-Garcia, Felix; Reimand, Jüri; Haider, Maliha; Virtanen, Carl; Bradner, James E.; Bader, Gary D.; Mills, Gordon B.; Pe’er, Dana; Moffat, Jason; Neel, Benjamin G.

2016-01-01

Summary Large-scale genomic studies have identified multiple somatic aberrations in breast cancer, including copy number alterations, and point mutations. Still, identifying causal variants and emergent vulnerabilities that arise as a consequence of genetic alterations remain major challenges. We performed whole genome shRNA “dropout screens” on 77 breast cancer cell lines. Using a hierarchical linear regression algorithm to score our screen results and integrate them with accompanying detailed genetic and proteomic information, we identify vulnerabilities in breast cancer, including candidate “drivers,” and reveal general functional genomic properties of cancer cells. Comparisons of gene essentiality with drug sensitivity data suggest potential resistance mechanisms, effects of existing anti-cancer drugs, and opportunities for combination therapy. Finally, we demonstrate the utility of this large dataset by identifying BRD4 as a potential target in luminal breast cancer, and PIK3CA mutations as a resistance determinant for BET-inhibitors. PMID:26771497
[Progress in stable isotope labeled quantitative proteomics methods].

PubMed

Zhou, Yuan; Shan, Yichu; Zhang, Lihua; Zhang, Yukui

2013-06-01

Quantitative proteomics is an important research field in the post-genomics era. There are two strategies for proteome quantification: label-free methods and stable isotope labeling methods, the latter of which have become the most important strategy for quantitative proteomics at present. In the past few years, a number of quantitative methods have been developed, which support the fast development of biological research. In this work, we discuss the progress in stable isotope labeling methods for quantitative proteomics, including relative and absolute quantitative proteomics, and then give our opinions on the outlook for proteome quantification methods.
Evolution of Clinical Proteomics and its Role in Medicine | Office of Cancer Clinical Proteomics Research

Cancer.gov

NCI's Office of Cancer Clinical Proteomics Research authored a review of the current state of clinical proteomics in the peer-reviewed Journal of Proteome Research. The review highlights outcomes from the CPTC program and also provides a thorough overview of the different technologies that have pushed the field forward. Additionally, the review provides a vision for moving the field forward through linking advances in genomic and proteomic analysis to develop new, molecularly targeted interventions.
pyGeno: A Python package for precision medicine and proteogenomics

PubMed Central

Daouda, Tariq; Perreault, Claude; Lemieux, Sébastien

2016-01-01

pyGeno is a Python package mainly intended for precision medicine applications that revolve around genomics and proteomics. It integrates reference sequences and annotations from Ensembl, genomic polymorphisms from the dbSNP database and data from next-gen sequencing into an easy to use, memory-efficient and fast framework, therefore allowing the user to easily explore subject-specific genomes and proteomes. Compared to a standalone program, pyGeno gives the user access to the complete expressivity of Python, a general programming language. Its range of application therefore encompasses both short scripts and large scale genome-wide studies. PMID:27785359
UFO: a web server for ultra-fast functional profiling of whole genome protein sequences.

PubMed

Meinicke, Peter

2009-09-02

Functional profiling is a key technique to characterize and compare the functional potential of entire genomes. The estimation of profiles according to an assignment of sequences to functional categories is a computationally expensive task because it requires the comparison of all protein sequences from a genome with a usually large database of annotated sequences or sequence families. Based on machine learning techniques for Pfam domain detection, the UFO web server for ultra-fast functional profiling allows researchers to process large protein sequence collections instantaneously. Besides the frequencies of Pfam and GO categories, the user also obtains the sequence specific assignments to Pfam domain families. In addition, a comparison with existing genomes provides dissimilarity scores with respect to 821 reference proteomes. Considering the underlying UFO domain detection, the results on 206 test genomes indicate a high sensitivity of the approach. In comparison with current state-of-the-art HMMs, the runtime measurements show a considerable speed up in the range of four orders of magnitude. For an average size prokaryotic genome, the computation of a functional profile together with its comparison typically requires about 10 seconds of processing time. For the first time the UFO web server makes it possible to get a quick overview on the functional inventory of newly sequenced organisms. The genome scale comparison with a large number of precomputed profiles allows a first guess about functionally related organisms. The service is freely available and does not require user registration or specification of a valid email address.
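
The idea of a functional profile and of a dissimilarity score between two profiles can be sketched as follows. The abstract does not say which dissimilarity measure UFO computes, so half the L1 distance between relative-frequency profiles is used here purely as an assumed stand-in. The domain hit lists are toy data and the function names are hypothetical.

```python
from collections import Counter


def functional_profile(domain_hits):
    """Turn a list of per-protein Pfam family hits into relative frequencies."""
    counts = Counter(domain_hits)
    total = sum(counts.values())
    return {family: n / total for family, n in counts.items()}


def profile_dissimilarity(p, q):
    """Half the L1 distance: 0 for identical profiles, 1 for disjoint ones.

    The choice of measure is an assumption, not the one used by the server.
    """
    families = set(p) | set(q)
    return 0.5 * sum(abs(p.get(f, 0.0) - q.get(f, 0.0)) for f in families)


# Toy hit lists for two hypothetical genomes.
genome_a = ["PF00005", "PF00005", "PF00072", "PF00486"]
genome_b = ["PF00005", "PF00072", "PF00072", "PF00528"]

score = profile_dissimilarity(functional_profile(genome_a), functional_profile(genome_b))
print(score)  # 0.5
```

Comparing a new profile against a set of precomputed reference profiles then reduces to evaluating this score once per reference and sorting the results.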
Proteomics of eukaryotic microorganisms: The medically and biotechnologically important fungal genus Aspergillus.

PubMed

Kniemeyer, Olaf

2011-08-01

Fungal species of the genus Aspergillus play significant roles as model organisms in basic research, as "cell factories" for the production of organic acids, pharmaceuticals or industrially important enzymes and as pathogens causing superficial and invasive infections in animals and humans. The release of the genome sequences of several Aspergillus sp. has paved the way for global analyses of protein expression in Aspergilli, including the characterisation of proteins that have not yet been assigned any function. With the application of proteomic methods, particularly 2-D gel and LC-MS/MS-based methods, first insights into the composition of the proteome of Aspergilli under different growth and stress conditions could be gained. The identification of putative targets of global regulators has led to the improvement of industrially relevant Aspergillus strains, and previously undescribed Aspergillus antigens have already been discovered. Here, I review the recent proteome data generated for the species Aspergillus nidulans, Aspergillus fumigatus, Aspergillus niger, Aspergillus terreus, Aspergillus flavus and Aspergillus oryzae. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Plant functional genomics

NASA Astrophysics Data System (ADS)

Holtorf, Hauke; Guitton, Marie-Christine; Reski, Ralf

2002-04-01

Functional genome analysis of plants has entered the high-throughput stage. The complete genome information from key species such as Arabidopsis thaliana and rice is now available and will further boost the application of a range of new technologies to functional plant gene analysis. To broadly assign functions to unknown genes, different fast and multiparallel approaches are currently used and developed. These new technologies are based on known methods but are adapted and improved to accommodate for comprehensive, large-scale gene analysis, i.e. such techniques are novel in the sense that their design allows researchers to analyse many genes at the same time and at an unprecedented pace. Such methods allow analysis of the different constituents of the cell that help to deduce gene function, namely the transcripts, proteins and metabolites. Similarly the phenotypic variations of entire mutant collections can now be analysed in a much faster and more efficient way than before. The different methodologies have developed to form their own fields within the functional genomics technological platform and are termed transcriptomics, proteomics, metabolomics and phenomics. Gene function, however, cannot solely be inferred by using only one such approach. Rather, it is only by bringing together all the information collected by different functional genomic tools that one will be able to unequivocally assign functions to unknown plant genes. This review focuses on current technical developments and their impact on the field of plant functional genomics. The lower plant Physcomitrella is introduced as a new model system for gene function analysis, owing to its high rate of homologous recombination.
The Cancer Genome Atlas Clinical Explorer: a web and mobile interface for identifying clinical-genomic driver associations.

PubMed

Lee, HoJoon; Palm, Jennifer; Grimes, Susan M; Ji, Hanlee P

2015-10-27

The Cancer Genome Atlas (TCGA) project has generated genomic data sets covering over 20 malignancies. These data provide valuable insights into the underlying genetic and genomic basis of cancer. However, exploring the relationship among TCGA genomic results and clinical phenotype remains a challenge, particularly for individuals lacking formal bioinformatics training. Overcoming this hurdle is an important step toward the wider clinical translation of cancer genomic/proteomic data and implementation of precision cancer medicine. Several websites such as the cBio portal or University of California Santa Cruz genome browser make TCGA data accessible but lack interactive features for querying clinically relevant phenotypic associations with cancer drivers. To enable exploration of the clinical-genomic driver associations from TCGA data, we developed the Cancer Genome Atlas Clinical Explorer. The Cancer Genome Atlas Clinical Explorer interface provides a straightforward platform to query TCGA data using one of the following methods: (1) searching for clinically relevant genes, micro RNAs, and proteins by name, cancer types, or clinical parameters; (2) searching for genomic/proteomic profile changes by clinical parameters in a cancer type; or (3) testing two-hit hypotheses. SQL queries run in the background and results are displayed on our portal in an easy-to-navigate interface according to user's input. To derive these associations, we relied on elastic-net estimates of optimal multiple linear regularized regression and clinical parameters in the space of multiple genomic/proteomic features provided by TCGA data. Moreover, we identified and ranked gene/micro RNA/protein predictors of each clinical parameter for each cancer. The robustness of the results was estimated by bootstrapping. 
Overall, we identify associations of potential clinical relevance among genes/micro RNAs/proteins using our statistical analysis from 25 cancer types and 18 clinical parameters that include clinical stage or smoking history. The Cancer Genome Atlas Clinical Explorer enables the cancer research community and others to explore clinically relevant associations inferred from TCGA data. With its accessible web and mobile interface, users can examine queries and test hypotheses regarding genomic/proteomic alterations across a broad spectrum of malignancies.
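The record above describes ranking predictors with elastic-net regularized regression and estimating robustness by bootstrapping. The following is a minimal sketch of that general idea, not the authors' pipeline: the coordinate-descent solver, the penalty settings (`lam`, `alpha`), the function names, and the toy data are all assumptions chosen for illustration.

```python
import random

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 part of the penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def elastic_net(X, y, lam=0.3, alpha=0.5, n_iter=100):
    """Coordinate descent for
    (1/2n)*||y - Xb||^2 + lam*(alpha*||b||_1 + (1 - alpha)/2*||b||_2^2).
    No intercept is fitted: features and response are assumed roughly centred."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # covariance of feature j with the residual that ignores j
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - others)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            beta[j] = soft_threshold(rho, lam * alpha) / (z + lam * (1 - alpha))
    return beta

def bootstrap_selection_frequency(X, y, n_boot=50, seed=0):
    """Refit on resampled rows; report how often each coefficient is nonzero."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        beta = elastic_net([X[i] for i in idx], [y[i] for i in idx])
        for j in range(p):
            if beta[j] != 0.0:
                counts[j] += 1
    return [c / n_boot for c in counts]

# Hypothetical toy data: only feature 0 actually drives the response.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(40)]
y = [2.0 * row[0] + 0.1 * data_rng.gauss(0, 1) for row in X]
freq = bootstrap_selection_frequency(X, y)
print(freq)
```

A predictor that keeps a nonzero coefficient in most bootstrap replicates would be considered a robust association under this scheme; a production analysis would normally use an established library and tune the penalties by cross-validation.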
Impact of SNPs on Protein Phosphorylation Status in Rice (Oryza sativa L.).

PubMed

Lin, Shoukai; Chen, Lijuan; Tao, Huan; Huang, Jian; Xu, Chaoqun; Li, Lin; Ma, Shiwei; Tian, Tian; Liu, Wei; Xue, Lichun; Ai, Yufang; He, Huaqin

2016-11-11

Single nucleotide polymorphisms (SNPs) are widely used in functional genomics and genetics research work. The high-quality sequence of the rice genome has provided a genome-wide SNP and proteome resource. However, the impact of SNPs on protein phosphorylation status in rice is not fully understood. In this paper, we first updated the rice SNP resource based on the new rice genome Ver. 7.0, then systematically analyzed the potential impact of non-synonymous SNPs (nsSNPs) on protein phosphorylation status. There were 3,897,312 SNPs in the Ver. 7.0 rice genome, among which 9.9% were nsSNPs. Meanwhile, a total of 2,508,261 phosphorylated sites were predicted in the rice proteome. Interestingly, we observed that 150,197 (39.1%) nsSNPs could influence protein phosphorylation status, among which 52.2% might induce changes of protein kinase (PK) types for adjacent phosphorylation sites. We constructed a database, SNP_rice, to deposit the updated rice SNP resource and phosSNPs information. It was freely available to academic researchers at http://bioinformatics.fafu.edu.cn. As a case study, we detected five nsSNPs that potentially influenced heterotrimeric G protein phosphorylation status in rice, indicating that genetic polymorphisms showed impact on signal transduction by influencing the phosphorylation status of heterotrimeric G proteins. The results in this work could be a useful resource for future experimental identification and provide interesting information for better rice breeding.
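The counts quoted in the rice SNP record above can be cross-checked with simple arithmetic. The sketch below uses only the figures stated in the abstract; the variable names are illustrative, and the comparison assumes that the 39.1% is expressed relative to all nsSNPs, which the abstract implies but does not state outright.

```python
# Figures quoted in the abstract.
total_snps = 3_897_312
ns_fraction = 0.099          # "9.9%" of SNPs are nsSNPs
phos_affecting = 150_197     # nsSNPs that could influence phosphorylation
phos_share = 0.391           # "39.1%" of nsSNPs (assumed denominator)

# nsSNP count estimated two ways.
ns_from_total = total_snps * ns_fraction      # from the overall SNP count
ns_implied = phos_affecting / phos_share      # from the phosphorylation share

# The two estimates should agree to within the rounding of the percentages.
rel_gap = abs(ns_from_total - ns_implied) / ns_from_total

print(round(ns_from_total), round(ns_implied), f"{rel_gap:.2%}")
```

Both routes give roughly 385,000 nsSNPs, differing by well under 1%, so the two percentages reported in the abstract are mutually consistent at the stated precision.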

CyanOmics: an integrated database of omics for the model cyanobacterium Synechococcus sp. PCC 7002.

PubMed

Yang, Yaohua; Feng, Jie; Li, Tao; Ge, Feng; Zhao, Jindong

2015-01-01

Cyanobacteria are an important group of organisms that carry out oxygenic photosynthesis and play vital roles in both the carbon and nitrogen cycles of the Earth. The annotated genome of Synechococcus sp. PCC 7002, as an ideal model cyanobacterium, is available. A series of transcriptomic and proteomic studies of Synechococcus sp. PCC 7002 cells grown under different conditions have been reported. However, no database of such integrated omics studies has been constructed. Here we present CyanOmics, a database based on the results of Synechococcus sp. PCC 7002 omics studies. CyanOmics comprises one genomic dataset, 29 transcriptomic datasets and one proteomic dataset and should prove useful for systematic and comprehensive analysis of all those data. Powerful browsing and searching tools are integrated to help users directly access information of interest with enhanced visualization of the analytical results. Furthermore, Blast is included for sequence-based similarity searching and Cluster 3.0, as well as the R hclust function is provided for cluster analyses, to increase CyanOmics's usefulness. To the best of our knowledge, it is the first integrated omics analysis database for cyanobacteria. This database should further understanding of the transcriptional patterns, and proteomic profiling of Synechococcus sp. PCC 7002 and other cyanobacteria. Additionally, the entire database framework is applicable to any sequenced prokaryotic genome and could be applied to other integrated omics analysis projects. Database URL: http://lag.ihb.ac.cn/cyanomics. © The Author(s) 2015. Published by Oxford University Press.
Exploration of Panviral Proteome: High-Throughput Cloning and Functional Implications in Virus-host Interactions

PubMed Central

Yu, Xiaobo; Bian, Xiaofang; Throop, Andrea; Song, Lusheng; Moral, Lerys Del; Park, Jin; Seiler, Catherine; Fiacco, Michael; Steel, Jason; Hunter, Preston; Saul, Justin; Wang, Jie; Qiu, Ji; Pipas, James M.; LaBaer, Joshua

2014-01-01

Throughout the long history of virus-host co-evolution, viruses have developed delicate strategies to facilitate their invasion and replication of their genome, while silencing the host immune responses through various mechanisms. The systematic characterization of viral protein-host interactions would yield invaluable information in the understanding of viral invasion/evasion, diagnosis and therapeutic treatment of a viral infection, and mechanisms of host biology. With more than 2,000 viral genomes sequenced, only a small percent of them are well investigated. The access of these viral open reading frames (ORFs) in a flexible cloning format would greatly facilitate both in vitro and in vivo virus-host interaction studies. However, the overall progress of viral ORF cloning has been slow. To facilitate viral studies, we are releasing the initiation of our panviral proteome collection of 2,035 ORF clones from 830 viral genes in the Gateway® recombinational cloning system. Here, we demonstrate several uses of our viral collection including highly efficient production of viral proteins using human cell-free expression system in vitro, global identification of host targets for rubella virus using Nucleic Acid Programmable Protein Arrays (NAPPA) containing 10,000 unique human proteins, and detection of host serological responses using micro-fluidic multiplexed immunoassays. The studies presented here begin to elucidate host-viral protein interactions with our systemic utilization of viral ORFs, high-throughput cloning, and proteomic technologies. These valuable plasmid resources will be available to the research community to enable continued viral functional studies. PMID:24955142
Stepwise Evolution of Coral Biomineralization Revealed with Genome-Wide Proteomics and Transcriptomics

PubMed Central

Sawada, Hitoshi; Satoh, Noriyuki

2016-01-01

Despite the importance of stony corals in many research fields related to global issues, such as marine ecology, climate change, paleoclimatology, and metazoan evolution, very little is known about the evolutionary origin of coral skeleton formation. In order to investigate the evolution of coral biomineralization, we have identified skeletal organic matrix proteins (SOMPs) in the skeletal proteome of the scleractinian coral, Acropora digitifera, for which large genomic and transcriptomic datasets are available. Scrupulous gene annotation was conducted based on comparisons of functional domain structures among metazoans. We found that SOMPs include not only coral-specific proteins, but also protein families that are widely conserved among cnidarians and other metazoans. We also identified several conserved transmembrane proteins in the skeletal proteome. Gene expression analysis revealed that expression of these conserved genes continues throughout development. Therefore, these genes are involved not only in skeleton formation, but also in basic cellular functions, such as cell-cell interaction and signaling. On the other hand, genes encoding coral-specific proteins, including extracellular matrix domain-containing proteins, galaxins, and acidic proteins, were prominently expressed in post-settlement stages, indicating their role in skeleton formation. Taken together, the process of coral skeleton formation is hypothesized as: 1) formation of initial extracellular matrix between epithelial cells and substrate, employing pre-existing transmembrane proteins; 2) additional extracellular matrix formation using novel proteins that have emerged by domain shuffling and rapid molecular evolution; and 3) calcification controlled by coral-specific SOMPs. PMID:27253604
Genome-wide identification of pathogenicity factors of the free-living amoeba Naegleria fowleri.

PubMed

Zysset-Burri, Denise C; Müller, Norbert; Beuret, Christian; Heller, Manfred; Schürch, Nadia; Gottstein, Bruno; Wittwer, Matthias

2014-06-19

The free-living amoeba Naegleria fowleri is the causative agent of the rapidly progressing and typically fatal primary amoebic meningoencephalitis (PAM) in humans. Despite the devastating nature of this disease, which results in >97% mortality, knowledge of the pathogenic mechanisms of the amoeba is incomplete. This work presents a comparative proteomic approach based on an experimental model in which the pathogenic potential of N. fowleri trophozoites is influenced by the compositions of different media. As a scaffold for proteomic analysis, we sequenced the genome and transcriptome of N. fowleri. Since the sequence similarity of the recently published genome of Naegleria gruberi was far lower than the close taxonomic relationship of these species would suggest, a de novo sequencing approach was chosen. After excluding cell regulatory mechanisms originating from different media compositions, we identified 22 proteins with a potential role in the pathogenesis of PAM. Functional annotation of these proteins revealed that the membrane is the major location where the amoeba exerts its pathogenic potential, possibly involving actin-dependent processes such as intracellular trafficking via vesicles. This study describes for the first time the 30 Mb genome and the transcriptome sequence of N. fowleri and provides the basis for the further definition of effective intervention strategies against the rare but highly fatal form of amoebic meningoencephalitis.
Directed proteomic analysis of the human nucleolus.

PubMed

Andersen, Jens S; Lyon, Carol E; Fox, Archa H; Leung, Anthony K L; Lam, Yun Wah; Steen, Hanno; Mann, Matthias; Lamond, Angus I

2002-01-08

The nucleolus is a subnuclear organelle containing the ribosomal RNA gene clusters and ribosome biogenesis factors. Recent studies suggest it may also have roles in RNA transport, RNA modification, and cell cycle regulation. Despite over 150 years of research into nucleoli, many aspects of their structure and function remain uncharacterized. We report a proteomic analysis of human nucleoli. Using a combination of mass spectrometry (MS) and sequence database searches, including online analysis of the draft human genome sequence, 271 proteins were identified. Over 30% of the nucleolar proteins were encoded by novel or uncharacterized genes, while the known proteins included several unexpected factors with no previously known nucleolar functions. MS analysis of nucleoli isolated from HeLa cells in which transcription had been inhibited showed that a subset of proteins was enriched. These data highlight the dynamic nature of the nucleolar proteome and show that proteins can either associate with nucleoli transiently or accumulate only under specific metabolic conditions. This extensive proteomic analysis shows that nucleoli have a surprisingly large protein complexity. The many novel factors and separate classes of proteins identified support the view that the nucleolus may perform additional functions beyond its known role in ribosome subunit biogenesis. The data also show that the protein composition of nucleoli is not static and can alter significantly in response to the metabolic state of the cell.
MEGGASENSE - The Metagenome/Genome Annotated Sequence Natural Language Search Engine: A Platform for the Construction of Sequence Data Warehouses.

PubMed

Gacesa, Ranko; Zucko, Jurica; Petursdottir, Solveig K; Gudmundsdottir, Elisabet Eik; Fridjonsson, Olafur H; Diminic, Janko; Long, Paul F; Cullum, John; Hranueli, Daslav; Hreggvidsson, Gudmundur O; Starcevic, Antonio

2017-06-01

The MEGGASENSE platform constructs relational databases of DNA or protein sequences. The default functional analysis uses 14,106 hidden Markov model (HMM) profiles based on sequences in the KEGG database. The Solr search engine allows sophisticated queries and a BLAST search function is also incorporated. These standard capabilities were used to generate the SCATT database from the predicted proteome of Streptomyces cattleya. The implementation of a specialised metagenome database (AMYLOMICS) for bioprospecting of carbohydrate-modifying enzymes is described. In addition to standard assembly of reads, a novel 'functional' assembly was developed, in which screening of reads with the HMM profiles occurs before the assembly. The AMYLOMICS database incorporates additional HMM profiles for carbohydrate-modifying enzymes and it is illustrated how the combination of HMM and BLAST analyses helps identify interesting genes. A variety of different proteome and metagenome databases have been generated by MEGGASENSE.
Exploring hepsin functional genetic variation association with disease specific protein expression in bipolar disorder: Applications of a proteomic informed genomic approach.

PubMed

Nassan, Malik; Jia, Yun-Fang; Jenkins, Greg; Colby, Colin; Feeder, Scott; Choi, Doo-Sup; Veldic, Marin; McElroy, Susan L; Bond, David J; Weinshilboum, Richard; Biernacka, Joanna M; Frye, Mark A

2017-12-01

In a prior discovery study, increased levels of serum Growth Differentiation Factor 15 (GDF15), Hepsin (HPN), and Matrix Metalloproteinase-7 (MMP7) were observed in bipolar depressed patients vs controls. This exploratory post-hoc analysis applied a proteomic-informed genomic research strategy to study the potential functional role of these proteins in bipolar disorder (BP). Utilizing the Genotype-Tissue Expression (GTEx) database to identify cis-acting blood expression quantitative trait loci (cis-eQTLs), five eQTL variants from the HPN gene were analyzed for association with BP cases using genotype data of cases from the discovery study (n = 58) versus healthy controls (n = 777). After adjusting for relevant covariates, we analyzed the relationship between these 5 cis-eQTLs and HPN serum level in the BP cases. All 5 cis-eQTL minor alleles were significantly more frequent in BP cases vs controls [(rs62122114, OR = 1.6, p = 0.02), (rs67003112, OR = 1.6, p = 0.02), (rs4997929, OR = 1.7, p = 0.01), (rs12610663, OR = 1.7, p = 0.01), (rs62122148, OR = 1.7, P = 0.01)]. The minor allele (A) in rs62122114 was significantly associated with increased serum HPN level in BP cases (Beta = 0.12, P = 0.049). However, this same minor allele was associated with reduced gene expression in GTEx controls. These exploratory analyses suggest that genetic variation in/near the gene encoding for hepsin protein may influence risk of bipolar disorder. This genetic variation, at least for the rs62122114-A allele, may have functional impact (i.e. differential expression) as evidenced by serum HPN protein expression. Although limited by small sample size, this study highlights the merits of proteomic informed functional genomic studies as a tool to investigate with greater precision the genetic risk of bipolar disorder and secondary relationships to protein expression recognizing, and encouraging in subsequent studies, high likelihood of epigenetic modification of genetic disease risk. 
Copyright © 2017. Published by Elsevier Ltd.
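The hepsin record above reports odds ratios of about 1.6 to 1.7 for minor alleles in cases versus controls. As a rough illustration of what such a figure means, the sketch below computes an unadjusted allelic odds ratio with a Woolf (log-scale) 95% confidence interval. The allele counts are hypothetical, invented only so that the totals match 58 cases and 777 controls (two alleles each); they are not data from the study, whose reported estimates may come from a covariate-adjusted model.

```python
import math

def allelic_odds_ratio(case_minor, case_major, ctrl_minor, ctrl_major, z=1.96):
    """Unadjusted odds ratio for carrying the minor allele, cases vs controls,
    with a Woolf confidence interval computed on the log scale."""
    odds_ratio = (case_minor * ctrl_major) / (case_major * ctrl_minor)
    se_log = math.sqrt(
        1 / case_minor + 1 / case_major + 1 / ctrl_minor + 1 / ctrl_major
    )
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se_log), math.exp(log_or + z * se_log)

# HYPOTHETICAL counts (not from the paper): 58 cases -> 116 alleles,
# 777 controls -> 1554 alleles.
or_est, ci_low, ci_high = allelic_odds_ratio(
    case_minor=40, case_major=76, ctrl_minor=380, ctrl_major=1174
)
print(f"OR = {or_est:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

With a case group this small the interval is wide, which is in line with the abstract's own caveat about limited sample size.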
The most common technologies and tools for functional genome analysis.

PubMed

Gasperskaja, Evelina; Kučinskas, Vaidutis

2017-01-01

Since the sequence of the human genome is complete, the main issue is how to understand the information written in the DNA sequence. Despite numerous genome-wide studies that have already been performed, the challenge to determine the function of genes, gene products, and also their interaction is still open. As changes in the human genome are highly likely to cause pathological conditions, functional analysis is vitally important for human health. For many years a variety of technologies and tools have been used in functional genome analysis. However, only in the past decade has there been rapid, revolutionizing progress and improvement in high-throughput methods, which range from traditional real-time polymerase chain reaction to more complex systems, such as next-generation sequencing or mass spectrometry. Furthermore, not only laboratory investigation, but also accurate bioinformatic analysis is required for reliable scientific results. These methods give an opportunity for accurate and comprehensive functional analysis that involves various fields of study: genomics, epigenomics, proteomics, and interactomics. This is essential for filling the gaps in the knowledge about dynamic biological processes at both the cellular and organismal level. However, each method has both advantages and limitations that should be taken into account before choosing the right method for particular research in order to ensure a successful study. For this reason, the present review paper aims to describe the most frequent and widely used methods for comprehensive functional analysis.
Open reading frames associated with cancer in the dark matter of the human genome.

PubMed

Delgado, Ana Paula; Brandao, Pamela; Chapado, Maria Julia; Hamid, Sheilin; Narayanan, Ramaswamy

2014-01-01

The uncharacterized proteins (open reading frames, ORFs) in the human genome offer an opportunity to discover novel targets for cancer. A systematic analysis of the dark matter of the human proteome for druggability and biomarker discovery is crucial to mining the genome. Numerous data mining tools are available to mine these ORFs to develop a comprehensive knowledge base for future target discovery and validation. Using the Genetic Association Database, the ORFs of the human dark matter proteome were screened for evidence of association with neoplasms. The Phenome-Genome Integrator tool was used to establish phenotypic association with disease traits including cancer. Batch analysis of the tools for protein expression analysis, gene ontology and motifs and domains was used to characterize the ORFs. Sixty-two ORFs were identified for neoplasm association. The expression Quantitative Trait Loci (eQTL) analysis identified thirteen ORFs related to cancer traits. Protein expression, motifs and domain analysis and genome-wide association studies verified the relevance of these OncoORFs in diverse tumors. The OncoORFs are also associated with a wide variety of human diseases and disorders. Our results link the OncoORFs to diverse diseases and disorders. This suggests a complex landscape of the uncharacterized proteome in human diseases. These results open the dark matter of the proteome to novel cancer target research. Copyright© 2014, International Institute of Anticancer Research (Dr. John G. Delinasios), All rights reserved.
Hands-on workshops as an effective means of learning advanced technologies including genomics, proteomics and bioinformatics.

PubMed

Reisdorph, Nichole; Stearman, Robert; Kechris, Katerina; Phang, Tzu Lip; Reisdorph, Richard; Prenni, Jessica; Erle, David J; Coldren, Christopher; Schey, Kevin; Nesvizhskii, Alexey; Geraci, Mark

2013-12-01

Genomics and proteomics have emerged as key technologies in biomedical research, resulting in a surge of interest in training by investigators keen to incorporate these technologies into their research. At least two types of training can be envisioned in order to produce meaningful results, quality publications and successful grant applications: (1) immediate short-term training workshops and (2) long-term graduate education or visiting scientist programs. We aimed to fill the former need by providing a comprehensive hands-on training course in genomics, proteomics and informatics in a coherent, experimentally-based framework. This was accomplished through a National Heart, Lung, and Blood Institute (NHLBI)-sponsored 10-day Genomics and Proteomics Hands-on Workshop held at National Jewish Health (NJH) and the University of Colorado School of Medicine (UCD). The course content included comprehensive lectures and laboratories in mass spectrometry and genomics technologies, extensive hands-on experience with instrumentation and software, video demonstrations, optional workshops, online sessions, invited keynote speakers, and local and national guest faculty. Here we describe the detailed curriculum and present the results of short- and long-term evaluations from course attendees. Our educational program consistently received positive reviews from participants and had a substantial impact on grant writing and review, manuscript submissions and publications. Copyright © 2013. Production and hosting by Elsevier Ltd.
Proteome Characterization Centers - TCGA

Cancer.gov

The centers, a component of NCI’s Clinical Proteomic Tumor Analysis Consortium, will analyze a subset of TCGA samples to define proteins translated from cancer genomes and their related biological processes.
Development of Advanced Technologies for Complete Genomic and Proteomic Characterization of Quantized Human Tumor Cells

DTIC Science & Technology

2015-09-01

glioblastoma. We have successfully established several patient-derived cell lines from glioblastoma tumors and further established a number of...and single-cell technologies. Although the focus of this research is glioblastoma, the proposed tools are generally applicable to all cancer-based...studies. 15. SUBJECT TERMS Human cohorts, Glioblastoma, Genomic, Proteomic, Single-cell technologies, Hypothesis-driven, integrative systems approach
Impact of nanoscale topography on genomics and proteomics of adherent bacteria.

PubMed

Rizzello, Loris; Sorce, Barbara; Sabella, Stefania; Vecchio, Giuseppe; Galeone, Antonio; Brunetti, Virgilio; Cingolani, Roberto; Pompa, Pier Paolo

2011-03-22

Bacterial adhesion onto inorganic/nanoengineered surfaces is a key issue in biotechnology and medicine, because it is one of the first necessary steps to determine a general pathogenic event. Understanding the molecular mechanisms of bacteria-surface interaction represents a milestone for planning a new generation of devices with unanimously certified antibacterial characteristics. Here, we show how highly controlled nanostructured substrates impact the bacterial behavior in terms of morphological, genomic, and proteomic response. We observed by atomic force microscopy (AFM) and scanning electron microscopy (SEM) that type-1 fimbriae typically disappear in Escherichia coli adherent onto nanostructured substrates, as opposed to bacteria onto reference glass or flat gold surfaces. A genetic variation of the fimbrial operon regulation was consistently identified by real time qPCR in bacteria interacting with the nanorough substrates. To gain a deeper insight into the molecular basis of the interaction mechanisms, we explored the entire proteomic profile of E. coli by 2D-DIGE, finding significant changes in the bacteria adherent onto the nanorough substrates, such as regulations of proteins involved in stress processes and defense mechanisms. We thus demonstrated that a pure physical stimulus, that is, a nanoscale variation of surface topography, may play per se a significant role in determining the morphological, genetic, and proteomic profile of bacteria. These data suggest that in depth investigations of the molecular processes of microorganisms adhering to surfaces are of great importance for the design of innovative biomaterials with active biological functionalities.
Genome-resolved metaproteomic characterization of preterm infant gut microbiota development reveals species-specific metabolic shifts and variabilities during early life

DOE PAGES

Xiong, Weili; Brown, Christopher T.; Morowitz, Michael J.; ...

2017-07-10

Establishment of the human gut microbiota begins at birth. This early-life microbiota development can impact host physiology during infancy and even across an entire life span. But, the functional stability and population structure of the gut microbiota during initial colonization remain poorly understood. Metaproteomics is an emerging technology for the large-scale characterization of metabolic functions in complex microbial communities (gut microbiota). We applied a metagenome-informed metaproteomic approach to study the temporal and inter-individual differences of metabolic functions during microbial colonization of preterm human infants’ gut. By analyzing 30 individual fecal samples, we identified up to 12,568 protein groups for each of four infants, including both human and microbial proteins. With genome-resolved matched metagenomics, proteins were confidently identified at the species/strain level. The maximum percentage of the proteome detected for the abundant organisms was ~45%. A time-dependent increase in the relative abundance of microbial versus human proteins suggested increasing microbial colonization during the first few weeks of early life. We observed remarkable variations and temporal shifts in the relative protein abundances of each organism in these preterm gut communities. Given the dissimilarity of the communities, only 81 microbial EggNOG orthologous groups and 57 human proteins were observed across all samples. These conserved microbial proteins were involved in carbohydrate, energy, amino acid and nucleotide metabolism while conserved human proteins were related to immune response and mucosal maturation. We also identified seven proteome clusters for the communities and showed infant gut proteome profiles were unstable across time and not individual-specific. 
By applying a gut-specific metabolic module (GMM) analysis, we found that gut communities varied primarily in the contribution of nutrient (carbohydrates, lipids, and amino acids) utilization and short-chain fatty acid production. Overall, this study reports species-specific proteome profiles and metabolic functions of human gut microbiota during early colonization. In particular, our work contributes to reveal microbiota-associated shifts and variations in the metabolism of three major nutrient sources and short-chain fatty acid during colonization of preterm infant gut.
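The record above notes that only a small set of orthologous groups was observed across all samples. One simple way to derive such a shared set is a set intersection over per-sample identifications, sketched below. The sample names and group identifiers are hypothetical placeholders, and the study itself may have applied additional abundance or confidence filters that are not reproduced here.

```python
# HYPOTHETICAL per-sample identifications (placeholder sample and group IDs).
detected = {
    "infant1_day10": {"OG_A", "OG_B", "OG_C", "OG_D"},
    "infant1_day20": {"OG_A", "OG_B", "OG_D"},
    "infant2_day12": {"OG_A", "OG_B", "OG_E"},
    "infant2_day25": {"OG_A", "OG_B", "OG_C", "OG_E"},
}

# Groups observed in every sample: the "conserved" set.
core = set.intersection(*detected.values())

# Everything seen in at least one sample, for comparison.
pan = set.union(*detected.values())

print(sorted(core), f"{len(core)}/{len(pan)} groups shared by all samples")
```

The ratio of shared to total groups gives a quick sense of how dissimilar the communities are: the smaller the shared fraction, the more sample-specific the detected proteome.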
Genome-resolved metaproteomic characterization of preterm infant gut microbiota development reveals species-specific metabolic shifts and variabilities during early life

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xiong, Weili; Brown, Christopher T.; Morowitz, Michael J.

Establishment of the human gut microbiota begins at birth. This early-life microbiota development can impact host physiology during infancy and even across an entire life span. But, the functional stability and population structure of the gut microbiota during initial colonization remain poorly understood. Metaproteomics is an emerging technology for the large-scale characterization of metabolic functions in complex microbial communities (gut microbiota). We applied a metagenome-informed metaproteomic approach to study the temporal and inter-individual differences of metabolic functions during microbial colonization of preterm human infants’ gut. By analyzing 30 individual fecal samples, we identified up to 12,568 protein groups for eachmore » of four infants, including both human and microbial proteins. With genome-resolved matched metagenomics, proteins were confidently identified at the species/strain level. The maximum percentage of the proteome detected for the abundant organisms was ~45%. A time-dependent increase in the relative abundance of microbial versus human proteins suggested increasing microbial colonization during the first few weeks of early life. We observed remarkable variations and temporal shifts in the relative protein abundances of each organism in these preterm gut communities. Given the dissimilarity of the communities, only 81 microbial EggNOG orthologous groups and 57 human proteins were observed across all samples. These conserved microbial proteins were involved in carbohydrate, energy, amino acid and nucleotide metabolism while conserved human proteins were related to immune response and mucosal maturation. We also identified seven proteome clusters for the communities and showed infant gut proteome profiles were unstable across time and not individual-specific. 
By applying a gut-specific metabolic module (GMM) analysis, we found that gut communities varied primarily in the contribution of nutrient (carbohydrates, lipids, and amino acids) utilization and short-chain fatty acid production. Overall, this study reports species-specific proteome profiles and metabolic functions of human gut microbiota during early colonization. In particular, our work helps reveal microbiota-associated shifts and variations in the metabolism of three major nutrient sources and short-chain fatty acids during colonization of the preterm infant gut.
Genome-resolved metaproteomic characterization of preterm infant gut microbiota development reveals species-specific metabolic shifts and variabilities during early life.

PubMed

Xiong, Weili; Brown, Christopher T; Morowitz, Michael J; Banfield, Jillian F; Hettich, Robert L

2017-07-10

Establishment of the human gut microbiota begins at birth. This early-life microbiota development can impact host physiology during infancy and even across an entire life span. However, the functional stability and population structure of the gut microbiota during initial colonization remain poorly understood. Metaproteomics is an emerging technology for the large-scale characterization of metabolic functions in complex microbial communities (gut microbiota). We applied a metagenome-informed metaproteomic approach to study the temporal and inter-individual differences of metabolic functions during microbial colonization of preterm human infants' gut. By analyzing 30 individual fecal samples, we identified up to 12,568 protein groups for each of four infants, including both human and microbial proteins. With genome-resolved matched metagenomics, proteins were confidently identified at the species/strain level. The maximum percentage of the proteome detected for the abundant organisms was ~45%. A time-dependent increase in the relative abundance of microbial versus human proteins suggested increasing microbial colonization during the first few weeks of early life. We observed remarkable variations and temporal shifts in the relative protein abundances of each organism in these preterm gut communities. Given the dissimilarity of the communities, only 81 microbial EggNOG orthologous groups and 57 human proteins were observed across all samples. These conserved microbial proteins were involved in carbohydrate, energy, amino acid and nucleotide metabolism while conserved human proteins were related to immune response and mucosal maturation. We identified seven proteome clusters for the communities and showed infant gut proteome profiles were unstable across time and not individual-specific. 
Applying a gut-specific metabolic module (GMM) analysis, we found that gut communities varied primarily in the contribution of nutrient (carbohydrates, lipids, and amino acids) utilization and short-chain fatty acid production. Overall, this study reports species-specific proteome profiles and metabolic functions of human gut microbiota during early colonization. In particular, our work helps reveal microbiota-associated shifts and variations in the metabolism of three major nutrient sources and short-chain fatty acids during colonization of the preterm infant gut.
Ascribing Functions to Genes: Journey Towards Genetic Improvement of Rice Via Functional Genomics

PubMed Central

Mustafiz, Ananda; Kumari, Sumita; Karan, Ratna

2016-01-01

Rice, one of the most important cereal crops for mankind, feeds more than half the world population. Rice has been heralded as a model cereal owing to its small genome size, amenability to easy transformation, high synteny to other cereal crops and availability of complete genome sequence. Moreover, sequence wealth in rice is getting more refined and precise due to resequencing efforts. This enormous resource of sequence data has confronted the research community with a herculean challenge as well as an excellent opportunity to functionally validate expressed as well as regulatory portions of the genome. This will not only help us in understanding the genetic basis of plant architecture and physiology but would also steer us towards developing improved cultivars. No single technique can achieve such a mammoth task. Functional genomics, through its diverse tools, viz. loss- and gain-of-function mutants and multifarious omics strategies such as transcriptomics, proteomics, metabolomics and phenomics, provides us with the necessary handle. A paradigm shift in technological advances in functional genomics strategies has been instrumental in generating a considerable amount of information with respect to the functionality of the rice genome. We now have several databases and online resources for functionally validated genes, but despite that we are far from reaching the desired milestone of functionally characterizing each and every rice gene. There is an urgent need for a common platform for information already available in rice, and for concerted collaborative efforts between researchers as well as healthy public-private partnership, for genetic improvement of a rice crop better able to handle the pressures of climate change and an exponentially increasing population. PMID:27252584
Responding to a Zoonotic Emergency with Multi-omics Research: Pentatrichomonas hominis Hydrogenosomal Protein Characterization with Use of RNA Sequencing and Proteomics.

PubMed

Fang, Yi-Kai; Chien, Kun-Yi; Huang, Kuo-Yang; Cheng, Wei-Hung; Ku, Fu-Mann; Lin, Rose; Chen, Ting-Wen; Huang, Po-Jung; Chiu, Cheng-Hsun; Tang, Petrus

2016-11-01

Pentatrichomonas hominis is an anaerobic flagellated protist that colonizes the large intestine of a number of mammals, including cats, dogs, nonhuman primates, and humans. The wide host range of this organism is alarming and suggests a rising zoonotic emergency. However, knowledge of the in-depth biology of this protist is still limited. Similar to the human pathogen Trichomonas vaginalis, P. hominis possesses hydrogenosomes instead of mitochondria. Studies in T. vaginalis indicated that the hydrogenosome is essential for cell survival and associated with numerous pivotal biological functions, including drug resistance. To further decipher the biology of this important organelle, we undertook proteomic research in P. hominis hydrogenosomes. Lacking a decoded P. hominis genome, we utilized an RNA sequencing (RNA-seq) data set generated from P. hominis axenic culture as the reference for proteome analysis. Using this in-house reference data set and mass spectrometry (MS), we identified 442 putative hydrogenosomal proteins. Interestingly, the composition of the P. hominis hydrogenosomal proteins is very similar to that of T. vaginalis, but proteins such as Hmp36, Pam16, Pam18, and Isd11 are absent based on both MS and the RNA-seq. Our data underscore that P. hominis expresses different homologs of multiple gene families from T. vaginalis. To the best of our knowledge, we present here the first hydrogenosome proteome in a protist other than T. vaginalis that offers crucial new scholarship for global health, therapeutics, diagnostics, and veterinary medicine research. In addition, the research strategy used here, combining RNA sequencing and proteomics, might inform future multi-omics research in other understudied organisms without decoded genomes.
Proteomic profiling of the planarian Schmidtea mediterranea and its mucous reveals similarities with human secretions and those predicted for parasitic flatworms.

PubMed

Bocchinfuso, Donald G; Taylor, Paul; Ross, Eric; Ignatchenko, Alex; Ignatchenko, Vladimir; Kislinger, Thomas; Pearson, Bret J; Moran, Michael F

2012-09-01

The freshwater planarian Schmidtea mediterranea has been used in research for over 100 years, and is an emerging stem cell model because of its capability of regenerating large portions of missing body parts. Exteriorly, planarians are covered in mucous secretions of unknown composition, implicated in locomotion, predation, innate immunity, and substrate adhesion. Although the planarian genome has been sequenced, it remains mostly unannotated, challenging both genomic and proteomic analyses. The goal of the current study was to annotate the proteome of the whole planarian and its mucous fraction. The S. mediterranea proteome was analyzed via mass spectrometry by using multidimensional protein identification technology with whole-worm tryptic digests. By using a proteogenomics approach, MS data were searched against an in silico translated planarian transcript database, and by using the Swiss-Prot BLAST algorithm to identify proteins similar to planarian queries. A total of 1604 proteins were identified. The mucous subproteome was defined through analysis of a mucous trail fraction and an extract obtained by treating whole worms with the mucolytic agent N-acetylcysteine. Gene Ontology analysis confirmed that the mucous fractions were enriched with secreted proteins. The S. mediterranea proteome is highly similar to that predicted for the trematode Schistosoma mansoni associated with intestinal schistosomiasis, with the mucous subproteome particularly highly conserved. Remarkably, orthologs of 119 planarian mucous proteins are present in human mucosal secretions and tear fluid. We suggest planarians have potential to be a model system for the characterization of mucous protein function and relevant to parasitic flatworm infections and diseases underlined by mucous aberrancies, such as cystic fibrosis, asthma, and other lung diseases.

Post-Genomics Nanotechnology Is Gaining Momentum: Nanoproteomics and Applications in Life Sciences

PubMed Central

Kobeissy, Firas H.; Gulbakan, Basri; Alawieh, Ali; Karam, Pierre; Zhang, Zhiqun; Guingab-Cagmat, Joy D.; Mondello, Stefania; Tan, Weihong; Anagli, John

2014-01-01

The post-genomics era has brought about new Omics biotechnologies, such as proteomics and metabolomics, as well as their novel applications to personal genomics and the quantified self. These advances are now also catalyzing other and newer post-genomics innovations, leading to convergences between Omics and nanotechnology. In this work, we systematically contextualize and exemplify an emerging strand of post-genomics life sciences, namely, nanoproteomics and its applications in health and integrative biological systems. Nanotechnology has been utilized as a complementary component to revolutionize proteomics through different kinds of nanotechnology applications, including nanoporous structures, functionalized nanoparticles, quantum dots, and polymeric nanostructures. Those applications, though still in their infancy, have led to several highly sensitive diagnostics and new methods of drug delivery and targeted therapy for clinical use. The present article differs from previous analyses of nanoproteomics in that it offers an in-depth and comparative evaluation of the attendant biotechnology portfolio and their applications as seen through the lens of post-genomics life sciences and biomedicine. These include: (1) immunosensors for inflammatory, pathogenic, and autoimmune markers for infectious and autoimmune diseases, (2) amplified immunoassays for detection of cancer biomarkers, and (3) methods for targeted therapy and automatically adjusted drug delivery such as in experimental stroke and brain injury studies. As nanoproteomics becomes available both to the clinician at the bedside and the citizens who are increasingly interested in access to novel post-genomics diagnostics through initiatives such as the quantified self, we anticipate further breakthroughs in personalized and targeted medicine. PMID:24410486
Development of proteome-wide binding reagents for research and diagnostics.

PubMed

Taussig, Michael J; Schmidt, Ronny; Cook, Elizabeth A; Stoevesandt, Oda

2013-12-01

Alongside MS, antibodies and other specific protein-binding molecules have a special place in proteomics as affinity reagents in a toolbox of applications for determining protein location, quantitative distribution and function (affinity proteomics). The realisation that the range of research antibodies available, while apparently vast is nevertheless still very incomplete and frequently of uncertain quality, has stimulated projects with an objective of raising comprehensive, proteome-wide sets of protein binders. With progress in automation and throughput, a remarkable number of recent publications refer to the practical possibility of selecting binders to every protein encoded in the genome. Here we review the requirements of a pipeline of production of protein binders for the human proteome, including target prioritisation, antigen design, 'next generation' methods, databases and the approaches taken by ongoing projects in Europe and the USA. While the task of generating affinity reagents for all human proteins is complex and demanding, the benefits of well-characterised and quality-controlled pan-proteome binder resources for biomedical research, industry and life sciences in general would be enormous and justify the effort. Given the technical, personnel and financial resources needed to fulfil this aim, expansion of current efforts may best be addressed through large-scale international collaboration. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Novel Phage Group Infecting Lactobacillus delbrueckii subsp. lactis, as Revealed by Genomic and Proteomic Analysis of Bacteriophage Ldl1

PubMed Central

Casey, Eoghan; Mahony, Jennifer; Neve, Horst; Noben, Jean-Paul; Dal Bello, Fabio

2014-01-01

Ldl1 is a virulent phage infecting the dairy starter Lactobacillus delbrueckii subsp. lactis LdlS. Electron microscopy analysis revealed that this phage exhibits a large head and a long tail and bears little resemblance to other characterized phages infecting Lactobacillus delbrueckii. In vitro propagation of this phage revealed a latent period of 30 to 40 min and a burst size of 59.9 ± 1.9 phage particles. Comparative genomic and proteomic analyses showed remarkable similarity between the genome of Ldl1 and that of Lactobacillus plantarum phage ATCC 8014-B2. The genomic and proteomic characteristics of Ldl1 demonstrate that this phage does not belong to any of the four previously recognized L. delbrueckii phage groups, necessitating the creation of a new group, called group e, thus adding to the knowledge on the diversity of phages targeting strains of this industrially important lactic acid bacterial species. PMID:25501478
The complete genome sequence and proteomics of Yersinia pestis phage Yep-phi.

PubMed

Zhao, Xiangna; Wu, Weili; Qi, Zhizhen; Cui, Yujun; Yan, Yanfeng; Guo, Zhaobiao; Wang, Zuyun; Wang, Hu; Deng, Haijun; Xue, Yan; Chen, Weijun; Wang, Xiaoyi; Yang, Ruifu

2011-01-01

Yep-phi, a lytic phage of Yersinia pestis, was isolated in China and is routinely used as a diagnostic phage for the identification of the plague pathogen. Yep-phi has an isometric hexagonal head containing dsDNA and a short non-contractile conical tail. In this study, we sequenced the Yep-phi genome (GenBank accession no. HQ333270) and performed proteomics analysis. The genome consists of 38,616 bp of DNA, including direct terminal repeats of 222 bp, and is predicted to contain 45 ORFs. Most structural proteins were identified by proteomics analysis. Compared with the three available genome sequences of lytic phages for Y. pestis, the phages could be divided into two subgroups. Yep-phi displays marked homology to the bacteriophages Berlin (GenBank accession no. AM183667) and Yepe2 (GenBank accession no. EU734170), and these comprise one subgroup. The other subgroup is represented by bacteriophage ΦA1122 (GenBank accession no. AY247822). Potential recombination was detected among the Yep-phi subgroup.
An Extremely Halophilic Proteobacterium Combines a Highly Acidic Proteome with a Low Cytoplasmic Potassium Content

PubMed Central

Deole, Ratnakar; Challacombe, Jean; Raiford, Douglas W.; Hoff, Wouter D.

2013-01-01

Halophilic archaea accumulate molar concentrations of KCl in their cytoplasm as an osmoprotectant and have evolved highly acidic proteomes that function only at high salinity. We examined osmoprotection in the photosynthetic Proteobacteria Halorhodospira halophila and Halorhodospira halochloris. Genome sequencing and isoelectric focusing gel electrophoresis showed that the proteome of H. halophila is acidic. In line with this finding, H. halophila accumulated molar concentrations of KCl when grown in high salt medium as detected by x-ray microanalysis and plasma emission spectrometry. This result extends the taxonomic range of organisms using KCl as a main osmoprotectant to the Proteobacteria. The closely related organism H. halochloris does not exhibit an acidic proteome, matching its inability to accumulate K+. This observation indicates recent evolutionary changes in the osmoprotection strategy of these organisms. Upon growth of H. halophila in low salt medium, its cytoplasmic K+ content matches that of Escherichia coli, revealing an acidic proteome that can function in the absence of high cytoplasmic salt concentrations. These findings necessitate a reassessment of two central aspects of theories for understanding extreme halophiles. First, we conclude that proteome acidity is not driven by stabilizing interactions between K+ ions and acidic side chains but by the need for maintaining sufficient solvation and hydration of the protein surface at high salinity through strongly hydrated carboxylates. Second, we propose that obligate protein halophilicity is a non-adaptive property resulting from genetic drift in which constructive neutral evolution progressively incorporates weakly stabilizing K+-binding sites on an increasingly acidic protein surface. PMID:23144460
Challenges for proteomics core facilities.

PubMed

Lilley, Kathryn S; Deery, Michael J; Gatto, Laurent

2011-03-01

Many analytical techniques have been executed by core facilities established within academic, pharmaceutical and other industrial institutions. The centralization of such facilities ensures a level of expertise and hardware which often cannot be supported by individual laboratories. The establishment of a core facility thus makes the technology available for multiple researchers in the same institution. Often, the services within the core facility are also opened out to researchers from other institutions, frequently with a fee being levied for the service provided. In the 1990s, with the onset of the age of genomics, there was an abundance of DNA analysis facilities, many of which have since disappeared from institutions and are now available through commercial sources. Ten years on, as proteomics was beginning to be utilized by many researchers, this technology found itself an ideal candidate for being placed within a core facility. We discuss what in our view are the daily challenges of proteomics core facilities. We also examine the potential unmet needs of the proteomics core facility that may also be applicable to proteomics laboratories which do not function as core facilities. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Adaptation, ecology, and evolution of the halophilic stromatolite archaeon Halococcus hamelinensis inferred through genome analyses.

PubMed

Gudhka, Reema K; Neilan, Brett A; Burns, Brendan P

2015-01-01

Halococcus hamelinensis was the first archaeon isolated from stromatolites. These geomicrobial ecosystems are thought to be some of the earliest known on Earth, yet, despite their evolutionary significance, the role of Archaea in these systems is still not well understood. Detailed here is the genome sequencing and analysis of an archaeon isolated from stromatolites. The genome of H. hamelinensis consisted of 3,133,046 base pairs with an average G+C content of 60.08% and contained 3,150 predicted coding sequences or ORFs, 2,196 (68.67%) of which were protein-coding genes with functional assignments and 954 (29.83%) of which were of unknown function. Codon usage of the H. hamelinensis genome was consistent with a highly acidic proteome, a major adaptive mechanism towards high salinity. Amino acid transport and metabolism, inorganic ion transport and metabolism, energy production and conversion, ribosomal structure, and unknown function COG genes were overrepresented. The genome of H. hamelinensis also revealed characteristics reflecting its survival in its extreme environment, including putative genes/pathways involved in osmoprotection, oxidative stress response, and UV damage repair. Finally, genome analyses indicated the presence of putative transposases as well as positive matches of genes of H. hamelinensis against various genomes of Bacteria, Archaea, and viruses, suggesting the potential for horizontal gene transfer.
Comparative proteomic exploration of whey proteins in human and bovine colostrum and mature milk using iTRAQ-coupled LC-MS/MS.

PubMed

Yang, Mei; Cao, Xueyan; Wu, Rina; Liu, Biao; Ye, Wenhui; Yue, Xiqing; Wu, Junrui

2017-09-01

Whey, an essential source of dietary nutrients, is widely used in dairy foods for infants. A total of 584 whey proteins in human and bovine colostrum and mature milk were identified and quantified by the isobaric tag for relative and absolute quantification (iTRAQ) proteomic method. The 424 differentially expressed whey proteins were identified and analyzed according to gene ontology (GO) annotation, Kyoto encyclopedia of genes and genomes (KEGG) pathway, and multivariate statistical analysis. Biological processes principally involved biological regulation and response to stimulus. Major cellular components were extracellular region part and extracellular space. The most prevalent molecular function was protein binding. Twenty immune-related proteins and 13 proteins related to enzyme regulatory activity were differentially expressed in human and bovine milk. Differentially expressed whey proteins participated in many KEGG pathways, including major complement and coagulation cascades and in phagosomes. Whey proteins show obvious differences in expression in human and bovine colostrum and mature milk, with consequences for biological function. The results here increase our understanding of different whey proteomes, which could provide useful information for the development and manufacture of dairy products and nutrient food for infants. The advanced iTRAQ proteomic approach was used to analyze differentially expressed whey proteins in human and bovine colostrum and mature milk.
Proteomic analysis of the phytopathogenic soilborne fungus Verticillium dahliae reveals differential protein expression in isolates that differ in aggressiveness.

PubMed

El-Bebany, Ahmed F; Rampitsch, Christof; Daayf, Fouad

2010-01-01

Verticillium dahliae is a soilborne fungus that causes a vascular wilt disease of plants and losses in a broad range of economically important crops worldwide. In this study, we compared the proteomes of highly (Vd1396-9) and weakly (Vs06-14) aggressive isolates of V. dahliae to identify protein factors that may contribute to pathogenicity. Twenty-five protein spots were consistently observed as differential in the proteome profiles of the two isolates. The protein sequences in the spots were identified by LC-ESI-MS/MS and MASCOT database searches. Some of the identified sequences shared homology with fungal proteins that have roles in stress response, colonization, melanin biosynthesis, microsclerotia formation, antibiotic resistance, and fungal penetration. These are important functions for infection of the host and survival of the pathogen in soil. One protein found only in the highly aggressive isolate was identified as isochorismatase hydrolase, a potential plant-defense suppressor. This enzyme may inhibit the production of salicylic acid, which is important for plant defense response signaling. Other sequences corresponding to potential pathogenicity factors were identified in the highly aggressive isolate. This work indicates that, in combination with functional genomics, proteomics-based analyses can provide additional insights into pathogenesis and potential management strategies for this disease.
Accelerating the design of biomimetic materials by integrating RNA-seq with proteomics and materials science.

PubMed

Guerette, Paul A; Hoon, Shawn; Seow, Yiqi; Raida, Manfred; Masic, Admir; Wong, Fong T; Ho, Vincent H B; Kong, Kiat Whye; Demirel, Melik C; Pena-Francesch, Abdon; Amini, Shahrouz; Tay, Gavin Z; Ding, Dawei; Miserez, Ali

2013-10-01

Efforts to engineer new materials inspired by biological structures are hampered by the lack of genomic data from many model organisms studied in biomimetic research. Here we show that biomimetic engineering can be accelerated by integrating high-throughput RNA-seq with proteomics and advanced materials characterization. This approach can be applied to a broad range of systems, as we illustrate by investigating diverse high-performance biological materials involved in embryo protection, adhesion and predation. In one example, we rapidly engineer recombinant squid sucker ring teeth proteins into a range of structural and functional materials, including nanopatterned surfaces and photo-cross-linked films that exceed the mechanical properties of most natural and synthetic polymers. Integrating RNA-seq with proteomics and materials science facilitates the molecular characterization of natural materials and the effective translation of their molecular designs into a wide range of bio-inspired materials.
Genomics-informed isolation and characterization of a symbiotic Nanoarchaeota system from a terrestrial geothermal environment

PubMed Central

Wurch, Louie; Giannone, Richard J.; Belisle, Bernard S.; Swift, Carolyn; Utturkar, Sagar; Hettich, Robert L.; Reysenbach, Anna-Louise; Podar, Mircea

2016-01-01

Biological features can be inferred, based on genomic data, for many microbial lineages that remain uncultured. However, cultivation is important for characterizing an organism's physiology and testing its genome-encoded potential. Here we use single-cell genomics to infer cultivation conditions for the isolation of an ectosymbiotic Nanoarchaeota ('Nanopusillus acidilobi') and its host (Acidilobus, a crenarchaeote) from a terrestrial geothermal environment. The cells of 'Nanopusillus' are among the smallest known cellular organisms (100–300 nm). They appear to have a complete genetic information processing machinery, but lack almost all primary biosynthetic functions as well as respiration and ATP synthesis. Genomic and proteomic comparison with its distant relative, the marine Nanoarchaeum equitans, illustrates an ancient, common evolutionary history of adaptation of the Nanoarchaeota to ectosymbiosis, so far unique among the Archaea. PMID:27378076
Genomics-informed isolation and characterization of a symbiotic Nanoarchaeota system from a terrestrial geothermal environment.

PubMed

Wurch, Louie; Giannone, Richard J; Belisle, Bernard S; Swift, Carolyn; Utturkar, Sagar; Hettich, Robert L; Reysenbach, Anna-Louise; Podar, Mircea

2016-07-05

Biological features can be inferred, based on genomic data, for many microbial lineages that remain uncultured. However, cultivation is important for characterizing an organism's physiology and testing its genome-encoded potential. Here we use single-cell genomics to infer cultivation conditions for the isolation of an ectosymbiotic Nanoarchaeota ('Nanopusillus acidilobi') and its host (Acidilobus, a crenarchaeote) from a terrestrial geothermal environment. The cells of 'Nanopusillus' are among the smallest known cellular organisms (100-300 nm). They appear to have a complete genetic information processing machinery, but lack almost all primary biosynthetic functions as well as respiration and ATP synthesis. Genomic and proteomic comparison with its distant relative, the marine Nanoarchaeum equitans, illustrates an ancient, common evolutionary history of adaptation of the Nanoarchaeota to ectosymbiosis, so far unique among the Archaea.
Schizophrenia proteomics: biomarkers on the path to laboratory medicine?

PubMed Central

Lakhan, Shaheen Emmanuel

2006-01-01

Over two million Americans are afflicted with schizophrenia, a debilitating mental health disorder with a unique symptomatic and epidemiological profile. Genomics studies have hinted towards candidate schizophrenia susceptibility chromosomal loci and genes. Modern proteomic tools, particularly mass spectrometry and expression scanning, aim to identify both pathogenic-revealing and diagnostically significant biomarkers. Only a few studies on basic proteomics have been conducted for psychiatric disorders relative to the plethora of cancer specific experiments. One such proteomic utility enables the discovery of proteins and biological marker fingerprinting profiling techniques (SELDI-TOF-MS), and then subjects them to tandem mass spectrometric fragmentation and de novo protein sequencing (MALDI-TOF/TOF-MS) for the accurate identification and characterization of the proteins. Such utilities can explain the pathogenesis of neuro-psychiatric disease, provide more objective testing methods, and further demonstrate a biological basis to mental illness. Although clinical proteomics in schizophrenia have yet to reveal a biomarker with diagnostic specificity, methods that better characterize the disorder using endophenotypes can advance findings. Schizophrenia biomarkers could potentially revolutionize its psychopharmacology, changing it into a more hypothesis and genomic/proteomic-driven science. PMID:16846510
Role for protein–protein interaction databases in human genetics

PubMed Central

Pattin, Kristine A; Moore, Jason H

2010-01-01

Proteomics and the study of protein–protein interactions are becoming increasingly important in our effort to understand human diseases on a system-wide level. Thanks to the development and curation of protein-interaction databases, up-to-date information on these interaction networks is accessible and publicly available to the scientific community. As our knowledge of protein–protein interactions increases, it is important to give thought to the different ways that these resources can impact biomedical research. In this article, we highlight the importance of protein–protein interactions in human genetics and genetic epidemiology. Since protein–protein interactions demonstrate one of the strongest functional relationships between genes, combining genomic data with available proteomic data may provide us with a more in-depth understanding of common human diseases. In this review, we will discuss some of the fundamentals of protein interactions, the databases that are publicly available and how information from these databases can be used to facilitate genome-wide genetic studies. PMID:19929610
Sma3s: A universal tool for easy functional annotation of proteomes and transcriptomes.

PubMed

Casimiro-Soriguer, Carlos S; Muñoz-Mérida, Antonio; Pérez-Pulido, Antonio J

2017-06-01

The falling cost of next-generation sequencing has led to an enormous growth in the number of sequenced genomes and transcriptomes, allowing wet labs to get the sequences from their organisms of study. To make the most of these data, one of the first things that should be done is the functional annotation of the protein-coding genes. However, this has been a slow and tedious step that can involve the characterization of thousands of sequences. Sma3s is an accurate computational tool for annotating proteins in an unattended way. Now, we have developed a completely new version, which includes functionalities that will be of utility for fundamental and applied science. Currently, the results provide functional categories such as biological processes, which become useful for both characterizing particular sequence datasets and comparing results from different projects. One of the most important innovations is that it now has low computational requirements, and the complete annotation of a simple proteome or transcriptome usually takes around 24 hours on a personal computer. Sma3s has been tested with a large number of complete proteomes and transcriptomes, and it has demonstrated its potential in health science and other specific projects. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Confronting the catalytic dark matter encoded by sequenced genomes

PubMed Central

Ellens, Kenneth W.; Christian, Nils; Singh, Charandeep; Satagopam, Venkata P.

2017-01-01

Abstract The post-genomic era has provided researchers with a deluge of protein sequences. However, a significant fraction of the proteins encoded by sequenced genomes remains without an identified function. Here, we aim at determining how many enzymes of uncertain or unknown function are still present in the Saccharomyces cerevisiae and human proteomes. Using information available in the Swiss-Prot, BRENDA and KEGG databases in combination with a Hidden Markov Model-based method, we estimate that >600 yeast and 2000 human proteins (>30% of their proteins of unknown function) are enzymes whose precise function(s) remain(s) to be determined. This illustrates the impressive scale of the ‘unknown enzyme problem’. We extensively review classical biochemical as well as more recent systematic experimental and computational approaches that can be used to support enzyme function discovery research. Finally, we discuss the possible roles of the elusive catalysts in light of recent developments in the fields of enzymology and metabolism as well as the significance of the unknown enzyme problem in the context of metabolic modeling, metabolic engineering and rare disease research. PMID:29059321
In Planta Proteomics and Proteogenomics of the Biotrophic Barley Fungal Pathogen Blumeria graminis f. sp. hordei

PubMed Central

Bindschedler, Laurence V.; Burgis, Timothy A.; Mills, Davinia J. S.; Ho, Jenny T. C.; Cramer, Rainer; Spanu, Pietro D.

2009-01-01

To further our understanding of powdery mildew biology during infection, we undertook a systematic shotgun proteomics analysis of the obligate biotroph Blumeria graminis f. sp. hordei at different stages of development in the host. Moreover we used a proteogenomics approach to feed information into the annotation of the newly sequenced genome. We analyzed and compared the proteomes from three stages of development representing different functions during the plant-dependent vegetative life cycle of this fungus. We identified 441 proteins in ungerminated spores, 775 proteins in epiphytic sporulating hyphae, and 47 proteins from haustoria inside barley leaf epidermal cells and used the data to aid annotation of the B. graminis f. sp. hordei genome. We also compared the differences in the protein complement of these key stages. Although confirming some of the previously reported findings and models derived from the analysis of transcriptome dynamics, our results also suggest that the intracellular haustoria are subject to stress possibly as a result of the plant defense strategy, including the production of reactive oxygen species. In addition, a number of small haustorial proteins with a predicted N-terminal signal peptide for secretion were identified in infected tissues: these represent candidate effector proteins that may play a role in controlling host metabolism and immunity. PMID:19602707
Proteogenomics connects somatic mutations to signalling in breast cancer.

PubMed

Mertins, Philipp; Mani, D R; Ruggles, Kelly V; Gillette, Michael A; Clauser, Karl R; Wang, Pei; Wang, Xianlong; Qiao, Jana W; Cao, Song; Petralia, Francesca; Kawaler, Emily; Mundt, Filip; Krug, Karsten; Tu, Zhidong; Lei, Jonathan T; Gatza, Michael L; Wilkerson, Matthew; Perou, Charles M; Yellapantula, Venkata; Huang, Kuan-lin; Lin, Chenwei; McLellan, Michael D; Yan, Ping; Davies, Sherri R; Townsend, R Reid; Skates, Steven J; Wang, Jing; Zhang, Bing; Kinsinger, Christopher R; Mesri, Mehdi; Rodriguez, Henry; Ding, Li; Paulovich, Amanda G; Fenyö, David; Ellis, Matthew J; Carr, Steven A

2016-06-02

Somatic mutations have been extensively characterized in breast cancer, but the effects of these genetic alterations on the proteomic landscape remain poorly understood. Here we describe quantitative mass-spectrometry-based proteomic and phosphoproteomic analyses of 105 genomically annotated breast cancers, of which 77 provided high-quality data. Integrated analyses provided insights into the somatic cancer genome including the consequences of chromosomal loss, such as the 5q deletion characteristic of basal-like breast cancer. Interrogation of the 5q trans-effects against the Library of Integrated Network-based Cellular Signatures, connected loss of CETN3 and SKP1 to elevated expression of epidermal growth factor receptor (EGFR), and SKP1 loss also to increased SRC tyrosine kinase. Global proteomic data confirmed a stromal-enriched group of proteins in addition to basal and luminal clusters, and pathway analysis of the phosphoproteome identified a G-protein-coupled receptor cluster that was not readily identified at the mRNA level. In addition to ERBB2, other amplicon-associated highly phosphorylated kinases were identified, including CDK12, PAK1, PTK2, RIPK2 and TLK2. We demonstrate that proteogenomic analysis of breast cancer elucidates the functional consequences of somatic mutations, narrows candidate nominations for driver genes within large deletions and amplified regions, and identifies therapeutic targets.
Laser Capture Microdissection in the Genomic and Proteomic Era: Targeting the Genetic Basis of Cancer

PubMed Central

Domazet, Barbara; MacLennan, Gregory T.; Lopez-Beltran, Antonio; Montironi, Rodolfo; Cheng, Liang

2008-01-01

The advent of new technologies has enabled deeper insight into processes at subcellular levels, which will ultimately improve diagnostic procedures and patient outcome. Thanks to cell enrichment methods, it is now possible to study cells in their native environment. This has greatly contributed to a rapid growth in several areas, such as gene expression analysis, proteomics, and metabolonomics. Laser capture microdissection (LCM) as a method of procuring subpopulations of cells under direct visual inspection is playing an important role in these areas. This review provides an overview of existing LCM technology and its downstream applications in genomics, proteomics, diagnostics and therapy. PMID:18787684
Laser capture microdissection in the genomic and proteomic era: targeting the genetic basis of cancer.

PubMed

Domazet, Barbara; Maclennan, Gregory T; Lopez-Beltran, Antonio; Montironi, Rodolfo; Cheng, Liang

2008-03-15

The advent of new technologies has enabled deeper insight into processes at subcellular levels, which will ultimately improve diagnostic procedures and patient outcome. Thanks to cell enrichment methods, it is now possible to study cells in their native environment. This has greatly contributed to a rapid growth in several areas, such as gene expression analysis, proteomics, and metabolonomics. Laser capture microdissection (LCM) as a method of procuring subpopulations of cells under direct visual inspection is playing an important role in these areas. This review provides an overview of existing LCM technology and its downstream applications in genomics, proteomics, diagnostics and therapy.

Emerging techniques for the discovery and validation of therapeutic targets for skeletal diseases.

PubMed

Cho, Christine H; Nuttall, Mark E

2002-12-01

Advances in genomics and proteomics have revolutionised the drug discovery process and target validation. Identification of novel therapeutic targets for chronic skeletal diseases is an extremely challenging process based on the difficulty of obtaining high-quality human diseased versus normal tissue samples. The quality of tissue and genomic information obtained from the sample is critical to identifying disease-related genes. Using a genomics-based approach, novel genes or genes with similar homology to existing genes can be identified from cDNA libraries generated from normal versus diseased tissue. High-quality cDNA libraries are prepared from uncontaminated homogeneous cell populations harvested from tissue sections of interest. Localised gene expression analysis and confirmation are obtained through in situ hybridisation or immunohistochemical studies. Cells overexpressing the recombinant protein are subsequently designed for primary cell-based high-throughput assays that are capable of screening large compound banks for potential hits. Afterwards, secondary functional assays are used to test promising compounds. The same overexpressing cells are used in the secondary assay to test protein activity and functionality as well as screen for small-molecule agonists or antagonists. Once a hit is generated, a structure-activity relationship of the compound is optimised for better oral bioavailability and pharmacokinetics allowing the compound to progress into development. Parallel efforts from proteomics, as well as genetics/transgenics, bioinformatics and combinatorial chemistry, and improvements in high-throughput automation technologies, allow the drug discovery process to meet the demands of the medicinal market. This review discusses and illustrates how different approaches are incorporated into the discovery and validation of novel targets and, consequently, the development of potentially therapeutic agents in the areas of osteoporosis and osteoarthritis. 
While current treatments exist in the form of hormone replacement therapy, antiresorptive and anabolic agents for osteoporosis, there are no disease-modifying therapies for the treatment of the most common human joint disease, osteoarthritis. A massive market potential for improved options with better safety and efficacy still remains. Therefore, the application of genomics and proteomics for both diseases should provide much needed novel therapeutic approaches to treating these major world health problems.
Integration of multi-omics data of a genome-reduced bacterium: Prevalence of post-transcriptional regulation and its correlation with protein abundances

PubMed Central

Chen, Wei-Hua; van Noort, Vera; Lluch-Senar, Maria; Hennrich, Marco L.; H. Wodke, Judith A.; Yus, Eva; Alibés, Andreu; Roma, Guglielmo; Mende, Daniel R.; Pesavento, Christina; Typas, Athanasios; Gavin, Anne-Claude; Serrano, Luis; Bork, Peer

2016-01-01

We developed a comprehensive resource for the genome-reduced bacterium Mycoplasma pneumoniae comprising 1748 consistently generated ‘-omics’ data sets, and used it to quantify the power of antisense non-coding RNAs (ncRNAs), lysine acetylation, and protein phosphorylation in predicting protein abundance (11%, 24% and 8%, respectively). These factors taken together are four times more predictive of the proteome abundance than of mRNA abundance. In bacteria, post-translational modifications (PTMs) and ncRNA transcription were both found to increase with decreasing genomic GC-content and genome size. Thus, the evolutionary forces constraining genome size and GC-content modify the relative contributions of the different regulatory layers to proteome homeostasis, and impact more genomic and genetic features than previously appreciated. Indeed, these scaling principles will enable us to develop more informed approaches when engineering minimal synthetic genomes. PMID:26773059
GENOMICS AND ENVIRONMENTAL RESEARCH

EPA Science Inventory

The impact of recently developed and emerging genomics technologies on environmental sciences has significant implications for human and ecological risk assessment issues. The linkage of data generated from genomics, transcriptomics, proteomics, metabolomics, and ecology can be ...
A computational pipeline for the development of multi-marker bio-signature panels and ensemble classifiers

PubMed Central

2012-01-01

Background Biomarker panels derived separately from genomic and proteomic data and with a variety of computational methods have demonstrated promising classification performance in various diseases. An open question is how to create effective proteo-genomic panels. The framework of ensemble classifiers has been applied successfully in various analytical domains to combine classifiers so that the performance of the ensemble exceeds the performance of individual classifiers. Using blood-based diagnosis of acute renal allograft rejection as a case study, we address the following question in this paper: Can acute rejection classification performance be improved by combining individual genomic and proteomic classifiers in an ensemble? Results The first part of the paper presents a computational biomarker development pipeline for genomic and proteomic data. The pipeline begins with data acquisition (e.g., from bio-samples to microarray data), quality control, statistical analysis and mining of the data, and finally various forms of validation. The pipeline ensures that the various classifiers to be combined later in an ensemble are diverse and adequate for clinical use. Five mRNA genomic and five proteomic classifiers were developed independently using single time-point blood samples from 11 acute-rejection and 22 non-rejection renal transplant patients. The second part of the paper examines five ensembles ranging in size from two to 10 individual classifiers. Performance of ensembles is characterized by area under the curve (AUC), sensitivity, and specificity, as derived from the probability of acute rejection for individual classifiers in the ensemble in combination with one of two aggregation methods: (1) Average Probability or (2) Vote Threshold. 
One ensemble demonstrated superior performance and was able to improve sensitivity and AUC beyond the best values observed for any of the individual classifiers in the ensemble, while staying within the range of observed specificity. The Vote Threshold aggregation method achieved improved sensitivity for all 5 ensembles, but typically at the cost of decreased specificity. Conclusion Proteo-genomic biomarker ensemble classifiers show promise in the diagnosis of acute renal allograft rejection and can improve classification performance beyond that of individual genomic or proteomic classifiers alone. Validation of our results in an international multicenter study is currently underway. PMID:23216969
A computational pipeline for the development of multi-marker bio-signature panels and ensemble classifiers.

PubMed

Günther, Oliver P; Chen, Virginia; Freue, Gabriela Cohen; Balshaw, Robert F; Tebbutt, Scott J; Hollander, Zsuzsanna; Takhar, Mandeep; McMaster, W Robert; McManus, Bruce M; Keown, Paul A; Ng, Raymond T

2012-12-08

Biomarker panels derived separately from genomic and proteomic data and with a variety of computational methods have demonstrated promising classification performance in various diseases. An open question is how to create effective proteo-genomic panels. The framework of ensemble classifiers has been applied successfully in various analytical domains to combine classifiers so that the performance of the ensemble exceeds the performance of individual classifiers. Using blood-based diagnosis of acute renal allograft rejection as a case study, we address the following question in this paper: Can acute rejection classification performance be improved by combining individual genomic and proteomic classifiers in an ensemble? The first part of the paper presents a computational biomarker development pipeline for genomic and proteomic data. The pipeline begins with data acquisition (e.g., from bio-samples to microarray data), quality control, statistical analysis and mining of the data, and finally various forms of validation. The pipeline ensures that the various classifiers to be combined later in an ensemble are diverse and adequate for clinical use. Five mRNA genomic and five proteomic classifiers were developed independently using single time-point blood samples from 11 acute-rejection and 22 non-rejection renal transplant patients. The second part of the paper examines five ensembles ranging in size from two to 10 individual classifiers. Performance of ensembles is characterized by area under the curve (AUC), sensitivity, and specificity, as derived from the probability of acute rejection for individual classifiers in the ensemble in combination with one of two aggregation methods: (1) Average Probability or (2) Vote Threshold. One ensemble demonstrated superior performance and was able to improve sensitivity and AUC beyond the best values observed for any of the individual classifiers in the ensemble, while staying within the range of observed specificity. 
The Vote Threshold aggregation method achieved improved sensitivity for all 5 ensembles, but typically at the cost of decreased specificity. Proteo-genomic biomarker ensemble classifiers show promise in the diagnosis of acute renal allograft rejection and can improve classification performance beyond that of individual genomic or proteomic classifiers alone. Validation of our results in an international multicenter study is currently underway.
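The two aggregation methods named in this abstract can be illustrated with a minimal sketch. The abstract does not give their exact definitions, so the function names, the 0.5 probability cutoff, and the `min_votes` parameter below are assumptions made for illustration, not the authors' implementation.

```python
from typing import Sequence


def average_probability(probs: Sequence[float], cutoff: float = 0.5) -> bool:
    """Call acute rejection when the mean classifier probability reaches the cutoff.

    `cutoff` is a hypothetical decision threshold, not a value from the paper.
    """
    return sum(probs) / len(probs) >= cutoff


def vote_threshold(probs: Sequence[float], cutoff: float = 0.5, min_votes: int = 1) -> bool:
    """Call acute rejection when at least `min_votes` classifiers vote for it.

    Each classifier casts a vote if its own probability reaches `cutoff`.
    Both parameters are illustrative assumptions.
    """
    votes = sum(1 for p in probs if p >= cutoff)
    return votes >= min_votes


# One confident classifier and two doubtful ones (made-up probabilities).
example = [0.9, 0.2, 0.3]
print(average_probability(example))  # mean is below 0.5, so no rejection call
print(vote_threshold(example))       # one vote is enough, so rejection is called
```

With a low vote requirement, a single confident classifier is enough to flag a sample, which is one plausible reading of why a vote-based rule can raise sensitivity while lowering specificity.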
Proteomic Assessment of Fluid Shifts and Association with Visual Impairment and Intracranial Pressure in Twin Astronauts

NASA Technical Reports Server (NTRS)

Rana, Brinda K.; Stenger, Michael B.; Lee, Stuart M. C.; Macias, Brandon R.; Siamwala, Jamila; Piening, Brian Donald; Hook, Vivian; Ebert, Doug; Patel, Hemal; Smith, Scott;

2016-01-01

BACKGROUND: Astronauts participating in long duration space missions are at an increased risk of physiological disruptions. The development of visual impairment and intracranial pressure (VIIP) syndrome is one of the leading health concerns for crew members on long-duration space missions; microgravity-induced fluid shifts and chronic elevated cabin CO2 may be contributing factors. By studying physiological and molecular changes in one identical twin during his 1-year ISS mission and his ground-based co-twin, this work extends a current NASA-funded investigation to assess space flight induced "Fluid Shifts" in association with the development of VIIP. This twin study uniquely integrates physiological and -omic signatures to further our understanding of the molecular mechanisms underlying space flight-induced VIIP. We are: (i) conducting longitudinal proteomic assessments of plasma to identify fluid regulation-related molecular pathways altered by long-term space flight; and (ii) integrating physiological and proteomic data with genomic data to understand the genomic mechanism by which these proteomic signatures are regulated. PURPOSE: We are exploring proteomic signatures and genomic mechanisms underlying space flight-induced VIIP symptoms with the future goal of developing early biomarkers to detect and monitor the progression of VIIP. This study is first to employ a male monozygous twin pair to systematically determine the impact of fluid distribution in microgravity, integrating a comprehensive set of structural and functional measures with proteomic, metabolomic and genomic data. This project has a broader impact on Earth-based clinical areas, such as traumatic brain injury-induced elevations of intracranial pressure, hydrocephalus, and glaucoma. 
HYPOTHESIS: We predict that the space-flown twin will experience a space flight-induced alteration in proteins and peptides related to fluid balance, fluid control and brain injury as compared to his pre-flight protein/peptide signatures. Conversely, the trajectory of these protein signatures will remain relatively constant in his ground-based co-twin. METHODS: We are using proteomic and standard immunoelectrophoresis techniques to delineate the change in protein signatures throughout the course of a long duration space flight in relation to the development of VIIP. We are also applying a novel cell-based metabolic organ system assay ("Organs on a Plate") to address how these circulating biomarkers affect physiological processes at the cellular and organ level which could result in VIIP symptoms. These molecular data will be correlated with physiological measures (e.g., extra- and intracellular fluid volume, vascular filling/flow patterns, MRI, and Optical Coherence Tomography). DISCUSSION: Pre- and in-flight data collection is in progress for the space-flown twin, and similar data have been obtained from the ground-based twin. Biosamples will be batch processed when received from ISS after the conclusion of the 1-year mission. Omic and physiological measures from the twin astronauts will be compared to similar data being collected on twin subjects who participated in a simulated microgravity bed rest study.

Surveying alignment-free features for Ortholog detection in related yeast proteomes by using supervised big data classifiers.

PubMed

Galpert, Deborah; Fernández, Alberto; Herrera, Francisco; Antunes, Agostinho; Molina-Ruiz, Reinaldo; Agüero-Chapin, Guillermin

2018-05-03

The development of new ortholog detection algorithms and the improvement of existing ones are of major importance in functional genomics. We have previously introduced a successful supervised pairwise ortholog classification approach implemented in a big data platform that considered several pairwise protein features and the low ortholog pair ratios found between two annotated proteomes (Galpert, D et al., BioMed Research International, 2015). The supervised models were built and tested using a Saccharomycete yeast benchmark dataset proposed by Salichos and Rokas (2011). Although several pairwise protein features were combined in a supervised big data approach, they were all, to some extent, alignment-based features, and the proposed algorithms were evaluated on a single test set. Here, we aim to evaluate the impact of alignment-free features on the performance of supervised models implemented in the Spark big data platform for pairwise ortholog detection in several related yeast proteomes. The Spark Random Forest and Decision Trees with oversampling and undersampling techniques, built with only alignment-based similarity measures or combined with several alignment-free pairwise protein features, showed the highest classification performance for ortholog detection in three yeast proteome pairs. Although such supervised approaches outperformed traditional methods, there were no significant differences between the exclusive use of alignment-based similarity measures and their combination with alignment-free features, even within the twilight zone of the studied proteomes. Only when alignment-based and alignment-free features were combined in Spark Decision Trees with imbalance management could a higher success rate (98.71%) within the twilight zone be achieved for a yeast proteome pair that underwent a whole genome duplication. 
The feature selection study showed that alignment-based features were top-ranked for the best classifiers while the runners-up were alignment-free features related to amino acid composition. The incorporation of alignment-free features in supervised big data models did not significantly improve ortholog detection in yeast proteomes regarding the classification qualities achieved with just alignment-based similarity measures. However, the similarity of their classification performance to that of traditional ortholog detection methods encourages the evaluation of other alignment-free protein pair descriptors in future research.
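As an illustration of what an alignment-free feature "related to amino acid composition" might look like, the sketch below computes the fraction of each standard residue in a protein and compares two proteins without aligning them. The function names and the choice of an L1 distance are assumptions for illustration only; the abstract does not specify which descriptors or distance measures the authors used.

```python
from collections import Counter
from typing import List

# The 20 standard amino acids, in a fixed order so vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(seq: str) -> List[float]:
    """Return the fraction of each standard amino acid in `seq`.

    Non-standard characters are ignored; an empty sequence gives all zeros.
    """
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def composition_distance(seq_a: str, seq_b: str) -> float:
    """L1 distance between two composition vectors (0 = identical, 2 = disjoint).

    This pairwise value could serve as one feature for a protein pair;
    the distance choice here is hypothetical.
    """
    comp_a, comp_b = aa_composition(seq_a), aa_composition(seq_b)
    return sum(abs(x - y) for x, y in zip(comp_a, comp_b))


# Toy sequences, not real yeast proteins.
print(composition_distance("ACAC", "CACA"))  # same composition, different order
print(composition_distance("AAAA", "CCCC"))  # no residues in common
```

Because the feature ignores residue order, it needs no alignment, which is the property the study set out to test against alignment-based similarity measures.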
Yeast Genomics for Bread, Beer, Biology, Bucks and Breath

NASA Astrophysics Data System (ADS)

Sakharkar, Kishore R.; Sakharkar, Meena K.

The rapid advances and scale up of projects in DNA sequencing during the past two decades have produced complete genome sequences of several eukaryotic species. The versatile genetic malleability of the yeast, and the high degree of conservation between its cellular processes and those of human cells have made it a model of choice for pioneering research in molecular and cell biology. The complete sequence of yeast genome has proven to be extremely useful as a reference towards the sequences of human and for providing systems to explore key gene functions. Yeast has been a ‘legendary model’ for new technologies and gaining new biological insights into basic biological sciences and biotechnology. This chapter describes the awesome power of yeast genetics, genomics and proteomics in understanding of biological function. The applications of yeast as a screening tool to the field of drug discovery and development are highlighted and the traditional importance of yeast for bakers and brewers is discussed.
Functional Genomic Landscape of Human Breast Cancer Drivers, Vulnerabilities, and Resistance.

PubMed

Marcotte, Richard; Sayad, Azin; Brown, Kevin R; Sanchez-Garcia, Felix; Reimand, Jüri; Haider, Maliha; Virtanen, Carl; Bradner, James E; Bader, Gary D; Mills, Gordon B; Pe'er, Dana; Moffat, Jason; Neel, Benjamin G

2016-01-14

Large-scale genomic studies have identified multiple somatic aberrations in breast cancer, including copy number alterations and point mutations. Still, identifying causal variants and emergent vulnerabilities that arise as a consequence of genetic alterations remain major challenges. We performed whole-genome small hairpin RNA (shRNA) "dropout screens" on 77 breast cancer cell lines. Using a hierarchical linear regression algorithm to score our screen results and integrate them with accompanying detailed genetic and proteomic information, we identify vulnerabilities in breast cancer, including candidate "drivers," and reveal general functional genomic properties of cancer cells. Comparisons of gene essentiality with drug sensitivity data suggest potential resistance mechanisms, effects of existing anti-cancer drugs, and opportunities for combination therapy. Finally, we demonstrate the utility of this large dataset by identifying BRD4 as a potential target in luminal breast cancer and PIK3CA mutations as a resistance determinant for BET-inhibitors. Copyright © 2016 Elsevier Inc. All rights reserved.
Natural Microbial Assemblages Reflect Distinct Organismal and Functional Partitioning

NASA Astrophysics Data System (ADS)

Wilmes, P.; Andersson, A.; Kalnejais, L. H.; Verberkmoes, N. C.; Lefsrud, M. G.; Wexler, M.; Singer, S. W.; Shah, M.; Bond, P. L.; Thelen, M. P.; Hettich, R. L.; Banfield, J. F.

2007-12-01

The ability to link microbial community structure to function has long been a primary focus of environmental microbiology. With the advent of community genomic and proteomic techniques, along with advances in microscopic imaging techniques, it is now possible to gain insights into the organismal and functional makeup of microbial communities. Biofilms growing within highly acidic solutions inside the Richmond Mine (Iron Mountain, Redding, California) exhibit distinct macro- and microscopic morphologies. They are composed of microorganisms belonging to the three domains of life, including archaea, bacteria and eukarya. The proportion of each organismal type depends on sampling location and developmental stage. For example, mature biofilms floating on top of acid mine drainage (AMD) pools exhibit layers consisting of a densely packed bottom layer of the chemoautolithotroph Leptospirillum group II, a less dense top layer composed mainly of archaea, and fungal filaments spanning across the entire biofilm. The expression of cytochrome 579 (the most highly abundant protein in the biofilm, believed to be central to iron oxidation and encoded by Leptospirillum group II) is localized at the interface of the biofilm with the AMD solution, highlighting that biofilm architecture is reflected at the functional gene expression level. Distinct functional partitioning is also apparent in a biological wastewater treatment system that selects for distinct polyphosphate accumulating organisms. Community genomic data from "Candidatus Accumulibacter phosphatis" dominated activated sludge has enabled high mass-accuracy shotgun proteomics for identification of key metabolic pathways. Comprehensive genome-wide alignment of orthologous proteins suggests distinct partitioning of protein variants involved in both core-metabolism and specific metabolic pathways among the dominant population and closely related species. 
In addition, strain-resolved proteogenomic analysis of the AMD biofilms also highlights the importance of strain heterogeneity for the maintenance of community structure and function. These findings explain the importance of genetic diversity in facilitating the stable performance of complex microbial processes. Furthermore, although very different in terms of habitat, both microbial communities exhibit distinct functional compartmentalization and demonstrate its role in sustaining microbial community structure.
Teaching Expression Proteomics: From the Wet-Lab to the Laptop

ERIC Educational Resources Information Center

Teixeira, Miguel C.; Santos, Pedro M.; Rodrigues, Catarina; Sa-Correia, Isabel

2009-01-01

Expression proteomics has become, in recent years, a key genome-wide expression approach in fundamental and applied life sciences. This postgenomic technology aims at the quantitative analysis of all the proteins or protein forms (the so-called proteome) of a given organism in a given environmental and genetic context. It is a challenge to provide…
Host–Environment Medicine

PubMed Central

Rabinowitz, Peter M; Poljak, Alex

2003-01-01

Rapid developments in genomic and proteomic testing promise to impact the way in which clinicians assess disease risk and drug selection in their patients. Because most diseases result from host–environment interactions, however, primary care providers will need to avoid the trap of biological determinism by examining the important role of environmental factors in their clinical assessments and interventions. This article discusses the application of host–environment concepts to recent developments in the areas of genomics and proteomics. PMID:12648255
Methods, Tools and Current Perspectives in Proteogenomics

PubMed Central

Ruggles, Kelly V.; Krug, Karsten; Wang, Xiaojing; Clauser, Karl R.; Wang, Jing; Payne, Samuel H.; Fenyö, David; Zhang, Bing; Mani, D. R.

2017-01-01

With combined technological advancements in high-throughput next-generation sequencing and deep mass spectrometry-based proteomics, proteogenomics, i.e. the integrative analysis of proteomic and genomic data, has emerged as a new research field. Early efforts in the field were focused on improving protein identification using sample-specific genomic and transcriptomic sequencing data. More recently, integrative analysis of quantitative measurements from genomic and proteomic studies has yielded novel insights into gene expression regulation, cell signaling, and disease. Many methods and tools have been developed or adapted to enable an array of integrative proteogenomic approaches, and in this article we systematically classify published methods and tools into four major categories: (1) Sequence-centric proteogenomics; (2) Analysis of proteogenomic relationships; (3) Integrative modeling of proteogenomic data; and (4) Data sharing and visualization. We provide a comprehensive review of methods and available tools in each category and highlight their typical applications. PMID:28456751
Functional Genomics of Allergen Gene Families in Fruits

PubMed Central

Maghuly, Fatemeh; Marzban, Gorji; Laimer, Margit

2009-01-01

Fruit consumption is encouraged for health reasons; however, fruits may harbour a series of allergenic proteins that may cause discomfort or even represent serious threats to certain individuals. Thus, the identification and characterization of allergens in fruits requires novel approaches involving genomic and proteomic tools. Since avoidance of fruits also negatively affects the quality of patients’ lives, biotechnological interventions are ongoing to produce low allergenic fruits by down regulating specific genes. In this respect, the control of proteins associated with allergenicity could be achieved by fine tuning the spatial and temporal expression of the relevant genes. PMID:22253972
Imaging and the new biology: What's wrong with this picture?

NASA Astrophysics Data System (ADS)

Vannier, Michael W.

2004-05-01

The Human Genome has been defined, giving us one part of the equation that stems from the central dogma of molecular biology. Despite this awesome scientific achievement, the correspondence between genomics and imaging is weak, since we cannot predict an organism's phenotype from even perfect knowledge of its genetic complement. Biological knowledge comes in several forms, and the genome is perhaps the best known and most completely understood type. Imaging creates another form of biological information, providing the ability to study morphology, growth and development, metabolic processes, and diseases in vitro and in vivo at many levels of scale. The principal challenge in biomedical imaging for the future lies in the need to reconcile the data provided by one or multiple modalities with other forms of biological knowledge, most importantly the genome, proteome, physiome, and other "-ome's." To date, the imaging science community has not set a high priority on the unification of their results with genomics, proteomics, and physiological functions in most published work. Images are relatively isolated from other forms of biological data, impairing our ability to conceive and address many fundamental questions in research and clinical practice. This presentation will explain the challenge of biological knowledge integration in basic research and clinical applications from the standpoint of imaging and image processing. The impediments to progress, isolation of the imaging community, and mainstream of new and future biological science will be identified, so the critical and immediate need for change can be highlighted.
Proteogenomic insights into uranium tolerance of a Chernobyl's Microbacterium bacterial isolate.

PubMed

Gallois, Nicolas; Alpha-Bazin, Béatrice; Ortet, Philippe; Barakat, Mohamed; Piette, Laurie; Long, Justine; Berthomieu, Catherine; Armengaud, Jean; Chapon, Virginie

2018-04-15

Microbacterium oleivorans A9 is a uranium-tolerant actinobacterium isolated from the trench T22 located near the Chernobyl nuclear power plant. This site is contaminated with different radionuclides including uranium. To observe the molecular changes at the proteome level occurring in this strain upon uranyl exposure and understand molecular mechanisms explaining its uranium tolerance, we established its draft genome and used this raw information to perform an in-depth proteogenomics study. High-throughput proteomics were performed on cells exposed or not to 10 μM uranyl nitrate sampled at three previously identified phases of uranyl tolerance. We experimentally detected and annotated 1532 proteins and highlighted a total of 591 proteins whose abundances differed significantly between conditions. Notably, proteins involved in phosphate and iron metabolism show high dynamics. A large proportion of the proteins more abundant upon uranyl stress are distant from functionally annotated known proteins, highlighting the lack of fundamental knowledge regarding numerous key molecular players from soil bacteria. Microbacterium oleivorans A9 is an interesting environmental model to understand biological processes engaged in tolerance to radionuclides. Using an innovative proteogenomics approach, we explored its molecular mechanisms involved in uranium tolerance. We sequenced its genome, interpreted high-throughput proteomic data against a six-reading frame ORF database deduced from the draft genome, annotated the identified proteins and compared protein abundances from cells exposed or not to uranyl stress after a cascade search. These data show that a complex cellular response to uranium occurs in Microbacterium oleivorans A9, where one third of the experimental proteome is modified. In particular, the uranyl stress perturbed the phosphate and iron metabolic pathways. 
Furthermore, several transporters have been identified to be specifically associated to uranyl stress, paving the way to the development of biotechnological tools for uranium decontamination. Copyright © 2017. Published by Elsevier B.V.
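The "six-reading frame ORF database" mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the stop-to-stop ORF definition and the `min_len` cutoff are assumptions chosen for illustration. The codon table is the standard genetic code; the bacterial code (NCBI table 11) assigns the same amino acids to codons and differs only in the start codons it permits.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_orfs(dna, min_len=30):
    """Yield (frame, peptide) for every stop-to-stop stretch in all six frames.

    min_len is a hypothetical length cutoff in amino acids, not a value
    taken from the study.
    """
    dna = dna.upper()
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            for segment in translate(seq[offset:]).split("*"):
                if len(segment) >= min_len:
                    yield f"{strand}{offset + 1}", segment


if __name__ == "__main__":
    # Toy sequence, invented for illustration only.
    for frame, peptide in six_frame_orfs("ATGAAATTTTAG", min_len=3):
        print(frame, peptide)
```

Translating stop-to-stop rather than start-to-stop keeps peptides from genes whose start codon was mis-predicted, at the cost of a larger search database.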
Quantitative Proteomics of the Infectious and Replicative Forms of Chlamydia trachomatis

PubMed Central

Skipp, Paul J. S.; Hughes, Chris; McKenna, Thérèse; Edwards, Richard; Langridge, James; Thomson, Nicholas R.; Clarke, Ian N.

2016-01-01

The obligate intracellular developmental cycle of Chlamydia trachomatis presents significant challenges in defining its proteome. In this study we have applied quantitative proteomics to both the intracellular reticulate body (RB) and the extracellular elementary body (EB) from C. trachomatis. We used C. trachomatis L2 as a model chlamydial isolate for our study since it has a high infectivity:particle ratio and there is an excellent quality genome sequence. EBs and RBs (>99% pure) were quantified by chromosomal and plasmid copy number using PCR, from which the concentrations of chlamydial proteins per bacterial cell/genome were determined. RBs harvested at 15h post infection (PI) were purified by three successive rounds of gradient centrifugation. This is the earliest possible time to obtain purified RBs, free from host cell components in quantity, within the constraints of the technology. EBs were purified at 48h PI. We then used two-dimensional reverse phase UPLC to fractionate RB or EB peptides before mass spectroscopic analysis, providing absolute amount estimates of chlamydial proteins. The ability to express the data as molecules per cell gave ranking in both abundance and energy requirements for synthesis, allowing meaningful identification of rate-limiting components. The study assigned 562 proteins with high confidence and provided absolute estimates of protein concentration for 489 proteins. Interestingly, the data showed an increase in TTS capacity at 15h PI. Most of the enzymes involved in peptidoglycan biosynthesis were detected along with high levels of muramidase (in EBs) suggesting breakdown of peptidoglycan occurs in the non-dividing form of the microorganism. All the genome-encoded enzymes for glycolysis, pentose phosphate pathway and tricarboxylic acid cycle were identified and quantified; these data supported the observation that the EB is metabolically active. 
The availability of detailed, accurate quantitative proteomic data will be invaluable for investigations into gene regulation and function. PMID:26871455
The state of proteome profiling in the fungal genus Aspergillus.

PubMed

Kim, Yonghyun; Nandakumar, M P; Marten, Mark R

2008-03-01

Aspergilli are an important genus of filamentous fungi that contribute to a multibillion dollar industry. Since the genome sequences of many fungi were recently completed, it would be advantageous to profile their proteomes to better understand the fungal cell factory. Here, we review proteomic data generated for the Aspergilli in recent years. Thus far, a combined total of 28 cell surface, 102 secreted and 139 intracellular proteins have been identified based on 10 different studies on Aspergillus proteomics. A summary proteome map highlighting identified proteins in major metabolic pathways is presented.
Microbial genomics, transcriptomics and proteomics: new discoveries in decomposition research using complementary methods.

PubMed

Baldrian, Petr; López-Mondéjar, Rubén

2014-02-01

Molecular methods for the analysis of biomolecules have undergone rapid technological development in the last decade. The advent of next-generation sequencing methods and improvements in instrumental resolution enabled the analysis of complex transcriptome, proteome and metabolome data, as well as a detailed annotation of microbial genomes. The mechanisms of decomposition by model fungi have been described in unprecedented detail by the combination of genome sequencing, transcriptomics and proteomics. The increasing number of available genomes for fungi and bacteria shows that the genetic potential for decomposition of organic matter is widespread among taxonomically diverse microbial taxa, while expression studies document the importance of the regulation of expression in decomposition efficiency. Importantly, high-throughput methods of nucleic acid analysis used for the analysis of metagenomes and metatranscriptomes indicate the high diversity of decomposer communities in natural habitats and their taxonomic composition. Today, the metaproteomics of natural habitats is of interest. In combination with advanced analytical techniques to explore the products of decomposition and the accumulation of information on the genomes of environmentally relevant microorganisms, advanced methods in microbial ecophysiology should increase our understanding of the complex processes of organic matter transformation.
Frequently Asked Questions about Genetic and Genomic Science

MedlinePlus

... of the new genetic and genomic techniques and technologies? Proteomics The suffix "-ome" comes from the Greek ... pharmacogenomics is one of the large-scale "omic" technologies, it can examine the entirety of the genome, ...

Smooth Muscle Cell Genome Browser: Enabling the Identification of Novel Serum Response Factor Target Genes

PubMed Central

Lee, Moon Young; Park, Chanjae; Berent, Robyn M.; Park, Paul J.; Fuchs, Robert; Syn, Hannah; Chin, Albert; Townsend, Jared; Benson, Craig C.; Redelman, Doug; Shen, Tsai-wei; Park, Jong Kun; Miano, Joseph M.; Sanders, Kenton M.; Ro, Seungil

2015-01-01

Genome-scale expression data on the absolute numbers of gene isoforms offers essential clues in cellular functions and biological processes. Smooth muscle cells (SMCs) perform a unique contractile function through expression of specific genes controlled by serum response factor (SRF), a transcription factor that binds to DNA sites known as the CArG boxes. To identify SRF-regulated genes specifically expressed in SMCs, we isolated SMC populations from mouse small intestine and colon, obtained their transcriptomes, and constructed an interactive SMC genome and CArGome browser. To our knowledge, this is the first online resource that provides a comprehensive library of all genetic transcripts expressed in primary SMCs. The browser also serves as the first genome-wide map of SRF binding sites. The browser analysis revealed novel SMC-specific transcriptional variants and SRF target genes, which provided new and unique insights into the cellular and biological functions of the cells in gastrointestinal (GI) physiology. The SRF target genes in SMCs, which were discovered in silico, were confirmed by proteomic analysis of SMC-specific Srf knockout mice. Our genome browser offers a new perspective into the alternative expression of genes in the context of SRF binding sites in SMCs and provides a valuable reference for future functional studies. PMID:26241044
Macromolecule Mass Spectrometry: Citation Mining of User Documents

DTIC Science & Technology

2003-11-14

MCLUCKEY SA PURDUE UNIV USA 541 MANN M UNIV SO DENMARK DENMARK 450 BIEMANN K MIT USA 343 CHOWDHURY SK SANOFI WINTHROP INC USA 302 COVEY TR SCIEX LTD CANADA...glycopeptid 0.7, residu 0.7) (36) Cluster 8 (proteom 10.8, technolog 5.8, protein 5.7, genom 5.5, function 2.7, advanc 1.5, vaccin 1.2, new 1.1, biolog 1.1
HMMerThread: detecting remote, functional conserved domains in entire genomes by combining relaxed sequence-database searches with fold recognition.

PubMed

Bradshaw, Charles Richard; Surendranath, Vineeth; Henschel, Robert; Mueller, Matthias Stefan; Habermann, Bianca Hermine

2011-03-10

Conserved domains in proteins are one of the major sources of functional information for experimental design and genome-level annotation. Though search tools for conserved domain databases such as Hidden Markov Models (HMMs) are sensitive in detecting conserved domains in proteins when they share sufficient sequence similarity, they tend to miss more divergent family members, as they lack a reliable statistical framework for the detection of low sequence similarity. We have developed a greatly improved HMMerThread algorithm that can detect remotely conserved domains in highly divergent sequences. HMMerThread combines relaxed conserved domain searches with fold recognition to eliminate false positive, sequence-based identifications. With an accuracy of 90%, our software is able to automatically predict highly divergent members of conserved domain families with an associated 3-dimensional structure. We give additional confidence to our predictions by validation across species. We have run HMMerThread searches on eight proteomes including human and present a rich resource of remotely conserved domains, which adds significantly to the functional annotation of entire proteomes. We find ∼4500 cross-species validated, remotely conserved domain predictions in the human proteome alone. As an example, we find a DNA-binding domain in the C-terminal part of the A-kinase anchor protein 10 (AKAP10), a PKA adaptor that has been implicated in cardiac arrhythmias and premature cardiac death, which upon stress likely translocates from mitochondria to the nucleus/nucleolus. Based on our prediction, we propose that with this HLH-domain, AKAP10 is involved in the transcriptional control of stress response. Further remotely conserved domains we discuss are examples from areas such as sporulation, chromosome segregation and signalling during immune response. 
The HMMerThread algorithm is able to automatically detect the presence of remotely conserved domains in proteins based on weak sequence similarity. Our predictions open up new avenues for biological and medical studies. Genome-wide HMMerThread domains are available at http://vm1-hmmerthread.age.mpg.de.
HMMerThread: Detecting Remote, Functional Conserved Domains in Entire Genomes by Combining Relaxed Sequence-Database Searches with Fold Recognition

PubMed Central

Bradshaw, Charles Richard; Surendranath, Vineeth; Henschel, Robert; Mueller, Matthias Stefan; Habermann, Bianca Hermine

2011-01-01

Conserved domains in proteins are one of the major sources of functional information for experimental design and genome-level annotation. Though search tools for conserved domain databases such as Hidden Markov Models (HMMs) are sensitive in detecting conserved domains in proteins when they share sufficient sequence similarity, they tend to miss more divergent family members, as they lack a reliable statistical framework for the detection of low sequence similarity. We have developed a greatly improved HMMerThread algorithm that can detect remotely conserved domains in highly divergent sequences. HMMerThread combines relaxed conserved domain searches with fold recognition to eliminate false positive, sequence-based identifications. With an accuracy of 90%, our software is able to automatically predict highly divergent members of conserved domain families with an associated 3-dimensional structure. We give additional confidence to our predictions by validation across species. We have run HMMerThread searches on eight proteomes including human and present a rich resource of remotely conserved domains, which adds significantly to the functional annotation of entire proteomes. We find ∼4500 cross-species validated, remotely conserved domain predictions in the human proteome alone. As an example, we find a DNA-binding domain in the C-terminal part of the A-kinase anchor protein 10 (AKAP10), a PKA adaptor that has been implicated in cardiac arrhythmias and premature cardiac death, which upon stress likely translocates from mitochondria to the nucleus/nucleolus. Based on our prediction, we propose that with this HLH-domain, AKAP10 is involved in the transcriptional control of stress response. Further remotely conserved domains we discuss are examples from areas such as sporulation, chromosome segregation and signalling during immune response. 
The HMMerThread algorithm is able to automatically detect the presence of remotely conserved domains in proteins based on weak sequence similarity. Our predictions open up new avenues for biological and medical studies. Genome-wide HMMerThread domains are available at http://vm1-hmmerthread.age.mpg.de. PMID:21423752
Proteomics of industrial fungi: trends and insights for biotechnology.

PubMed

de Oliveira, José Miguel P Ferreira; de Graaff, Leo H

2011-01-01

Filamentous fungi are widely known for their industrial applications, namely, the production of food-processing enzymes and metabolites such as antibiotics and organic acids. In the past decade, the full genome sequencing of filamentous fungi increased the potential to predict encoded proteins enormously, namely, hydrolytic enzymes or proteins involved in the biosynthesis of metabolites of interest. The integration of genome sequence information with possible phenotypes requires, however, the knowledge of all the proteins in the cell in a system-wise manner, given by proteomics. This review summarises the progress of proteomics and its importance for the study of biotechnological processes in filamentous fungi. A major step forward in proteomics was to couple protein separation with high-resolution mass spectrometry, allowing accurate protein quantification. Despite the fact that most fungal proteomic studies have been focused on proteins from mycelial extracts, many proteins are related to processes which are compartmentalised in the fungal cell, e.g. β-lactam antibiotic production in the microbody. For the study of such processes, a targeted approach is required, e.g. by organelle proteomics. Typical workflows for sample preparation in fungal organelle proteomics are discussed, including homogenisation and sub-cellular fractionation. Finally, examples are presented of fungal organelle proteomic studies, which have enlarged the knowledge on areas of interest to biotechnology, such as protein secretion, energy production or antibiotic biosynthesis.
Proteomics of industrial fungi: trends and insights for biotechnology

PubMed Central

de Oliveira, José Miguel P. Ferreira

2010-01-01

Filamentous fungi are widely known for their industrial applications, namely, the production of food-processing enzymes and metabolites such as antibiotics and organic acids. In the past decade, the full genome sequencing of filamentous fungi increased the potential to predict encoded proteins enormously, namely, hydrolytic enzymes or proteins involved in the biosynthesis of metabolites of interest. The integration of genome sequence information with possible phenotypes requires, however, the knowledge of all the proteins in the cell in a system-wise manner, given by proteomics. This review summarises the progress of proteomics and its importance for the study of biotechnological processes in filamentous fungi. A major step forward in proteomics was to couple protein separation with high-resolution mass spectrometry, allowing accurate protein quantification. Despite the fact that most fungal proteomic studies have been focused on proteins from mycelial extracts, many proteins are related to processes which are compartmentalised in the fungal cell, e.g. β-lactam antibiotic production in the microbody. For the study of such processes, a targeted approach is required, e.g. by organelle proteomics. Typical workflows for sample preparation in fungal organelle proteomics are discussed, including homogenisation and sub-cellular fractionation. Finally, examples are presented of fungal organelle proteomic studies, which have enlarged the knowledge on areas of interest to biotechnology, such as protein secretion, energy production or antibiotic biosynthesis. PMID:20922379
Organellar proteomics reveals hundreds of novel nuclear proteins in the malaria parasite Plasmodium falciparum

PubMed Central

2012-01-01

Background The post-genomic era of malaria research provided unprecedented insights into the biology of Plasmodium parasites. Due to the large evolutionary distance to model eukaryotes, however, we lack a profound understanding of many processes in Plasmodium biology. One example is the cell nucleus, which controls the parasite genome in a development- and cell cycle-specific manner through mostly unknown mechanisms. To study this important organelle in detail, we conducted an integrative analysis of the P. falciparum nuclear proteome. Results We combined high accuracy mass spectrometry and bioinformatic approaches to present for the first time an experimentally determined core nuclear proteome for P. falciparum. Besides a large number of factors implicated in known nuclear processes, one-third of all detected proteins carry no functional annotation, including many phylum- or genus-specific factors. Importantly, extensive experimental validation using 30 transgenic cell lines confirmed the high specificity of this inventory, and revealed distinct nuclear localization patterns of hitherto uncharacterized proteins. Further, our detailed analysis identified novel protein domains potentially implicated in gene transcription pathways, and sheds important new light on nuclear compartments and processes including regulatory complexes, the nucleolus, nuclear pores, and nuclear import pathways. Conclusion Our study provides comprehensive new insight into the biology of the Plasmodium nucleus and will serve as an important platform for dissecting general and parasite-specific nuclear processes in malaria parasites. Moreover, as the first nuclear proteome characterized in any protist organism, it will provide an important resource for studying evolutionary aspects of nuclear biology. PMID:23181666
A complete mass spectrometric map for the analysis of the yeast proteome and its application to quantitative trait analysis

PubMed Central

Picotti, Paola; Clement-Ziza, Mathieu; Lam, Henry; Campbell, David S.; Schmidt, Alexander; Deutsch, Eric W.; Röst, Hannes; Sun, Zhi; Rinner, Oliver; Reiter, Lukas; Shen, Qin; Michaelson, Jacob J.; Frei, Andreas; Alberti, Simon; Kusebauch, Ulrike; Wollscheid, Bernd; Moritz, Robert; Beyer, Andreas; Aebersold, Ruedi

2013-01-01

Complete reference maps or datasets, like the genomic map of an organism, are highly beneficial tools for biological and biomedical research. Attempts to generate such reference datasets for a proteome have so far failed to reach complete proteome coverage, with saturation apparent at approximately two thirds of the proteomes tested, even for the most thoroughly characterized proteomes. Here, we used a strategy based on high-throughput peptide synthesis and mass spectrometry to generate a close to complete reference map (97% of the genome-predicted proteins) of the S. cerevisiae proteome. We generated two versions of this mass spectrometric map, one supporting discovery-driven (shotgun) and the other hypothesis-driven (targeted) proteomic measurements. The two versions of the map, therefore, constitute a complete set of proteomic assays to support most studies performed with contemporary proteomic technologies. The reference libraries can be browsed via a web-based repository and associated navigation tools. To demonstrate the utility of the reference libraries we applied them to a protein quantitative trait locus (pQTL) analysis, which requires measurement of the same peptides over a large number of samples with high precision. Protein measurements over a set of 78 S. cerevisiae strains revealed a complex relationship between independent genetic loci, impacting on the levels of related proteins. Our results suggest that selective pressure favors the acquisition of sets of polymorphisms that maintain the stoichiometry of protein complexes and pathways. PMID:23334424
Integrative proteomics, genomics, and translational immunology approaches reveal mutated forms of Proteolipid Protein 1 (PLP1) and mutant-specific immune response in multiple sclerosis.

PubMed

Qendro, Veneta; Bugos, Grace A; Lundgren, Debbie H; Glynn, John; Han, May H; Han, David K

2017-03-01

In order to gain mechanistic insights into multiple sclerosis (MS) pathogenesis, we utilized a multi-dimensional approach to test the hypothesis that mutations in myelin proteins lead to immune activation and central nervous system autoimmunity in MS. Mass spectrometry-based proteomic analysis of human MS brain lesions revealed seven unique mutations of PLP1, a key myelin protein that is known to be destroyed in MS. Surprisingly, in-depth analysis of two MS patients at the genomic DNA and mRNA levels confirmed mutated PLP1 in RNA, but not in the genomic DNA. Quantification of wild type and mutant PLP RNA levels by qPCR further validated the presence of mutant PLP RNA in the MS patients. To seek evidence linking mutations in abundant myelin proteins and immune-mediated destruction of myelin, specific immune response against mutant PLP1 in MS patients was examined. Thus, we have designed paired, wild type and mutant peptide microarrays, and examined antibody response to multiple mutated PLP1 in sera from MS patients. Consistent with the idea of different patients exhibiting unique mutation profiles, we found that 13 out of 20 MS patients showed antibody responses against specific but not against all the mutant-PLP1 peptides. Interestingly, we found mutant PLP-directed antibody response against specific mutant peptides in the sera of pre-MS controls. The results from integrative proteomic, genomic, and immune analyses reveal a possible mechanism of mutation-driven pathogenesis in human MS. The study also highlights the need for integrative genomic and proteomic analyses for uncovering pathogenic mechanisms of human diseases. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Prediction of vaccine candidates against Pseudomonas aeruginosa: An integrated genomics and proteomics approach.

PubMed

Rashid, Muhammad Ibrahim; Naz, Anam; Ali, Amjad; Andleeb, Saadia

2017-07-01

Pseudomonas aeruginosa is among the top critical nosocomial infectious agents due to its persistent infections and tendency for acquiring drug resistance mechanisms. To date, there is no vaccine available for this pathogen. We attempted to exploit the genomic and proteomic information of P. aeruginosa through reverse-vaccinology approaches to unveil the prospective vaccine candidates. The P. aeruginosa strain PAO1 genome was subjected to a sequential prioritization approach following genomic, proteomic and structural analyses. Among the predicted vaccine candidates, surface components of antibiotic efflux pumps (Q9HY88, PA2837), chaperone-usher pathway components (CupC2, CupB3), penicillin binding protein of bacterial cell wall (PBP1a/mrcA), extracellular component of Type 3 secretory system (PscC) and three uncharacterized secretory proteins (PA0629, PA2822, PA0978) were identified as potential candidates meeting all the set criteria. These proteins were then analyzed for potential immunogenic surface exposed epitopes. These predicted epitopes may provide a basis for development of a reliable subunit vaccine against P. aeruginosa. Copyright © 2017 Elsevier Inc. All rights reserved.
From the genome sequence to the protein inventory of Bacillus subtilis.

PubMed

Becher, Dörte; Büttner, Knut; Moche, Martin; Hessling, Bernd; Hecker, Michael

2011-08-01

Owing to the low number of proteins necessary to render a bacterial cell viable, bacteria are extremely attractive model systems to understand how the genome sequence is translated into actual life processes. One of the most intensively investigated model organisms is Bacillus subtilis. It has attracted world-wide research interest, addressing cell differentiation and adaptation on a molecular scale as well as biotechnological production processes. Meanwhile, we are looking back on more than 25 years of B. subtilis proteomics. A wide range of methods have been developed during this period for the large-scale qualitative and quantitative proteome analysis. Currently, it is possible to identify and quantify more than 50% of the predicted proteome in different cellular subfractions. In this review, we summarize the development of B. subtilis proteomics during the past 25 years. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Proteomic Analysis of Pigeonpea (Cajanus cajan) Seeds Reveals the Accumulation of Numerous Stress-Related Proteins.

PubMed

Krishnan, Hari B; Natarajan, Savithiry S; Oehrle, Nathan W; Garrett, Wesley M; Darwish, Omar

2017-06-14

Pigeonpea is one of the major sources of dietary protein for more than a billion people living in South Asia. This hardy legume is often grown in low-input and risk-prone marginal environments. Considerable research effort has been devoted by a global research consortium to develop genomic resources for the improvement of this legume crop. These efforts have resulted in the elucidation of the complete genome sequence of pigeonpea. Despite these developments, little is known about the seed proteome of this important crop. Here, we report the proteome of pigeonpea seed. To enable the isolation of the maximum number of seed proteins, including those that are present in very low amounts, three different protein fractions were obtained by employing different extraction media. High-resolution two-dimensional (2-D) electrophoresis followed by MALDI-TOF-TOF-MS/MS analysis of these protein fractions resulted in the identification of 373 pigeonpea seed proteins. Consistent with the reported high degree of synteny between the pigeonpea and soybean genomes, a large number of pigeonpea seed proteins exhibited significant amino acid homology with soybean seed proteins. Our proteomic analysis identified a large number of stress-related proteins, presumably due to the crop's adaptation to drought-prone environments. The availability of a pigeonpea seed proteome reference map should shed light on the roles of these identified proteins in various biological processes and facilitate the improvement of seed composition.
Omics approaches in food safety: fulfilling the promise?

PubMed Central

Bergholz, Teresa M.; Moreno Switt, Andrea I.; Wiedmann, Martin

2014-01-01

Genomics, transcriptomics, and proteomics are rapidly transforming our approaches to detection, prevention and treatment of foodborne pathogens. Microbial genome sequencing in particular has evolved from a research tool into an approach that can be used to characterize foodborne pathogen isolates as part of routine surveillance systems. Genome sequencing efforts will not only improve outbreak detection and source tracking, but will also create large amounts of foodborne pathogen genome sequence data, which will be available for data mining efforts that could facilitate better source attribution and provide new insights into foodborne pathogen biology and transmission. While practical uses and application of metagenomics, transcriptomics, and proteomics data and associated tools are less prominent, these tools are also starting to yield practical food safety solutions. PMID:24572764
Global functional atlas of Escherichia coli encompassing previously uncharacterized proteins.

PubMed

Hu, Pingzhao; Janga, Sarath Chandra; Babu, Mohan; Díaz-Mejía, J Javier; Butland, Gareth; Yang, Wenhong; Pogoutse, Oxana; Guo, Xinghua; Phanse, Sadhna; Wong, Peter; Chandran, Shamanta; Christopoulos, Constantine; Nazarians-Armavil, Anaies; Nasseri, Negin Karimi; Musso, Gabriel; Ali, Mehrab; Nazemof, Nazila; Eroukova, Veronika; Golshani, Ashkan; Paccanaro, Alberto; Greenblatt, Jack F; Moreno-Hagelsieb, Gabriel; Emili, Andrew

2009-04-28

One-third of the 4,225 protein-coding genes of Escherichia coli K-12 remain functionally unannotated (orphans). Many map to distant clades such as Archaea, suggesting involvement in basic prokaryotic traits, whereas others appear restricted to E. coli, including pathogenic strains. To elucidate the orphans' biological roles, we performed an extensive proteomic survey using affinity-tagged E. coli strains and generated comprehensive genomic context inferences to derive a high-confidence compendium for virtually the entire proteome consisting of 5,993 putative physical interactions and 74,776 putative functional associations, most of which are novel. Clustering of the respective probabilistic networks revealed putative orphan membership in discrete multiprotein complexes and functional modules together with annotated gene products, whereas a machine-learning strategy based on network integration implicated the orphans in specific biological processes. We provide additional experimental evidence supporting orphan participation in protein synthesis, amino acid metabolism, biofilm formation, motility, and assembly of the bacterial cell envelope. This resource provides a "systems-wide" functional blueprint of a model microbe, with insights into the biological and evolutionary significance of previously uncharacterized proteins.
Large Scale Proteomic Data and Network-Based Systems Biology Approaches to Explore the Plant World.

PubMed

Di Silvestre, Dario; Bergamaschi, Andrea; Bellini, Edoardo; Mauri, PierLuigi

2018-06-03

The investigation of plant organisms by means of data-derived systems biology approaches based on network modeling is mainly characterized by genomic data, while the potential of proteomics is largely unexplored. This delay is mainly caused by the paucity of plant genomic/proteomic sequences and annotations, which are fundamental to the interpretation of mass-spectrometry (MS) data. However, Next Generation Sequencing (NGS) techniques are helping to fill this gap, and an increasing number of studies are focusing on plant proteome profiling and the identification of protein-protein interactions (PPIs). Interesting results were obtained by evaluating the topology of PPI networks in the context of organ-associated biological processes as well as plant-pathogen relationships. These examples foreshadow the benefits that these approaches may provide to plant research. Thus, in addition to providing an overview of the main -omic technologies recently used on plant organisms, we will focus on studies that rely on the concepts of module, hub and shortest path, and on how they can contribute to plant discovery processes. In this scenario, we will also consider gene co-expression networks, and we will mention some examples of integration with metabolomic data and genome-wide association studies (GWAS) to select candidate genes.
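The hub and shortest-path concepts named in this abstract can be illustrated with a minimal sketch. The network below is a made-up toy graph with placeholder protein names (P1 to P6), not data from the review, and the degree threshold for calling a node a hub is an arbitrary assumption. Module detection is not shown.

```python
from collections import deque

# Toy undirected PPI network; names are placeholders, not real proteins.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P4", "P5"), ("P5", "P6")]

def build_graph(edge_list):
    """Return an adjacency map {node: set(neighbours)} for an undirected graph."""
    graph = {}
    for a, b in edge_list:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def hubs(graph, min_degree):
    """Nodes with at least min_degree interaction partners (threshold is an assumption)."""
    return sorted(n for n, nbrs in graph.items() if len(nbrs) >= min_degree)

def shortest_path(graph, start, goal):
    """Breadth-first search; returns one shortest path as a list, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in sorted(graph[node]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

graph = build_graph(edges)
print(hubs(graph, 3))                    # nodes with three or more partners
print(shortest_path(graph, "P2", "P6"))  # one shortest route between two proteins
```

In a real analysis the edge list would come from an interaction database or experiment, and a graph library would typically replace the hand-written search.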
Gene duplications are extensive and contribute significantly to the toxic proteome of nematocysts isolated from Acropora digitifera (Cnidaria: Anthozoa: Scleractinia).

PubMed

Gacesa, Ranko; Chung, Ray; Dunn, Simon R; Weston, Andrew J; Jaimes-Becerra, Adrian; Marques, Antonio C; Morandini, André C; Hranueli, Daslav; Starcevic, Antonio; Ward, Malcolm; Long, Paul F

2015-10-13

Gene duplication followed by adaptive selection is a well-accepted process leading to toxin diversification in venoms. However, emergent genomic, transcriptomic and proteomic evidence now challenges this role to be at best equivocal to other processes. Cnidaria are arguably the most ancient phylum of the extant metazoa that are venomous and as such provide a definitive ancestral anchor to examine the evolution of this trait. Here we compare predicted toxins from the translated genome of the coral Acropora digitifera to putative toxins revealed by proteomic analysis of soluble proteins discharged from nematocysts, to determine the extent to which gene duplications contribute to venom innovation in this reef-building coral species. A new bioinformatics tool called HHCompare was developed to detect potential gene duplications in the genomic data, and is made freely available (https://github.com/rgacesa/HHCompare). A total of 55 potential toxin encoding genes could be predicted from the A. digitifera genome, of which 36 (65%) had likely arisen by gene duplication as evinced using the HHCompare tool and verified using two standard phylogeny methods. Surprisingly, only 22% (12/55) of the potential toxin repertoire could be detected following rigorous proteomic analysis, for which only half (6/12) of the toxin proteome could be accounted for as peptides encoded by the gene duplicates. Biological activities of these toxins are dominated by putative phospholipases and toxic peptidases. Gene expansions in A. digitifera venom are the most extensive yet described in any venomous animal, and gene duplication plays a significant role leading to toxin diversification in this coral species. Since such low numbers of toxins were detected in the proteome, it is unlikely that the venom is evolving rapidly by prey-driven positive natural selection. Rather we contend that the venom has a defensive role deterring predation or harm from interspecific competition and overgrowth by fouling organisms. Factors influencing translation of toxin encoding genes perhaps warrant more profound experimental consideration.
Announcing the Launch of CPTAC’s Proteogenomics DREAM Challenge | Office of Cancer Clinical Proteomics Research

Cancer.gov

This week, we are excited to announce the launch of the National Cancer Institute’s Clinical Proteomic Tumor Analysis Consortium (CPTAC) Proteogenomics Computational DREAM Challenge. The aim of this Challenge is to encourage the generation of computational methods for extracting information from the cancer proteome and for linking those data to genomic and transcriptomic information. The specific goals are to predict proteomic and phosphoproteomic data from multiple other data types, including transcriptomics and genetics.
Global Shifts in Genome and Proteome Composition Are Very Tightly Coupled

PubMed Central

Brbić, Maria; Warnecke, Tobias; Kriško, Anita; Supek, Fran

2015-01-01

The amino acid composition (AAC) of proteomes differs greatly between microorganisms and is associated with the environmental niche they inhabit, suggesting that these changes may be adaptive. Similarly, the oligonucleotide composition of genomes varies and may confer advantages at the DNA/RNA level. These influences overlap in protein-coding sequences, making it difficult to gauge their relative contributions. We disentangle these effects by systematically evaluating the correspondence between intergenic nucleotide composition, where protein-level selection is absent, the AAC, and ecological parameters of 909 prokaryotes. We find that G + C content, the most frequently used measure of genomic composition, cannot capture diversity in AAC and across ecological contexts. However, di-/trinucleotide composition in intergenic DNA predicts amino acid frequencies of proteomes to the point where very little cross-species variability remains unexplained (91% of variance accounted for). Qualitatively similar results were obtained for 49 fungal genomes, where 80% of the variability in AAC could be explained by the composition of introns and intergenic regions. Upon factoring out oligonucleotide composition and phylogenetic inertia, the residual AAC is poorly predictive of the microbes’ ecological preferences, in stark contrast with the original AAC. Moreover, highly expressed genes do not exhibit more prominent environment-related AAC signatures than lowly expressed genes, despite contributing more to the effective proteome. Thus, evolutionary shifts in overall AAC appear to occur almost exclusively through factors shaping the global oligonucleotide content of the genome. We discuss these results in light of contravening evidence from biophysical data and further reading frame-specific analyses that suggest that adaptation takes place at the protein level. PMID:25971281
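The quantities compared in this abstract (G + C content, di-/trinucleotide composition, and amino acid composition) can each be computed from raw sequences. The sketch below shows only that feature-extraction step on made-up toy strings; the regression that relates intergenic composition to AAC across species is not reproduced, and all function names are illustrative assumptions.

```python
from collections import Counter

def kmer_frequencies(dna, k):
    """Relative frequencies of overlapping k-mers (k=2 or 3 for di-/trinucleotides)."""
    dna = dna.upper()
    counts = Counter(dna[i:i + k] for i in range(len(dna) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}

def gc_content(dna):
    """Fraction of G and C bases in a DNA string."""
    dna = dna.upper()
    return (dna.count("G") + dna.count("C")) / len(dna)

def amino_acid_composition(protein):
    """Relative frequency of each amino acid in a protein (or concatenated proteome)."""
    counts = Counter(protein.upper())
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}

# Toy inputs, invented for illustration only.
intergenic = "ATGCGCATTA"
proteome = "MKKLAA"

print(gc_content(intergenic))
print(kmer_frequencies(intergenic, 2))
print(amino_acid_composition(proteome))
```

With real genomes, the k-mer vectors of many species would form the predictor matrix and the AAC vectors the response.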
Detailed tail proteomic analysis of axolotl (Ambystoma mexicanum) using an mRNA-seq reference database.

PubMed

Demircan, Turan; Keskin, Ilknur; Dumlu, Seda Nilgün; Aytürk, Nilüfer; Avşaroğlu, Mahmut Erhan; Akgün, Emel; Öztürk, Gürkan; Baykal, Ahmet Tarık

2017-01-01

Salamander axolotl has been emerging as an important model for stem cell research due to its powerful regenerative capacity. Several advantages, such as the high capability of advanced tissue, organ, and appendages regeneration, promote axolotl as an ideal model system to extend our current understanding on the mechanisms of regeneration. Acknowledging the common molecular pathways between amphibians and mammals, there is a great potential to translate the messages from axolotl research to mammalian studies. However, the utilization of axolotl is hindered due to the lack of reference databases of genomic, transcriptomic, and proteomic data. Here, we introduce the proteome analysis of the axolotl tail section searched against an mRNA-seq database. We translated axolotl mRNA sequences to protein sequences and annotated these to process the LC-MS/MS data and identified 1001 nonredundant proteins. Functional classification of identified proteins was performed by gene ontology searches. The presence of some of the identified proteins was validated by in situ antibody labeling. Furthermore, we have analyzed the proteome expressional changes postamputation at three time points to evaluate the underlying mechanisms of the regeneration process. Taken together, this work expands the proteomics data of axolotl to contribute to its establishment as a fully utilized model. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
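The step of translating mRNA sequences into protein sequences for use as a search database can be sketched as follows. This is a minimal illustration using the standard genetic code and forward frames only; the abstract does not state which frames, length filters, or tools the authors used, so those choices here are assumptions, and the example sequence is invented.

```python
from itertools import product

# Standard genetic code, with codons ordered by base T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(mrna, frame=0):
    """Translate one forward reading frame; codons with unknown bases become 'X'."""
    seq = mrna.upper().replace("U", "T")[frame:]
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)

def open_reading_frames(mrna, min_length=2):
    """Yield (frame, protein segment) pairs between stop codons in the three forward frames.

    min_length is an illustrative filter, not a value from the study.
    """
    for frame in range(3):
        for segment in translate(mrna, frame).split("*"):
            if len(segment) >= min_length:
                yield frame, segment

example = "AUGGCCAAGUAA"  # invented sequence
print(translate(example))
print(list(open_reading_frames(example)))
```

The resulting protein segments would then be written to a FASTA file and supplied to the LC-MS/MS search engine.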
Extracellular proteome analysis of Leptospira interrogans serovar Lai.

PubMed

Zeng, Lingbing; Zhang, Yunyi; Zhu, Yongzhang; Yin, Haidi; Zhuang, Xuran; Zhu, Weinan; Guo, Xiaokui; Qin, Jinhong

2013-10-01

Leptospirosis is one of the most important zoonoses. Leptospira interrogans serovar Lai is a pathogenic spirochete that is responsible for leptospirosis. Extracellular proteins play an important role in the pathogenicity of this bacterium. In this study, L. interrogans serovar Lai was grown in protein-free medium; the supernatant was collected and subsequently analyzed as the extracellular proteome. A total of 66 proteins with more than two unique peptides were detected by MS/MS, and 33 of these were predicted to be extracellular proteins by a combination of bioinformatics analyses, including PSORTb, CELLO, SOSUIGramN and SignalP. Comparisons of the transcriptional levels of these 33 genes between in vivo and in vitro conditions revealed that 15 genes were upregulated and two genes were downregulated in vivo compared to in vitro. A BLAST search for the components of secretion systems at the genomic and proteomic levels revealed the presence of the complete type I secretion system and type II secretion system in this strain. Moreover, this strain also exhibits complete Sec translocase and Tat translocase systems. The extracellular proteome analysis of L. interrogans will supplement the previously generated whole proteome data and provide more information for studying the functions of specific proteins in the infection process and for selecting candidate molecules for vaccines or diagnostic tools for leptospirosis.

Hydroponics on a chip: analysis of the Fe deficient Arabidopsis thylakoid membrane proteome.

PubMed

Laganowsky, Arthur; Gómez, Stephen M; Whitelegge, Julian P; Nishio, John N

2009-04-13

The model plant Arabidopsis thaliana was used to evaluate the thylakoid membrane proteome under Fe-deficient conditions. Plants were cultivated using a novel hydroponic system, called "hydroponics on a chip", which yields highly reproducible plant tissue samples for physiological analyses, and can be easily used for in vivo stable isotope labeling. The thylakoid membrane proteome, from intact chloroplasts isolated from Fe-sufficient and Fe-deficient plants grown with hydroponics on a chip, was analyzed using liquid chromatography coupled to mass spectrometry. Intact masses of thylakoid membrane proteins were measured, many for the first time, and several proteins were identified with post-translational modifications that were altered by Fe deficiency; for example, the doubly phosphorylated form of the photosystem II oxygen evolving complex, PSBH, increased under Fe-deficiency. Increased levels of photosystem II protein subunit PSBS were detected in the Fe-deficient samples. Antioxidant enzymes, including ascorbate peroxidase and peroxiredoxin Q, were only detected in the Fe-deficient samples. We present the first biochemical evidence that the two major LHC IIb proteins (LHCB1 and LHCB2) may have significantly different functions in the thylakoid membrane. The study illustrates the utility of intact mass proteomics as an indispensable tool for functional genomics. "Hydroponics on a chip" provides the ability to grow A. thaliana under defined conditions that will be useful for systems biology.
Integrative computational approach for genome-based study of microbial lipid-degrading enzymes.

PubMed

Vorapreeda, Tayvich; Thammarongtham, Chinae; Laoteng, Kobkul

2016-07-01

Lipid-degrading or lipolytic enzymes have gained enormous attention in academic and industrial sectors. Several efforts are underway to discover new lipase enzymes from a variety of microorganisms with particular catalytic properties to be used for extensive applications. In addition, various tools and strategies have been implemented to unravel the functional relevance of the versatile lipid-degrading enzymes for special purposes. This review highlights the study of microbial lipid-degrading enzymes through an integrative computational approach. The identification of putative lipase genes from microbial genomes and metagenomic libraries using homology-based mining is discussed, with an emphasis on sequence analysis of conserved motifs and enzyme topology. Molecular modelling of three-dimensional structure on the basis of sequence similarity is shown to be a potential approach for exploring the structural and functional relationships of candidate lipase enzymes. The perspectives on a discriminative framework of cutting-edge tools and technologies, including bioinformatics, computational biology, functional genomics and functional proteomics, intended to facilitate rapid progress in understanding lipolysis mechanism and to discover novel lipid-degrading enzymes of microorganisms are discussed.
The wheat chloroplastic proteome.

PubMed

Kamal, Abu Hena Mostafa; Cho, Kun; Choi, Jong-Soon; Bae, Kwang-Hee; Komatsu, Setsuko; Uozumi, Nobuyuki; Woo, Sun Hee

2013-11-20

With the availability of plant genome sequences, mass spectrometry-based analysis of plant proteins has become promising and popular. Determining the proteome of a cell is still a challenging task, complicated by proteome dynamics and complexity. Chloroplasts are of particular interest to plant biologists because of their intricate biochemical pathways for indispensable metabolic functions. This review gives an overview of proteomic studies conducted in wheat, with a special focus on subcellular proteomics of the chloroplast and on salt and water stress. In recent years, we and other groups have attempted to understand photosynthesis in wheat and abiotic stress under imposed salt and water deficit during the vegetative stage. Those studies provide interesting results leading to a better understanding of photosynthesis and to the identification of stress-responsive proteins. Indeed, recent studies have aimed at resolving the photosynthesis pathway in wheat. We focus on proteomic analyses that combine two complementary approaches, 2-DE and shotgun methods, coupled to high-throughput mass spectrometry (LTQ-FTICR and MALDI-TOF/TOF) in order to better understand the proteins responsible for photosynthesis and abiotic stress (salt and water) responses in the wheat chloroplast. In this review we discuss the identification of the most abundant proteins in the wheat chloroplast and of stress-responsive proteins under salt and water stress in the chloroplasts of wheat seedlings, thus providing a proteomic view of the events during the development of the seedling under stress conditions. Chloroplasts are of particular interest to plant biologists because of their intricate biochemical pathways for indispensable metabolic functions. This overview covers proteomic studies conducted in wheat, with a special focus on subcellular proteomics of the chloroplast and on salt and water stress. We have attempted to understand photosynthesis in wheat and abiotic stress under imposed salt and water deficit during the seedling stage. Those studies provide interesting results leading to a better understanding of photosynthesis and to the identification of stress-responsive proteins. Our studies aimed at resolving the photosynthesis pathway in wheat. We highlight proteomic analyses that combine two complementary approaches, Tricine SDS-PAGE and 2-DE, coupled to high-throughput mass spectrometry (LTQ-FTICR and MALDI-TOF/TOF) in order to better understand the proteins responsible for photosynthesis and abiotic stress (salt and water) responses in the wheat chloroplast. This article is part of a Special Issue entitled: Translational Plant Proteomics. Crown Copyright © 2013. Published by Elsevier B.V. All rights reserved.
Genome-wide screening and identification of antigens for rickettsial vaccine development

USDA-ARS's Scientific Manuscript database

The capacity to identify immunogens for vaccine development by genome-wide screening has been markedly enhanced by the availability of complete microbial genome sequences coupled to rapid proteomic and bioinformatic analysis. Critical to this genome-wide screening is in vivo testing in the context o...
The bovine lactation genome: Insights into the evolution of mammalian milk

USDA-ARS's Scientific Manuscript database

The newly assembled Bos taurus genome sequence enables the linkage of bovine milk and lactation data with other mammalian genomes. Using publicly available milk proteome data and mammary expressed sequence tags, 197 milk protein genes and over 6,000 mammary genes were identified in the bovine genome...
Molecular predictors of therapeutic response to specific anti-cancer agents

DOEpatents

Spellman, Paul T.; Gray, Joe W.; Sadanandam, Anguraj; Heiser, Laura M.; Gibb, William J.; Kuo, Wen-lin; Wang, Nicholas J.

2016-11-29

Herein is described the use of a collection of 50 breast cancer cell lines to match responses to 77 conventional and experimental therapeutic agents with transcriptional, proteomic and genomic subtypes found in primary tumors. Almost all compounds produced strong differential responses across the cell lines; these responses were associated with transcriptional and proteomic subtypes and with recurrent genome copy number abnormalities. These associations can now be incorporated into clinical trials that test subtype markers and clinical responses simultaneously.
CPTAC Announces New PTRCs, PCCs, and PGDACs | Office of Cancer Clinical Proteomics Research

Cancer.gov

This week, the Office of Cancer Clinical Proteomics Research (OCCPR) at the National Cancer Institute (NCI), part of the National Institutes of Health, announced its aim to further the convergence of proteomics with genomics – “proteogenomics,” to better understand the molecular basis of cancer and accelerate research in these areas by disseminating research resources to the scientific community.
SALAD database: a motif-based database of protein annotations for plant comparative genomics

PubMed Central

Mihara, Motohiro; Itoh, Takeshi; Izawa, Takeshi

2010-01-01

Proteins often have several motifs with distinct evolutionary histories. Proteins with similar motifs have similar biochemical properties and thus related biological functions. We constructed a unique comparative genomics database termed the SALAD database (http://salad.dna.affrc.go.jp/salad/) from plant-genome-based proteome data sets. We extracted evolutionarily conserved motifs by MEME software from 209 529 protein-sequence annotation groups selected by BLASTP from the proteome data sets of 10 species: rice, sorghum, Arabidopsis thaliana, grape, a lycophyte, a moss, 3 algae, and yeast. Similarity clustering of each protein group was performed by pairwise scoring of the motif patterns of the sequences. The SALAD database provides a user-friendly graphical viewer that displays a motif pattern diagram linked to the resulting bootstrapped dendrogram for each protein group. Amino-acid-sequence-based and nucleotide-sequence-based phylogenetic trees for motif combination alignment, a logo comparison diagram for each clade in the tree, and a Pfam-domain pattern diagram are also available. We also developed a viewer named ‘SALAD on ARRAYs’ to view arbitrary microarray data sets of paralogous genes linked to the same dendrogram in a window. The SALAD database is a powerful tool for comparing protein sequences and can provide valuable hints for biological analysis. PMID:19854933
SALAD database: a motif-based database of protein annotations for plant comparative genomics.

PubMed

Mihara, Motohiro; Itoh, Takeshi; Izawa, Takeshi

2010-01-01

Proteins often have several motifs with distinct evolutionary histories. Proteins with similar motifs have similar biochemical properties and thus related biological functions. We constructed a unique comparative genomics database termed the SALAD database (http://salad.dna.affrc.go.jp/salad/) from plant-genome-based proteome data sets. We extracted evolutionarily conserved motifs by MEME software from 209,529 protein-sequence annotation groups selected by BLASTP from the proteome data sets of 10 species: rice, sorghum, Arabidopsis thaliana, grape, a lycophyte, a moss, 3 algae, and yeast. Similarity clustering of each protein group was performed by pairwise scoring of the motif patterns of the sequences. The SALAD database provides a user-friendly graphical viewer that displays a motif pattern diagram linked to the resulting bootstrapped dendrogram for each protein group. Amino-acid-sequence-based and nucleotide-sequence-based phylogenetic trees for motif combination alignment, a logo comparison diagram for each clade in the tree, and a Pfam-domain pattern diagram are also available. We also developed a viewer named 'SALAD on ARRAYs' to view arbitrary microarray data sets of paralogous genes linked to the same dendrogram in a window. The SALAD database is a powerful tool for comparing protein sequences and can provide valuable hints for biological analysis.
Meta-analysis of global metabolomics and proteomics data to link alterations with phenotype

DOE PAGES

Patti, Gary J.; Tautenhahn, Ralf; Fonslow, Bryan R.; ...

2011-01-01

Global metabolomics has emerged as a powerful tool to interrogate cellular biochemistry at the systems level by tracking alterations in the levels of small molecules. One approach to define cellular dynamics with respect to this dysregulation of small molecules has been to consider metabolic flux as a function of time. While flux measurements have proven effective for model organisms, acquiring multiple time points at appropriate temporal intervals for many sample types (e.g., clinical specimens) is challenging. As an alternative, meta-analysis provides another strategy for delineating metabolic cause and effect perturbations. That is, the combination of untargeted metabolomic data from multiple pairwise comparisons enables the association of specific changes in small molecules with unique phenotypic alterations. We recently developed metabolomic software called metaXCMS to automate these types of higher order comparisons. Here we discuss the potential of metaXCMS for analyzing proteomic datasets and highlight the biological value of combining meta-results from both metabolomic and proteomic analyses. The combined meta-analysis has the potential to facilitate efforts in functional genomics and the identification of metabolic disruptions related to disease pathogenesis.
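The idea of combining several pairwise comparisons to isolate shared or unique alterations can be sketched with plain set operations. This is an illustrative simplification and not the metaXCMS algorithm, which works on aligned feature tables with tolerances; the comparison names and feature identifiers below are invented.

```python
def shared_alterations(comparisons):
    """Features flagged as altered in every pairwise comparison."""
    sets = [set(features) for features in comparisons.values()]
    return set.intersection(*sets) if sets else set()

def unique_alterations(comparisons, name):
    """Features altered in the named comparison and in none of the others."""
    others = set().union(*(set(f) for n, f in comparisons.items() if n != name))
    return set(comparisons[name]) - others

# Hypothetical results: each comparison lists features that changed versus control.
comparisons = {
    "modelA_vs_control": {"m1", "m2", "m3"},
    "modelB_vs_control": {"m2", "m3", "m4"},
    "modelC_vs_control": {"m3", "m5"},
}

print(shared_alterations(comparisons))
print(unique_alterations(comparisons, "modelA_vs_control"))
```

The same pattern applies whether the features are metabolites or proteins, which is the extension the abstract proposes.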
Functional environmental proteomics: elucidating the role of a c-type cytochrome abundant during uranium bioremediation

PubMed Central

Yun, Jiae; Malvankar, Nikhil S; Ueki, Toshiyuki; Lovley, Derek R

2016-01-01

Studies with pure cultures of dissimilatory metal-reducing microorganisms have demonstrated that outer-surface c-type cytochromes are important electron transfer agents for the reduction of metals, but previous environmental proteomic studies have typically not recovered cytochrome sequences from subsurface environments in which metal reduction is important. Gel-separation, heme-staining and mass spectrometry of proteins in groundwater from in situ uranium bioremediation experiments identified a putative c-type cytochrome, designated Geobacter subsurface c-type cytochrome A (GscA), encoded within the genome of strain M18, a Geobacter isolate previously recovered from the site. Homologs of GscA were identified in the genomes of other Geobacter isolates in the phylogenetic cluster known as subsurface clade 1, which predominates in a diversity of Fe(III)-reducing subsurface environments. Most of the gscA sequences recovered from groundwater genomic DNA clustered in a tight phylogenetic group closely related to strain M18. GscA was most abundant in groundwater samples in which Geobacter sp. predominated. Expression of gscA in a strain of Geobacter sulfurreducens that lacked the gene for the c-type cytochrome OmcS, thought to facilitate electron transfer from conductive pili to Fe(III) oxide, restored the capacity for Fe(III) oxide reduction. Atomic force microscopy provided evidence that GscA was associated with the pili. These results demonstrate that a c-type cytochrome with an apparent function similar to that of OmcS is abundant when Geobacter sp. are abundant in the subsurface, providing insight into the mechanisms for the growth of subsurface Geobacter sp. on Fe(III) oxide and suggesting an approach for functional analysis of other Geobacter proteins found in the subsurface. PMID:26140532
Functional environmental proteomics: elucidating the role of a c-type cytochrome abundant during uranium bioremediation.

PubMed

Yun, Jiae; Malvankar, Nikhil S; Ueki, Toshiyuki; Lovley, Derek R

2016-02-01

Studies with pure cultures of dissimilatory metal-reducing microorganisms have demonstrated that outer-surface c-type cytochromes are important electron transfer agents for the reduction of metals, but previous environmental proteomic studies have typically not recovered cytochrome sequences from subsurface environments in which metal reduction is important. Gel-separation, heme-staining and mass spectrometry of proteins in groundwater from in situ uranium bioremediation experiments identified a putative c-type cytochrome, designated Geobacter subsurface c-type cytochrome A (GscA), encoded within the genome of strain M18, a Geobacter isolate previously recovered from the site. Homologs of GscA were identified in the genomes of other Geobacter isolates in the phylogenetic cluster known as subsurface clade 1, which predominates in a diversity of Fe(III)-reducing subsurface environments. Most of the gscA sequences recovered from groundwater genomic DNA clustered in a tight phylogenetic group closely related to strain M18. GscA was most abundant in groundwater samples in which Geobacter sp. predominated. Expression of gscA in a strain of Geobacter sulfurreducens that lacked the gene for the c-type cytochrome OmcS, thought to facilitate electron transfer from conductive pili to Fe(III) oxide, restored the capacity for Fe(III) oxide reduction. Atomic force microscopy provided evidence that GscA was associated with the pili. These results demonstrate that a c-type cytochrome with an apparent function similar to that of OmcS is abundant when Geobacter sp. are abundant in the subsurface, providing insight into the mechanisms for the growth of subsurface Geobacter sp. on Fe(III) oxide and suggesting an approach for functional analysis of other Geobacter proteins found in the subsurface.
A comprehensive and scalable database search system for metaproteomics.

PubMed

Chatterjee, Sandip; Stupp, Gregory S; Park, Sung Kyu Robin; Ducom, Jean-Christophe; Yates, John R; Su, Andrew I; Wolan, Dennis W

2016-08-16

Mass spectrometry-based shotgun proteomics experiments rely on accurate matching of experimental spectra against a database of protein sequences. Existing computational analysis methods are limited in the size of their sequence databases, which severely restricts the proteomic sequencing depth and functional analysis of highly complex samples. The growing amount of public high-throughput sequencing data will only exacerbate this problem. We designed a broadly applicable metaproteomic analysis method (ComPIL) that addresses protein database size limitations. Our approach to overcome this significant limitation in metaproteomics was to design a scalable set of sequence databases assembled for optimal library querying speeds. ComPIL was integrated with a modified version of the search engine ProLuCID (termed "Blazmass") to permit rapid matching of experimental spectra. Proof-of-principle analysis of human HEK293 lysate with a ComPIL database derived from high-quality genomic libraries was able to detect nearly all of the same peptides as a search with a human database (~500x fewer peptides in the database), with a small reduction in sensitivity. We were also able to detect proteins from the adenovirus used to immortalize these cells. We applied our method to a set of healthy human gut microbiome proteomic samples and showed a substantial increase in the number of identified peptides and proteins compared to previous metaproteomic analyses, while retaining a high degree of protein identification accuracy and allowing for a more in-depth characterization of the functional landscape of the samples. The combination of ComPIL with Blazmass allows proteomic searches to be performed with database sizes much larger than previously possible. These large database searches can be applied to complex meta-samples with unknown composition or proteomic samples where unexpected proteins may be identified. 
The protein database, proteomic search engine, and the proteomic data files for the 5 microbiome samples characterized and discussed herein are open source and available for use and additional analysis.
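Why database size matters for spectrum matching can be seen from a minimal sketch of how a peptide lookup table is built from protein sequences. The code below performs an in silico tryptic digest and indexes peptides by parent protein. It is not the ComPIL or Blazmass implementation; the cleavage rule (after K or R, not before P), the absence of missed cleavages, the length filter, and the protein sequences are all assumptions made for illustration.

```python
import re

def tryptic_peptides(protein, min_length=2):
    """Cleave after K or R unless the next residue is P; no missed cleavages.

    Splitting on an empty (zero-width) match requires Python 3.7 or later.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", protein.upper())
    return [p for p in pieces if len(p) >= min_length]

def build_peptide_index(proteins):
    """Map each peptide to the set of protein names that contain it."""
    index = {}
    for name, seq in proteins.items():
        for pep in tryptic_peptides(seq):
            index.setdefault(pep, set()).add(name)
    return index

# Invented sequences; a real database would hold millions of entries.
proteins = {
    "protA": "MKTAYIAKQRPLK",
    "protB": "AKTAYIAKGG",
}

index = build_peptide_index(proteins)
print(tryptic_peptides(proteins["protA"]))
print(sorted(index["TAYIAK"]))
```

A real search engine scores experimental spectra against theoretical fragment masses rather than looking up exact strings, but the index grows with the database in the same way.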
The First Genomic and Proteomic Characterization of a Deep-Sea Sulfate Reducer: Insights into the Piezophilic Lifestyle of Desulfovibrio piezophilus

PubMed Central

Pradel, Nathalie; Ji, Boyang; Gimenez, Grégory; Talla, Emmanuel; Lenoble, Patricia; Garel, Marc; Tamburini, Christian; Fourquet, Patrick; Lebrun, Régine; Bertin, Philippe; Denis, Yann; Pophillat, Matthieu; Barbe, Valérie; Ollivier, Bernard; Dolla, Alain

2013-01-01

Desulfovibrio piezophilus strain C1TLV30T is a piezophilic anaerobe that was isolated from wood falls in the Mediterranean deep-sea. D. piezophilus represents a unique model for studying the adaptation of sulfate-reducing bacteria to hydrostatic pressure. Here, we report the 3.6 Mbp genome sequence of this piezophilic bacterium. An analysis of the genome revealed the presence of seven genomic islands as well as gene clusters that are most likely linked to life at a high hydrostatic pressure. Comparative genomics and differential proteomics identified the transport of solutes and amino acids as well as amino acid metabolism as major cellular processes for the adaptation of this bacterium to hydrostatic pressure. In addition, the proteome profiles showed that the abundance of key enzymes that are involved in sulfate reduction was dependent on hydrostatic pressure. A comparative analysis of orthologs from the non-piezophilic marine bacterium D. salexigens and D. piezophilus identified aspartic acid, glutamic acid, lysine, asparagine, serine and tyrosine as the amino acids preferentially replaced by arginine, histidine, alanine and threonine in the piezophilic strain. This work reveals the adaptation strategies developed by a sulfate reducer to a deep-sea lifestyle. PMID:23383081
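The ortholog comparison described here amounts to tallying which residues are replaced by which across aligned sequence pairs. A minimal sketch follows, assuming the two sequences are already aligned to equal length with "-" marking gaps. The sequences are invented and the abstract does not specify the authors' alignment or counting procedure.

```python
from collections import Counter

def replacement_counts(reference, query):
    """Count (reference residue, query residue) pairs at mismatched, ungapped positions."""
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    return Counter(
        (a, b)
        for a, b in zip(reference, query)
        if a != b and a != "-" and b != "-"
    )

# Hypothetical aligned fragments: non-piezophilic reference versus piezophilic ortholog.
reference = "MDEKS-Y"
query = "MRHKAAT"

counts = replacement_counts(reference, query)
print(counts)
```

Summing such counters over all ortholog pairs would give genome-wide replacement preferences of the kind reported in the abstract.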
Proteogenomic Investigation of Strain Variation in Clinical Mycobacterium tuberculosis Isolates.

PubMed

Heunis, Tiaan; Dippenaar, Anzaan; Warren, Robin M; van Helden, Paul D; van der Merwe, Ruben G; Gey van Pittius, Nicolaas C; Pain, Arnab; Sampson, Samantha L; Tabb, David L

2017-10-06

Mycobacterium tuberculosis consists of a large number of different strains that display unique virulence characteristics. Whole-genome sequencing has revealed substantial genetic diversity among clinical M. tuberculosis isolates, and elucidating the phenotypic variation encoded by this genetic diversity will be of the utmost importance to fully understand M. tuberculosis biology and pathogenicity. In this study, we integrated whole-genome sequencing and mass spectrometry (GeLC-MS/MS) to reveal strain-specific characteristics in the proteomes of two clinical M. tuberculosis Latin American-Mediterranean isolates. Using this approach, we identified 59 peptides containing single amino acid variants, which covered ∼9% of all coding nonsynonymous single nucleotide variants detected by whole-genome sequencing. Furthermore, we identified 29 distinct peptides that mapped to a hypothetical protein not present in the M. tuberculosis H37Rv reference proteome. Here, we provide evidence for the expression of this protein in the clinical M. tuberculosis SAWC3651 isolate. The strain-specific databases enabled confirmation of genomic differences (i.e., large genomic regions of difference and nonsynonymous single nucleotide variants) in these two clinical M. tuberculosis isolates and allowed strain differentiation at the proteome level. Our results contribute to the growing field of clinical microbial proteogenomics and can improve our understanding of phenotypic variation in clinical M. tuberculosis isolates.
Proteomic and comparative genomic analysis reveals adaptability of Brassica napus to phosphorus-deficient stress.

PubMed

Chen, Shuisen; Ding, Guangda; Wang, Zhenhua; Cai, Hongmei; Xu, Fangsen

2015-03-18

Given its low solubility and immobility in many soils of the world, phosphorus (P) may be the most widely studied macronutrient for plants. To gain insight into the adaptability of Brassica napus to P deficiency, proteome alterations in the roots and leaves of two contrasting B. napus genotypes, P-efficient 'Eyou Changjia' and P-inefficient 'B104-2', were investigated under long-term low P stress and short-term P-free starvation conditions. Proteomic and comparative genomic analyses were combined to interpret the relationship between differential abundance protein species (DAPs) responding to P deficiency and quantitative trait loci (QTLs) for P deficiency tolerance. P-efficient 'Eyou Changjia' had higher dry weight and P content, and showed high tolerance to low P stress compared with P-inefficient 'B104-2'. A total of 146 DAPs were successfully identified by MALDI TOF/TOF MS and categorized into several groups, including defense and stress response, carbohydrate and energy metabolism, signaling and regulation, amino acid and fatty acid metabolism, protein process, biogenesis and cellular component, and function unknown. Of the 146 DAPs, 94 were mapped to a linkage map constructed from a B. napus population derived from a cross between the two genotypes, and 72 DAPs were located in the confidence intervals of QTLs for P efficiency related traits. We conclude that the identification of these DAPs and the co-location of DAPs with QTLs in the B. napus genetic linkage map provide novel information for understanding the adaptability of B. napus to P deficiency and are helpful for isolating P-efficient genes in B. napus. Low P seriously limits the production and quality of B. napus. Although proteomics and genetic linkage maps have been widely used to study the adaptive strategies of B. napus in response to P deficiency, studies that combine proteomics with comparative genetic analysis to investigate the correlations between DAPs and QTLs are scarce. 
Thus, we investigated proteome alterations in the roots and leaves of two B. napus genotypes with different tolerances to P deficiency, in response to long-term low P stress and short-term P-free starvation, using 2-DE. Comparative genomic analysis was then conducted to map the DAPs to the B. napus linkage map by sequence alignment. The present study offers new insights into the mechanisms by which B. napus adapts to P deficiency and provides novel information for map-based cloning of genes in B. napus and for improving P efficiency in practice. Copyright © 2015 Elsevier B.V. All rights reserved.
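The co-location of DAPs with QTL confidence intervals amounts to an interval lookup on the linkage map. The sketch below shows one way this could be done, assuming each mapped DAP has a linkage group and a position in cM and each QTL has a confidence interval on a linkage group. All identifiers and coordinates are hypothetical.

```python
def colocated_daps(dap_positions, qtl_intervals):
    """Return DAPs whose map position falls inside a QTL confidence interval.

    dap_positions: {dap_id: (linkage_group, position_cM)}
    qtl_intervals: list of (qtl_id, linkage_group, start_cM, end_cM)
    """
    hits = {}
    for dap, (group, pos) in dap_positions.items():
        matches = [qtl for qtl, g, start, end in qtl_intervals
                   if g == group and start <= pos <= end]
        if matches:
            hits[dap] = matches
    return hits

# Hypothetical identifiers and map coordinates, for illustration only.
daps = {"DAP1": ("A03", 42.5), "DAP2": ("C07", 10.0), "DAP3": ("A03", 90.1)}
qtls = [("qPE-1", "A03", 40.0, 55.0), ("qRL-2", "C07", 20.0, 35.0)]
hits = colocated_daps(daps, qtls)
```

In this toy input only `DAP1` lies inside an interval; `DAP2` is on the right linkage group but outside the interval, and `DAP3` is beyond the end of `qPE-1`.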
Domain selection combined with improved cloning strategy for high throughput expression of higher eukaryotic proteins

PubMed Central

Chen, Yunjia; Qiu, Shihong; Luan, Chi-Hao; Luo, Ming

2007-01-01

Background Expression of higher eukaryotic genes as soluble, stable recombinant proteins is still a bottleneck in biochemical and structural studies of novel proteins. Correct identification of stable domains/fragments within the open reading frame (ORF), combined with proper cloning strategies, can greatly enhance the success rate when higher eukaryotic proteins are expressed as these domains/fragments. Furthermore, a high-throughput (HTP) cloning pipeline that incorporates bioinformatics domain/fragment selection methods will benefit studies in structural and functional genomics/proteomics. Results With bioinformatics tools, we developed a domain/domain boundary prediction (DDBP) method, which was trained on available experimental data. Combined with an improved cloning strategy, DDBP was applied to 57 proteins from C. elegans. Expression and purification results showed a 10-fold increase in success in obtaining purified proteins. Based on the DDBP method, the improved GATEWAY cloning strategy and a robotic platform, we constructed an HTP cloning pipeline, including PCR primer design, PCR, BP reaction, transformation, plating, colony picking and entry clone extraction, which has been successfully applied to 90 C. elegans genes, 88 Brucella genes, and 188 human genes. More than 97% of the targeted genes were obtained as entry clones. This pipeline has a modular design and can adopt different operations for a variety of cloning/expression strategies. Conclusion The DDBP method and improved cloning strategy were satisfactory. The cloning pipeline, combined with our recombinant protein HTP expression pipeline and the crystal screening robots, constitutes a complete platform for structural genomics/proteomics. This platform will increase the success rate of purification and crystallization dramatically and promote the further advancement of structural genomics/proteomics. PMID:17663785
The proteomic complexity and rise of the primordial ancestor of diversified life

PubMed Central

2011-01-01

Background The last universal common ancestor represents the primordial cellular organism from which diversified life was derived. This urancestor accumulated genetic information before the rise of organismal lineages and is considered to be either a simple 'progenote' organism with a rudimentary translational apparatus or a more complex 'cenancestor' with almost all essential biological processes. Recent comparative genomic studies support the latter model and propose that the urancestor was similar to modern organisms in terms of gene content. However, most of these studies were based on molecular sequences, which are fast evolving and of limited value for deep evolutionary explorations. Results Here we engage in a phylogenomic study of protein domain structure in the proteomes of 420 free-living fully sequenced organisms. Domains were defined at the highly conserved fold superfamily (FSF) level of structural classification and an iterative phylogenomic approach was used to reconstruct max_set and min_set FSF repertoires as upper and lower bounds of the urancestral proteome. While the functional makeup of the urancestral sets was complex, they represent only 5-11% of the 1,420 FSFs of extant proteomes and their makeup and reuse were at least 5 and 3 times smaller than those of proteomes of free-living organisms, respectively. Trees of proteomes reconstructed directly from FSFs or from molecular functions, which included the max_set and min_set as artificial taxa, showed that urancestors were always placed at their base and rooted the tree of life in Archaea. Finally, a molecular clock of FSFs suggests the min_set reflects urancestral genetic makeup more reliably and confirms diversified life emerged about 2.9 billion years ago during the start of planet oxygenation. 
Conclusions The minimum urancestral FSF set reveals the urancestor had advanced metabolic capabilities, was especially rich in nucleotide metabolism enzymes, had pathways for the biosynthesis of membrane sn1,2 glycerol ester and ether lipids, and had crucial elements of translation, including a primordial ribosome with protein synthesis capabilities. It lacked however fundamental functions, including transcription, processes for extracellular communication, and enzymes for deoxyribonucleotide synthesis. Proteomic history reveals the urancestor is closer to a simple progenote organism but harbors a rather complex set of modern molecular functions. PMID:21612591
Quantitative proteomic analysis in breast cancer.

PubMed

Tabchy, A; Hennessy, B T; Gonzalez-Angulo, A M; Bernstam, F M; Lu, Y; Mills, G B

2011-02-01

Much progress has recently been made in the genomic and transcriptional characterization of tumors. However, historically the characterization of cells at the protein level has suffered limitations in reproducibility, scalability and robustness. Recent technological advances have made it possible to accurately and reproducibly portray the global levels and active states of cellular proteins. Protein microarrays examine the native post-translational conformations of proteins including activated phosphorylated states, in a comprehensive high-throughput mode, and can map activated pathways and networks of proteins inside the cells. The reverse-phase protein microarray (RPPA) offers a unique opportunity to study signal transduction networks in small biological samples such as human biopsy material and can provide critical information for therapeutic decision-making and the monitoring of patients for targeted molecular medicine. By providing the key missing link to the story generated from genomic and gene expression characterization efforts, functional proteomics offer the promise of a comprehensive understanding of cancer. Several initial successes in breast cancer are showing that such information is clinically relevant. Copyright 2011 Prous Science, S.A.U. or its licensors. All rights reserved.
Plasmodium vivax Biology: Insights Provided by Genomics, Transcriptomics and Proteomics

PubMed Central

Bourgard, Catarina; Albrecht, Letusa; Kayano, Ana C. A. V.; Sunnerhagen, Per; Costa, Fabio T. M.

2018-01-01

During the last decade, the vast omics field has revolutionized biological research, especially the genomics, transcriptomics and proteomics branches, as technological tools become available to the field researcher and allow difficult question-driven studies to be addressed. Parasitology has greatly benefited from next generation sequencing (NGS) projects, which have resulted in a broadened comprehension of basic parasite molecular biology, ecology and epidemiology. Malariology is one example where application of this technology has greatly contributed to a better understanding of Plasmodium spp. biology and host-parasite interactions. Among the several parasite species that cause human malaria, the neglected Plasmodium vivax presents great research challenges, as in vitro culturing is not yet feasible and functional assays are heavily limited. Therefore, there are gaps in our P. vivax biology knowledge that affect decisions for control policies aiming to eradicate vivax malaria in the near future. In this review, we provide a snapshot of key discoveries already achieved in P. vivax sequencing projects, focusing on developments, hurdles, and limitations currently faced by the research community, as well as perspectives on future vivax malaria research. PMID:29473024

Analysis of the functional aspects and seminal plasma proteomic profile of sperm from smokers.

PubMed

Antoniassi, Mariana Pereira; Intasqui, Paula; Camargo, Mariana; Zylbersztejn, Daniel Suslik; Carvalho, Valdemir Melechco; Cardozo, Karina H M; Bertolla, Ricardo Pimenta

2016-11-01

To evaluate the effect of smoking on sperm functional quality and seminal plasma proteomic profile. Sperm functional tests were performed in 20 non-smoking men with normal semen quality, according to the World Health Organization (2010) and in 20 smoking patients. These included: evaluation of DNA fragmentation by alkaline Comet assay; analysis of mitochondrial activity using DAB staining; and acrosomal integrity evaluation by PNA binding. The remaining semen was centrifuged and seminal plasma was used for proteomic analysis (liquid chromatography-tandem mass spectrometry). The quantified proteins were used for Venn diagram construction in Cytoscape 3.2.1 software, using the PINA4MS plug-in. Then, differentially expressed proteins were used for functional enrichment analysis of Gene Ontology categories, Kyoto Encyclopedia of Genes and Genomes and Reactome, using Cytoscape software and the ClueGO 2.2.0 plug-in. Smokers had a higher percentage of sperm DNA damage (Comet classes III and IV; P < 0.01), partially and fully inactive mitochondria (DAB classes III and IV; P = 0.001 and P = 0.006, respectively) and non-intact acrosomes (P < 0.01) when compared with the control group. With respect to proteomic analysis, 422 proteins were identified and quantified, of which one protein was absent, 27 proteins were under-represented and six proteins were over-represented in smokers. Functional enrichment analysis showed the enrichment of antigen processing and presentation, positive regulation of prostaglandin secretion involved in immune response, protein kinase A signalling and arachidonic acid secretion, complement activation, regulation of the cytokine-mediated signalling pathway and regulation of acute inflammatory response in the study group (smokers). In conclusion, cigarette smoking was associated with an inflammatory state in the accessory glands and in the testis, as shown by enriched proteomic pathways. 
This state causes an alteration in sperm functional quality, which is characterized by decreased acrosome integrity and mitochondrial activity, as well as by increased nuclear DNA fragmentation. © 2016 The Authors BJU International © 2016 BJU International Published by John Wiley & Sons Ltd.
DOGMA: domain-based transcriptome and proteome quality assessment.

PubMed

Dohmen, Elias; Kremer, Lukas P M; Bornberg-Bauer, Erich; Kemena, Carsten

2016-09-01

Genome studies have become cheaper and easier than ever before, due to the decreased costs of high-throughput sequencing and the free availability of analysis software. However, the quality of genome or transcriptome assemblies can vary considerably. Therefore, quality assessment of assemblies and annotations is a crucial aspect of genome analysis pipelines. We developed DOGMA, a program for fast and easy quality assessment of transcriptome and proteome data based on conserved protein domains. DOGMA measures the completeness of a given transcriptome or proteome and provides information about domain content for further analysis. DOGMA performs this quality assessment within seconds. DOGMA is implemented in Python and published under the GNU GPL v.3 license. The source code is available at https://ebbgit.uni-muenster.de/domainWorld/DOGMA/. Contact: e.dohmen@wwu.de or c.kemena@wwu.de. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
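The general idea of domain-based completeness can be sketched as the fraction of an expected set of conserved domains that is found among the domain annotations of a proteome. This is a deliberately simplified illustration and not DOGMA's actual algorithm or interface. The function name, the domain identifiers and the input layout are all assumptions.

```python
def domain_completeness(expected_domains, annotated_proteins):
    """Return (completeness fraction, sorted list of missing domains).

    expected_domains: iterable of domain identifiers assumed to be conserved
    annotated_proteins: {protein_id: [domain identifiers found in that protein]}
    """
    expected = set(expected_domains)
    if not expected:
        raise ValueError("expected_domains must not be empty")
    found = set()
    for domains in annotated_proteins.values():
        found.update(domains)
    present = expected & found
    return len(present) / len(expected), sorted(expected - found)

# Hypothetical domain identifiers and annotations.
expected = ["domA", "domB", "domC", "domD"]
proteome = {"p1": ["domA", "domX"], "p2": ["domC"], "p3": ["domA", "domD"]}
score, missing = domain_completeness(expected, proteome)
```

Here three of the four expected domains are present, so the score is 0.75 and `domB` is reported as missing. A low score on a real data set would point to an incomplete assembly or annotation.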
The relationships between the isoelectric point and: length of proteins, taxonomy and ecology of organisms

PubMed Central

Kiraga, Joanna; Mackiewicz, Pawel; Mackiewicz, Dorota; Kowalczuk, Maria; Biecek, Przemysław; Polak, Natalia; Smolarczyk, Kamila; Dudek, Miroslaw R; Cebrat, Stanislaw

2007-01-01

Background The distribution of the isoelectric point (pI) of proteins in a proteome is universal for all organisms. It is bimodal, dividing the proteome into two sets of acidic and basic proteins. Different species, however, have different abundances of acidic and basic proteins, which may be correlated with taxonomy, subcellular localization, ecological niche of organisms and proteome size. Results We have analysed 1784 proteomes encoded by chromosomes of Archaea, Bacteria, Eukaryota, and also mitochondria, plastids, prokaryotic plasmids, phages and viruses. In more than 95% of proteomes we found a significant correlation between protein length and pI: positive for acidic proteins and negative for basic ones. Plastids, viruses and plasmids encode more basic proteomes, while chromosomes of Archaea, Bacteria, Eukaryota, mitochondria and phages encode more acidic ones. Mitochondrial proteomes of Viridiplantae, Protista and Fungi are more basic than those of Metazoa. This results from the presence of basic proteins in the former proteomes and their absence from the latter ones and is related to the reduction of metazoan genomes. A significant correlation was found between the pI bias of proteomes encoded by prokaryotic chromosomes and proteomes encoded by plasmids, but there is no correlation between eukaryotic nuclear-coded proteomes and proteomes encoded by organelles. Detailed analyses of prokaryotic proteomes showed significant relationships between pI distribution and habitat, relation to the host cell and salinity of the environment, but no significant correlation with oxygen and temperature requirements. Salinity is positively correlated with the acidity of proteomes. Host-associated organisms and especially intracellular species have more basic proteomes than free-living ones. 
The higher rate of mutation accumulation in intracellular parasites and endosymbionts is responsible for the basicity of their tiny proteomes, which explains the observed positive correlation between the decrease in genome size and the increase in basicity of proteomes. The results indicate that even conserved proteins subjected to strong selective constraints follow the global trend in the pI distribution. Conclusion The distribution of pI of proteins in proteomes shows clear relationships with length of proteins, subcellular localization, taxonomy and ecology of organisms. The distribution is also strongly affected by mutational pressure, especially in intracellular organisms. PMID:17565672
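For readers unfamiliar with how a theoretical pI is obtained from a sequence, the following sketch estimates it as the pH at which the net charge is zero. The net charge comes from the Henderson-Hasselbalch equation and the zero is located by bisection. The pKa table is one commonly used set and is an assumption of this sketch: published tables differ, and the abstract does not state which values the authors used.

```python
# Assumed pKa values (illustrative; other published tables differ slightly).
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERM, PKA_C_TERM = 8.6, 3.6

def net_charge(seq, ph):
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))   # protonated N-terminus
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))  # deprotonated C-terminus
    for aa in seq:
        if aa in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return charge

def isoelectric_point(seq, tol=1e-4):
    """Bisection on pH in [0, 14]; net charge decreases monotonically with pH."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Toy peptides: one rich in acidic residues, one rich in basic residues.
acidic_pi = isoelectric_point("DDEEDE")
basic_pi = isoelectric_point("KKRRKR")
```

Applying `isoelectric_point` to every protein of a proteome and plotting a histogram of the results is what would reveal the bimodal acidic/basic distribution discussed in the abstract.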
Genomic and Proteomic Profiling Reveals Reduced Mitochondrial Function and Disruption of the Neuromuscular Junction Driving Rat Sarcopenia

PubMed Central

Ibebunjo, Chikwendu; Chick, Joel M.; Kendall, Tracee; Eash, John K.; Li, Christine; Zhang, Yunyu; Vickers, Chad; Wu, Zhidan; Clarke, Brian A.; Shi, Jun; Cruz, Joseph; Fournier, Brigitte; Brachat, Sophie; Gutzwiller, Sabine; Ma, QiCheng; Markovits, Judit; Broome, Michelle; Steinkrauss, Michelle; Skuba, Elizabeth; Galarneau, Jean-Rene; Gygi, Steven P.

2013-01-01

Molecular mechanisms underlying sarcopenia, the age-related loss of skeletal muscle mass and function, remain unclear. To identify molecular changes that correlated best with sarcopenia and might contribute to its pathogenesis, we determined global gene expression profiles in muscles of rats aged 6, 12, 18, 21, 24, and 27 months. These rats exhibit sarcopenia beginning at 21 months. Correlation of gene expression with changes in muscle mass or age, together with functional annotation analysis, identified gene signatures of sarcopenia distinct from gene signatures of aging. Specifically, mitochondrial energy metabolism (e.g., tricarboxylic acid cycle and oxidative phosphorylation) pathway genes were the most downregulated and most significantly correlated with sarcopenia. Also perturbed were genes/pathways associated with neuromuscular junction patency (providing molecular evidence of sarcopenia-related functional denervation and neuromuscular junction remodeling), protein degradation, and inflammation. Proteomic analysis of samples at 6, 18, and 27 months confirmed the depletion of mitochondrial energy metabolism proteins and neuromuscular junction proteins. Together, these findings suggest that therapeutic approaches that simultaneously stimulate mitochondrogenesis and reduce muscle proteolysis and inflammation have potential for treating sarcopenia. PMID:23109432
Recent insights into plant-virus interactions through proteomic analysis.

PubMed

Di Carli, Mariasole; Benvenuto, Eugenio; Donini, Marcello

2012-10-05

Plant viruses represent a major threat to a wide range of host species, causing severe losses in agriculture. A full comprehension of the mechanisms underlying virus-host plant interactions is crucial to devise novel plant resistance strategies. Until now, functional genomics studies of plant-virus interactions have been limited mainly to transcriptomic analysis. Only recently have proteomic approaches started to provide important contributions to this area of research. Classical two-dimensional electrophoresis (2-DE) coupled to mass spectrometry (MS) is still the most widely used platform in plant proteome analysis, although in recent years quantitative "second generation" proteomic techniques (such as differential in gel electrophoresis, DIGE, and gel-free protein separation methods) have been emerging as more powerful analytical approaches. Apparently simple, plant-virus interactions reveal a really complex pathophysiological context, in which resistance, defense and susceptibility, and direct virus-induced reactions interplay to trigger expression responses of hundreds of genes. Given that, this review is specifically focused on comparative proteome-based studies on the pathogenesis of several viral genera, including some of the most important and widespread plant viruses of the genera Tobamovirus, Sobemovirus, Cucumovirus and Potyvirus. In all, this overview reveals a widespread repression of proteins associated with the photosynthetic apparatus, while energy metabolism/protein synthesis and turnover are typically up-regulated, indicating a major redirection of cell metabolism. Other common features include the modulation of metabolic pathways concerning sugars, the cell wall, and reactive oxygen species, as well as pathogenesis-related (PR) proteins. The fine-tuning between plant development and antiviral defense mechanisms determines new patterns of regulation of common metabolic pathways. 
By offering a 360-degree view of protein modulation, all proteomic tools reveal the extraordinary intricacy of mechanisms with which a simple viral genome perturbs the plant cell molecular networks. This "omic" approach, while providing a global perspective and useful information to the understanding of the plant host-virus interactome, may possibly reveal protein targets/markers useful in the design of future diagnosis and/or plant protection strategies.
Comparative proteome analysis of Milnesium tardigradum in early embryonic state versus adults in active and anhydrobiotic state

PubMed Central

Schokraie, Elham; Warnken, Uwe; Hotz-Wagenblatt, Agnes; Grohme, Markus A.; Hengherr, Steffen; Förster, Frank; Schill, Ralph O.; Frohme, Marcus; Dandekar, Thomas; Schnölzer, Martina

2012-01-01

Tardigrades have fascinated researchers for more than 300 years because of their extraordinary capability to undergo cryptobiosis and survive extreme environmental conditions. However, the survival mechanisms of tardigrades are still poorly understood mainly due to the absence of detailed knowledge about the proteome and genome of these organisms. Our study was intended to provide a basis for the functional characterization of expressed proteins in different states of tardigrades. High-throughput, high-accuracy proteomics in combination with a newly developed tardigrade specific protein database resulted in the identification of more than 3000 proteins in three different states: early embryonic state and adult animals in active and anhydrobiotic state. This comprehensive proteome resource includes protein families such as chaperones, antioxidants, ribosomal proteins, cytoskeletal proteins, transporters, protein channels, nutrient reservoirs, and developmental proteins. A comparative analysis of protein families in the different states was performed by calculating the exponentially modified protein abundance index which classifies proteins in major and minor components. This is the first step to analyzing the proteins involved in early embryonic development, and furthermore proteins which might play an important role in the transition into the anhydrobiotic state. PMID:23029181
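The exponentially modified protein abundance index (emPAI) mentioned in this abstract is commonly defined as 10 raised to the ratio of observed to observable peptides, minus one. The sketch below computes it and converts the values to molar percentages. The protein names and peptide counts are hypothetical, and the cutoff the authors used to separate major from minor components is not given in the abstract, so no cutoff is applied here.

```python
def empai(observed_peptides, observable_peptides):
    """emPAI = 10 ** (N_observed / N_observable) - 1."""
    if observable_peptides <= 0:
        raise ValueError("observable_peptides must be positive")
    return 10 ** (observed_peptides / observable_peptides) - 1

def molar_percent(empai_values):
    """Express each protein's emPAI as a percentage of the summed emPAI."""
    total = sum(empai_values.values())
    return {name: 100 * value / total for name, value in empai_values.items()}

# Hypothetical (observed, observable) peptide counts per protein.
counts = {"chaperone_X": (10, 10), "transporter_Y": (2, 20)}
empai_values = {name: empai(obs, total) for name, (obs, total) in counts.items()}
shares = molar_percent(empai_values)
```

Because the index is exponential in peptide coverage, a protein observed with all of its observable peptides (emPAI of 9) dominates one observed with a tenth of them.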
Genome-, Transcriptome- and Proteome-Wide Analyses of the Gliadin Gene Families in Triticum urartu

PubMed Central

Wang, Dongzhi; Yang, Wenlong; Sun, Jiazhu; Zhang, Aimin; Zhan, Kehui

2015-01-01

Gliadins are the major components of storage proteins in wheat grains, and they play an essential role in the dough extensibility and nutritional quality of flour. Because of the large number of the gliadin family members, the high level of sequence identity, and the lack of abundant genomic data for Triticum species, identifying the full complement of gliadin family genes in hexaploid wheat remains challenging. Triticum urartu is a wild diploid wheat species and considered the A-genome donor of polyploid wheat species. The accession PI428198 (G1812) was chosen to determine the complete composition of the gliadin gene families in the wheat A-genome using the available draft genome. Using a PCR-based cloning strategy for genomic DNA and mRNA as well as a bioinformatics analysis of genomic sequence data, 28 gliadin genes were characterized. Of these genes, 23 were α-gliadin genes, three were γ-gliadin genes and two were ω-gliadin genes. An RNA sequencing (RNA-Seq) survey of the dynamic expression patterns of gliadin genes revealed that their synthesis in immature grains began prior to 10 days post-anthesis (DPA), peaked at 15 DPA and gradually decreased at 20 DPA. The accumulation of proteins encoded by 16 of the expressed gliadin genes was further verified and quantified using proteomic methods. The phylogenetic analysis demonstrated that the homologs of these α-gliadin genes were present in tetraploid and hexaploid wheat, which was consistent with T. urartu being the A-genome progenitor species. This study presents a systematic investigation of the gliadin gene families in T. urartu that spans the genome, transcriptome and proteome, and it provides new information to better understand the molecular structure, expression profiles and evolution of the gliadin genes in T. urartu and common wheat. PMID:26132381
Genome-, Transcriptome- and Proteome-Wide Analyses of the Gliadin Gene Families in Triticum urartu.

PubMed

Zhang, Yanlin; Luo, Guangbin; Liu, Dongcheng; Wang, Dongzhi; Yang, Wenlong; Sun, Jiazhu; Zhang, Aimin; Zhan, Kehui

2015-01-01

Gliadins are the major components of storage proteins in wheat grains, and they play an essential role in the dough extensibility and nutritional quality of flour. Because of the large number of the gliadin family members, the high level of sequence identity, and the lack of abundant genomic data for Triticum species, identifying the full complement of gliadin family genes in hexaploid wheat remains challenging. Triticum urartu is a wild diploid wheat species and considered the A-genome donor of polyploid wheat species. The accession PI428198 (G1812) was chosen to determine the complete composition of the gliadin gene families in the wheat A-genome using the available draft genome. Using a PCR-based cloning strategy for genomic DNA and mRNA as well as a bioinformatics analysis of genomic sequence data, 28 gliadin genes were characterized. Of these genes, 23 were α-gliadin genes, three were γ-gliadin genes and two were ω-gliadin genes. An RNA sequencing (RNA-Seq) survey of the dynamic expression patterns of gliadin genes revealed that their synthesis in immature grains began prior to 10 days post-anthesis (DPA), peaked at 15 DPA and gradually decreased at 20 DPA. The accumulation of proteins encoded by 16 of the expressed gliadin genes was further verified and quantified using proteomic methods. The phylogenetic analysis demonstrated that the homologs of these α-gliadin genes were present in tetraploid and hexaploid wheat, which was consistent with T. urartu being the A-genome progenitor species. This study presents a systematic investigation of the gliadin gene families in T. urartu that spans the genome, transcriptome and proteome, and it provides new information to better understand the molecular structure, expression profiles and evolution of the gliadin genes in T. urartu and common wheat.
Identification of p90 Ribosomal S6 Kinase 2 as a Novel Host Protein in HBx Augmenting HBV Replication by iTRAQ-Based Quantitative Comparative Proteomics.

PubMed

Yan, Li-Bo; Yu, You-Jia; Zhang, Qing-Bo; Tang, Xiao-Qiong; Bai, Lang; Huang, FeiJun; Tang, Hong

2018-05-01

The aim of this study was to screen for novel host proteins that play a role in HBx augmenting Hepatitis B virus (HBV) replication. Three HepG2 cell lines stably harboring different functional domains of HBx (HBx, HBx-Cm6, and HBx-Cm16) were cultured. iTRAQ technology integrated with LC-MS/MS analysis was applied to identify the proteome differences among these three cell lines. In brief, a total of 70 different proteins were identified among HepG2-HBx, HepG2-HBx-Cm6, and HepG2-HBx-Cm16 by double repetition. Several differentially expressed proteins, including p90 ribosomal S6 kinase 2 (RSK2), were further validated. RSK2 was expressed at higher levels in HepG2-HBx and HepG2-HBx-Cm6 compared with HepG2-HBx-Cm16. Furthermore, levels of HBV replication intermediates were decreased after silencing RSK2 in HepG2.2.15. An HBx-minus HBV mutant genome led to decreased levels of HBV replication intermediates and these decreases were restored to levels similar to wild-type HBV by transient ectopic expression of HBx. After silencing RSK2 expression, the levels of HBV replication intermediates synthesized from the HBx-minus HBV mutant genome were not restored to levels that were observed with wild-type HBV by transient HBx expression. Based on iTRAQ quantitative comparative proteomics, RSK2 was identified as a novel host protein that plays a role in HBx augmenting HBV replication. © 2018 The Authors. Proteomics - Clinical Application Published by WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Comparative proteomic analysis of lung tissue from guinea pigs with Leptospiral Pulmonary Haemorrhage Syndrome (LPHS) reveals a decrease in abundance of host proteins involved in cytoskeletal and cellular organization

USDA-ARS's Scientific Manuscript database

The recent completion of the complete genome sequence of the guinea pig (Cavia porcellus) provides innovative opportunities to apply proteomic technologies to an important animal model of disease. In this study, a 2-D guinea pig proteome lung map was used to investigate the pathogenic mechanisms of ...
HelmCoP: An Online Resource for Helminth Functional Genomics and Drug and Vaccine Targets Prioritization

PubMed Central

Taylor, Christina M.; Mitreva, Makedonka

2011-01-01

A vast majority of the burden from neglected tropical diseases results from helminth infections (nematodes and platyhelminthes). Parasitic helminths infect over 2 billion people, exerting a high collective burden that rivals high-mortality conditions such as AIDS or malaria, and cause devastation to crops and livestock. The challenges in improving control of parasitic helminth infections are multifold and no single category of approaches will meet them all. New information from helminth genomics, functional genomics and proteomics, coupled with innovative bioinformatic approaches, provides fundamental molecular information about these parasites, accelerating both basic research and the development of effective diagnostics, vaccines and new drugs. To facilitate such studies we have developed an online resource, HelmCoP (Helminth Control and Prevention), built by integrating functional, structural and comparative genomic data from plant, animal and human helminths, to enable researchers to develop strategies for drug, vaccine and pesticide prioritization, while also providing a useful comparative genomics platform. HelmCoP encompasses genomic data from several hosts, including model organisms, along with a comprehensive suite of structural and functional annotations, to assist in comparative analyses and to study host-parasite interactions. The HelmCoP interface, with a sophisticated query engine as a backbone, allows users to search for multi-factorial combinations of properties and serves readily accessible information that will assist in the identification of various genes of interest. HelmCoP is publicly available at: http://www.nematode.net/helmcop.html. PMID:21760913
Listeriomics: an Interactive Web Platform for Systems Biology of Listeria

PubMed Central

Koutero, Mikael; Tchitchek, Nicolas; Cerutti, Franck; Lechat, Pierre; Maillet, Nicolas; Hoede, Claire; Chiapello, Hélène; Gaspin, Christine

2017-01-01

ABSTRACT As for many model organisms, the amount of Listeria omics data produced has recently increased exponentially. There are now >80 published complete Listeria genomes, around 350 different transcriptomic data sets, and 25 proteomic data sets available. The analysis of these data sets through a systems biology approach and the generation of tools for biologists to browse these various data are a challenge for bioinformaticians. We have developed a web-based platform, named Listeriomics, that integrates different tools for omics data analyses, i.e., (i) an interactive genome viewer to display gene expression arrays, tiling arrays, and sequencing data sets along with proteomics and genomics data sets; (ii) an expression and protein atlas that connects every gene, small RNA, antisense RNA, or protein with the most relevant omics data; (iii) a specific tool for exploring protein conservation through the Listeria phylogenomic tree; and (iv) a coexpression network tool for the discovery of potential new regulations. Our platform integrates all the complete Listeria species genomes, transcriptomes, and proteomes published to date. This website allows navigation among all these data sets with enriched metadata in a user-friendly format and can be used as a central database for systems biology analysis. IMPORTANCE In the last decades, Listeria has become a key model organism for the study of host-pathogen interactions, noncoding RNA regulation, and bacterial adaptation to stress. To study these mechanisms, several genomics, transcriptomics, and proteomics data sets have been produced. We have developed Listeriomics, an interactive web platform to browse and correlate these heterogeneous sources of information. Our website will allow listeriologists and microbiologists to decipher key regulatory mechanisms by using a systems biology approach. PMID:28317029
The proteomic landscape of triple-negative breast cancer.

PubMed

Lawrence, Robert T; Perez, Elizabeth M; Hernández, Daniel; Miller, Chris P; Haas, Kelsey M; Irie, Hanna Y; Lee, Su-In; Blau, C Anthony; Villén, Judit

2015-04-28

Triple-negative breast cancer is a heterogeneous disease characterized by poor clinical outcomes and a shortage of targeted treatment options. To discover molecular features of triple-negative breast cancer, we performed quantitative proteomics analysis of twenty human-derived breast cell lines and four primary breast tumors to a depth of more than 12,000 distinct proteins. We used this data to identify breast cancer subtypes at the protein level and demonstrate the precise quantification of biomarkers, signaling proteins, and biological pathways by mass spectrometry. We integrated proteomics data with exome sequence resources to identify genomic aberrations that affect protein expression. We performed a high-throughput drug screen to identify protein markers of drug sensitivity and understand the mechanisms of drug resistance. The genome and proteome provide complementary information that, when combined, yield a powerful engine for therapeutic discovery. This resource is available to the cancer research community to catalyze further analysis and investigation. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Brucella proteomes--a review.

PubMed

DelVecchio, Vito G; Wagner, Mary Ann; Eschenbrenner, Michel; Horn, Troy A; Kraycer, Jo Ann; Estock, Frank; Elzer, Phil; Mujer, Cesar V

2002-12-20

The proteomes of selected Brucella spp. have been extensively analyzed by utilizing current proteomic technology involving 2-DE and MALDI-MS. In Brucella melitensis, more than 500 proteins were identified. The rapid and large-scale identification of proteins in this organism was accomplished by using the annotated B. melitensis genome, which is now available in GenBank. Coupled with new and powerful tools for data analysis, differentially expressed proteins were identified and categorized into several classes. A global overview of protein expression patterns emerged, thereby facilitating the simultaneous analysis of different metabolic pathways in B. melitensis. Such a global characterization would not have been possible by using time-consuming, traditional biochemical approaches. The era of post-genomic technology offers new and exciting opportunities to understand the complete biology of different Brucella species.
Comparative Proteomics Reveals a Significant Bias Toward Alternative Protein Isoforms with Conserved Structure and Function

PubMed Central

Ezkurdia, Iakes; del Pozo, Angela; Frankish, Adam; Rodriguez, Jose Manuel; Harrow, Jennifer; Ashman, Keith; Valencia, Alfonso; Tress, Michael L.

2012-01-01

Advances in high-throughput mass spectrometry are making proteomics an increasingly important tool in genome annotation projects. Peptides detected in mass spectrometry experiments can be used to validate gene models and verify the translation of putative coding sequences (CDSs). Here, we have identified peptides that cover 35% of the genes annotated by the GENCODE consortium for the human genome as part of a comprehensive analysis of experimental spectra from two large publicly available mass spectrometry databases. We detected the translation to protein of “novel” and “putative” protein-coding transcripts as well as transcripts annotated as pseudogenes and nonsense-mediated decay targets. We provide a detailed overview of the population of alternatively spliced protein isoforms that are detectable by peptide identification methods. We found that 150 genes expressed multiple alternative protein isoforms. This constitutes the largest set of reliably confirmed alternatively spliced proteins yet discovered. Three groups of genes were highly overrepresented. We detected alternative isoforms for 10 of the 25 possible heterogeneous nuclear ribonucleoproteins, proteins with a key role in the splicing process. Alternative isoforms generated from interchangeable homologous exons and from short indels were also significantly enriched, both in human experiments and in parallel analyses of mouse and Drosophila proteomics experiments. Our results show that a surprisingly high proportion (almost 25%) of the detected alternative isoforms are only subtly different from their constitutive counterparts. Many of the alternative splicing events that give rise to these alternative isoforms are conserved in mouse. It was striking that very few of these conserved splicing events broke Pfam functional domains or would damage globular protein structures. 
This evidence of a strong bias toward subtle differences in CDS and likely conserved cellular function and structure is remarkable and strongly suggests that the translation of alternative transcripts may be subject to selective constraints. PMID:22446687
Compositional complexity of the mitochondrial proteome of a unicellular eukaryote (Acanthamoeba castellanii, supergroup Amoebozoa) rivals that of animals, fungi, and plants.

PubMed

Gawryluk, Ryan M R; Chisholm, Kenneth A; Pinto, Devanand M; Gray, Michael W

2014-09-23

We present a combined proteomic and bioinformatic investigation of mitochondrial proteins from the amoeboid protist Acanthamoeba castellanii, the first such comprehensive investigation in a free-living member of the supergroup Amoebozoa. This protist was chosen both for its phylogenetic position (as a sister to animals and fungi) and its ecological ubiquity and physiological flexibility. We report 1033 A. castellanii mitochondrial protein sequences, 709 supported by mass spectrometry data (676 nucleus-encoded and 33 mitochondrion-encoded), including two previously unannotated mtDNA-encoded proteins, which we identify as highly divergent mitochondrial ribosomal proteins. Other notable findings include duplicate proteins for all of the enzymes of the tricarboxylic acid (TCA) cycle, which, along with the identification of a mitochondrial malate synthase-isocitrate lyase fusion protein, suggests the interesting possibility that the glyoxylate cycle operates in A. castellanii mitochondria. Additionally, the A. castellanii genome encodes an unusually high number (at least 29) of mitochondrion-targeted pentatricopeptide repeat (PPR) proteins, organellar RNA metabolism factors in other organisms. We discuss several key mitochondrial pathways, including DNA replication, transcription and translation, protein degradation, protein import and Fe-S cluster biosynthesis, highlighting similarities and differences in these pathways in other eukaryotes. In compositional and functional complexity, the mitochondrial proteome of A. castellanii rivals that of multicellular eukaryotes. Comprehensive proteomic surveys of mitochondria have been undertaken in a limited number of predominantly multicellular eukaryotes. This phylogenetically narrow perspective constrains and biases our insights into mitochondrial function and evolution, as it neglects protists, which account for most of the evolutionary and functional diversity within eukaryotes. 
We report here the first comprehensive investigation of the mitochondrial proteome in a member (A. castellanii) of the eukaryotic supergroup Amoebozoa. Through a combination of tandem mass spectrometry (MS/MS) and in silico data mining, we have retrieved 1033 candidate mitochondrial protein sequences, 709 having MS support. These data were used to reconstruct the metabolic pathways and protein complexes of A. castellanii mitochondria, and were integrated with data from other characterized mitochondrial proteomes to augment our understanding of mitochondrial proteome evolution. Our results demonstrate the power of combining direct proteomic and bioinformatic approaches in the discovery of novel mitochondrial proteins, both nucleus-encoded and mitochondrion-encoded, and highlight the compositional complexity of the A. castellanii mitochondrial proteome, which rivals that of animals, fungi and plants. Copyright © 2014 Elsevier B.V. All rights reserved.
Jatropha curcas, a biofuel crop: Functional genomics for understanding metabolic pathways and genetic improvement

PubMed Central

Maghuly, Fatemeh; Laimer, Margit

2013-01-01

Jatropha curcas is currently attracting much attention as an oilseed crop for biofuel, as Jatropha can grow under climate and soil conditions that are unsuitable for food production. However, little is known about Jatropha, and there are a number of challenges to be overcome. In fact, Jatropha has not really been domesticated; most of the Jatropha accessions are toxic, which renders the seedcake unsuitable for use as animal feed. The seeds of Jatropha contain high levels of polyunsaturated fatty acids, which negatively impact the biofuel quality. Fruiting of Jatropha is fairly continuous, thus increasing costs of harvesting. Therefore, before starting any improvement program using conventional or molecular breeding techniques, understanding gene function and the genome scale of Jatropha are prerequisites. This review presents currently available and relevant information on the latest technologies (genomics, transcriptomics, proteomics and metabolomics) to decipher important metabolic pathways within Jatropha, such as oil and toxin synthesis. Further, it discusses future directions for biotechnological approaches in Jatropha breeding and improvement. PMID:24092674
Proteomic analysis of propiconazole responses in mouse liver: comparison of genomic and proteomic profiles

EPA Science Inventory

We have performed for the first time a comprehensive profiling of changes in protein expression of soluble proteins in livers from mice treated with the mouse liver tumorigen, propiconazole, to uncover the pathways and networks altered by this fungicide. Utilizing two-dimensional...
Polycyclic aromatic hydrocarbon metabolic network in Mycobacterium vanbaalenii PYR-1.

PubMed

Kweon, Ohgew; Kim, Seong-Jae; Holland, Ricky D; Chen, Hongyan; Kim, Dae-Wi; Gao, Yuan; Yu, Li-Rong; Baek, Songjoon; Baek, Dong-Heon; Ahn, Hongsik; Cerniglia, Carl E

2011-09-01

This study investigated a metabolic network (MN) from Mycobacterium vanbaalenii PYR-1 for polycyclic aromatic hydrocarbons (PAHs) from the perspective of structure, behavior, and evolution, in which multilayer omics data are integrated. Initially, we utilized a high-throughput proteomic analysis to assess the protein expression response of M. vanbaalenii PYR-1 to seven different aromatic compounds. A total of 3,431 proteins (57.38% of the genome-predicted proteins) were identified, which included 160 proteins that seemed to be involved in the degradation of aromatic hydrocarbons. Based on the proteomic data and the previous metabolic, biochemical, physiological, and genomic information, we reconstructed an experiment-based system-level PAH-MN. The structure of PAH-MN, with 183 metabolic compounds and 224 chemical reactions, has a typical scale-free nature. The behavior and evolution of the PAH-MN reveals a hierarchical modularity with funnel effects in structure/function and intimate association with evolutionary modules of the functional modules, which are the ring cleavage process (RCP), side chain process (SCP), and central aromatic process (CAP). The 189 commonly upregulated proteins in all aromatic hydrocarbon treatments provide insights into the global adaptation to facilitate the PAH metabolism. Taken together, the findings of our study provide the hierarchical viewpoint from genes/proteins/metabolites to the network via functional modules of the PAH-MN equipped with the engineering-driven approaches of modularization and rationalization, which may expand our understanding of the metabolic potential of M. vanbaalenii PYR-1 for bioremediation applications.

Proteomic Analysis of Rhizoctonia solani Identifies Infection-specific, Redox Associated Proteins and Insight into Adaptation to Different Plant Hosts

PubMed Central

Anderson, Jonathan P.; Hane, James K.; Stoll, Thomas; Pain, Nicholas; Hastie, Marcus L.; Kaur, Parwinder; Hoogland, Christine; Gorman, Jeffrey J.; Singh, Karam B.

2016-01-01

Rhizoctonia solani is an important root-infecting pathogen of a range of food staples worldwide, including wheat, rice, maize, soybean, potato and others. Conventional resistance breeding strategies are hindered by the absence of tractable genetic resistance in any crop host. Understanding the biology and pathogenicity mechanisms of this fungus is important for addressing these disease issues; however, little is known about how R. solani causes disease. This study capitalizes on recent genomic studies by applying mass spectrometry-based proteomics to identify soluble, membrane-bound and culture filtrate proteins produced under wheat infection and vegetative growth conditions. Many of the proteins found in the culture filtrate had predicted functions relating to modification of the plant cell wall, a major activity required for pathogenesis on the plant host, including a number found only under infection conditions. Other infection-related proteins included a high proportion of proteins with redox-associated functions and many novel proteins without functional classification. The majority of infection-only proteins tested were confirmed to show transcript up-regulation during infection, including a thaumatin which increased susceptibility to R. solani when expressed in Nicotiana benthamiana. In addition, analysis of expression during infection of different plant hosts highlighted how the infection strategy of this broad host range pathogen can be adapted to the particular host being encountered. Data are available via ProteomeXchange with identifier PXD002806. PMID:26811357
Proteomic Insights into Sulfur Metabolism in the Hydrogen-Producing Hyperthermophilic Archaeon Thermococcus onnurineus NA1

PubMed Central

Moon, Yoon-Jung; Kwon, Joseph; Yun, Sung-Ho; Lim, Hye Li; Kim, Jonghyun; Kim, Soo Jung; Kang, Sung Gyun; Lee, Jung-Hyun; Kim, Seung Il; Chung, Young-Ho

2015-01-01

The hyperthermophilic archaeon Thermococcus onnurineus NA1 has been shown to produce H2 when using CO, formate, or starch as a growth substrate. This strain can also utilize elemental sulfur as a terminal electron acceptor for heterotrophic growth. To gain insight into sulfur metabolism, the proteome of T. onnurineus NA1 cells grown under sulfur culture conditions was quantified and compared with that of cells grown under H2-evolving substrate culture conditions. Using label-free nano-UPLC-MSE-based comparative proteomic analysis, approximately 38.4% of the total identified proteome (589 proteins) was found to be significantly up-regulated (≥1.5-fold) under sulfur culture conditions. Many of these proteins were functionally associated with carbon fixation, Fe–S cluster biogenesis, ATP synthesis, sulfur reduction, protein glycosylation, protein translocation, and formate oxidation. Based on the abundances of the identified proteins in this and other genomic studies, the pathways associated with reductive sulfur metabolism, H2-metabolism, and oxidative stress defense were proposed. The results also revealed markedly lower expression levels of enzymes involved in the sulfur assimilation pathway, as well as cysteine desulfurase, under sulfur culture conditions. The present results provide the first global atlas of proteome changes triggered by sulfur, and may facilitate an understanding of how hyperthermophilic archaea adapt to sulfur-rich, extreme environments. PMID:25915030
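The ≥1.5-fold criterion mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each protein has one abundance value per condition; the function name, protein labels, and abundance values are hypothetical, and the statistical significance testing that the study also applied is not reproduced here.

```python
import math

FOLD_THRESHOLD = 1.5  # cutoff quoted in the abstract

def upregulated(abundances, threshold=FOLD_THRESHOLD):
    """Return log2 fold changes for proteins at or above the fold threshold.

    abundances maps a protein name to (treated, reference) abundance values.
    """
    hits = {}
    for protein, (treated, reference) in abundances.items():
        # Skip proteins with no reference signal to avoid division by zero.
        if reference > 0 and treated / reference >= threshold:
            hits[protein] = math.log2(treated / reference)
    return hits

# Hypothetical abundance values, for illustration only.
example = {
    "protA": (300.0, 100.0),  # 3.0-fold, kept
    "protB": (120.0, 100.0),  # 1.2-fold, dropped
    "protC": (150.0, 100.0),  # exactly 1.5-fold, kept
}
hits = upregulated(example)
```

In practice a fold-change cutoff like this is normally combined with a significance test across replicates, which is what "significantly up-regulated" implies.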
Comparative and quantitative proteomics reveal the adaptive strategies of oyster larvae to ocean acidification.

PubMed

Dineshram, R; Quan, Q; Sharma, Rakesh; Chandramouli, Kondethimmanahalli; Yalamanchili, Hari Krishna; Chu, Ivan; Thiyagarajan, Vengatesen

2015-12-01

Decreasing pH due to anthropogenic CO2 inputs, called ocean acidification (OA), can make coastal environments unfavorable for oysters. This is a serious socioeconomic issue for China, which supplies >70% of the world's edible oysters. Here, we present an iTRAQ-based protein profiling approach for the detection and quantification of proteome changes under OA in the early life stage of a commercially important oyster, Crassostrea hongkongensis. Availability of the complete genome sequence for the Pacific oyster (Crassostrea gigas) enabled us to confidently quantify over 1500 proteins in larval oysters. Over 7% of the proteome was altered in response to OA at pH 7.6 (NBS scale). Analysis of differentially expressed proteins and their associated functional pathways showed an upregulation of proteins involved in calcification, metabolic processes, and oxidative stress, each of which may be important in physiological adaptation of this species to OA. The downregulation of cytoskeletal and signal transduction proteins, on the other hand, might have impaired cellular dynamics and organelle development under OA. However, there were no significant detrimental effects in developmental processes such as metamorphic success. Implications of the differentially expressed proteins and metabolic pathways in the development of OA resistance in oyster larvae are discussed. The MS proteomics data have been deposited to the ProteomeXchange with identifier PXD002138 (http://proteomecentral.proteomexchange.org/dataset/PXD002138). © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Design and Initial Characterization of the SC-200 Proteomics Standard Mixture

PubMed Central

Bauman, Andrew; Higdon, Roger; Rapson, Sean; Loiue, Brenton; Hogan, Jason; Stacy, Robin; Napuli, Alberto; Guo, Wenjin; van Voorhis, Wesley; Roach, Jared; Lu, Vincent; Landorf, Elizabeth; Stewart, Elizabeth; Kolker, Natali; Collart, Frank; Myler, Peter; van Belle, Gerald

2011-01-01

High-throughput (HTP) proteomics studies generate large amounts of data. Interpretation of these data requires effective approaches to distinguish noise from biological signal, particularly as instrument and computational capacity increase and studies become more complex. Resolving this issue requires validated and reproducible methods and models, which in turn requires complex experimental and computational standards. The absence of appropriate standards and data sets for validating experimental and computational workflows hinders the development of HTP proteomics methods. Most protein standards are simple mixtures of proteins or peptides, or undercharacterized reference standards in which the identity and concentration of the constituent proteins are unknown. The Seattle Children's 200 (SC-200) proposed proteomics standard mixture is the next step toward developing realistic, fully characterized HTP proteomics standards. The SC-200 exhibits a unique modular design to extend its functionality, and consists of 200 proteins of known identities and molar concentrations from 6 microbial genomes, distributed into 10 molar concentration tiers spanning a 1,000-fold range. We describe the SC-200's design, potential uses, and initial characterization. We identified 84% of SC-200 proteins with an LTQ-Orbitrap and 65% with an LTQ-Velos (false discovery rate = 1% for both). There were obvious trends in success rate, sequence coverage, and spectral counts with protein concentration; however, protein identification, sequence coverage, and spectral counts vary greatly within concentration levels. PMID:21250827
Design and initial characterization of the SC-200 proteomics standard mixture.

PubMed

Bauman, Andrew; Higdon, Roger; Rapson, Sean; Loiue, Brenton; Hogan, Jason; Stacy, Robin; Napuli, Alberto; Guo, Wenjin; van Voorhis, Wesley; Roach, Jared; Lu, Vincent; Landorf, Elizabeth; Stewart, Elizabeth; Kolker, Natali; Collart, Frank; Myler, Peter; van Belle, Gerald; Kolker, Eugene

2011-01-01

High-throughput (HTP) proteomics studies generate large amounts of data. Interpretation of these data requires effective approaches to distinguish noise from biological signal, particularly as instrument and computational capacity increase and studies become more complex. Resolving this issue requires validated and reproducible methods and models, which in turn requires complex experimental and computational standards. The absence of appropriate standards and data sets for validating experimental and computational workflows hinders the development of HTP proteomics methods. Most protein standards are simple mixtures of proteins or peptides, or undercharacterized reference standards in which the identity and concentration of the constituent proteins are unknown. The Seattle Children's 200 (SC-200) proposed proteomics standard mixture is the next step toward developing realistic, fully characterized HTP proteomics standards. The SC-200 exhibits a unique modular design to extend its functionality, and consists of 200 proteins of known identities and molar concentrations from 6 microbial genomes, distributed into 10 molar concentration tiers spanning a 1,000-fold range. We describe the SC-200's design, potential uses, and initial characterization. We identified 84% of SC-200 proteins with an LTQ-Orbitrap and 65% with an LTQ-Velos (false discovery rate = 1% for both). There were obvious trends in success rate, sequence coverage, and spectral counts with protein concentration; however, protein identification, sequence coverage, and spectral counts vary greatly within concentration levels.
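To make the tier arithmetic in the SC-200 abstract concrete, here is a small sketch. It assumes, purely for illustration, that the 10 tiers are evenly spaced on a log scale and that proteins are split equally among tiers; the abstract states only the counts (200 proteins, 10 tiers) and the 1,000-fold range, so both assumptions are hypothetical.

```python
N_PROTEINS = 200     # from the abstract
N_TIERS = 10         # from the abstract
FOLD_RANGE = 1000.0  # from the abstract

# Assumption: geometric (log-even) spacing, so 9 equal steps span 1,000-fold.
step = FOLD_RANGE ** (1 / (N_TIERS - 1))

# Concentration of each tier relative to the lowest tier.
relative_conc = [step ** i for i in range(N_TIERS)]

# Assumption: equal number of proteins per tier.
proteins_per_tier = N_PROTEINS // N_TIERS
```

Under these assumptions adjacent tiers differ by a factor of about 2.15 (the cube root of 10), with 20 proteins in each tier.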
Identifying the missing proteins in human proteome by biological language model.

PubMed

Dong, Qiwen; Wang, Kai; Liu, Xuan

2016-12-23

With the rapid development of high-throughput sequencing technology, proteomics research has become an active field in the post-genomics era. It is necessary to identify all the natively encoded protein sequences for further function and pathway analysis. Toward that end, the Human Proteome Organization launched the Human Proteome Project in 2011. However, many proteins are hard to detect by experimental methods, which has become one of the bottlenecks of the Human Proteome Project. Given the difficulty of detecting these missing proteins with wet-lab approaches, here we use a bioinformatics method to pre-filter the missing proteins. Since there is an analogy between biological sequences and natural language, n-gram models from the field of Natural Language Processing were used to filter the missing proteins. The dataset used in this study contains 616 missing proteins from the "uncertain" category of the neXtProt database. The n-gram model identified 102 proteins that have a high probability of being native human proteins. We perform a detailed analysis of the predicted structure and function of these missing proteins and also compare the high-probability proteins with other mass spectrometry datasets. The evaluation shows that the results reported here are in good agreement with those obtained from other well-established databases. The analysis shows that these 102 proteins may be native gene-coded proteins and that some of the missing proteins are membrane or natively disordered proteins, which are hard to detect by experimental methods.
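The idea of scoring protein sequences with an n-gram language model, as described in the abstract above, can be sketched as follows. This is a minimal illustration that assumes a bigram model with add-one smoothing over the 20 standard amino acids; the function names, the toy training sequences, and the smoothing choice are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues

def train_bigram(sequences):
    """Count residue bigrams and how often each residue starts a bigram."""
    pair_counts, context_counts = Counter(), Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            pair_counts[a, b] += 1
            context_counts[a] += 1
    return pair_counts, context_counts

def avg_log_prob(seq, model):
    """Mean add-one-smoothed log P(next residue | current residue)."""
    pair_counts, context_counts = model
    vocab = len(AMINO_ACIDS)
    total = 0.0
    for a, b in zip(seq, seq[1:]):
        total += math.log((pair_counts[a, b] + 1) / (context_counts[a] + vocab))
    return total / max(len(seq) - 1, 1)

# Toy training set (hypothetical sequences, for illustration only).
model = train_bigram(["MKTAYIAKQR", "MKTLLLTLVV", "MKVAYIAGQR"])

# A sequence resembling the training data scores higher than an unusual one.
score_similar = avg_log_prob("MKTAYIAGQR", model)
score_unusual = avg_log_prob("WWWWCCCCWW", model)
```

A filter built on this idea would train on known proteins and rank candidate sequences by score; the threshold for "high probability" would have to be chosen empirically.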
Trade-off between Transcriptome Plasticity and Genome Evolution in Cephalopods.

PubMed

Liscovitch-Brauer, Noa; Alon, Shahar; Porath, Hagit T; Elstein, Boaz; Unger, Ron; Ziv, Tamar; Admon, Arie; Levanon, Erez Y; Rosenthal, Joshua J C; Eisenberg, Eli

2017-04-06

RNA editing, a post-transcriptional process, allows the diversification of proteomes beyond the genomic blueprint; however it is infrequently used among animals for this purpose. Recent reports suggesting increased levels of RNA editing in squids thus raise the question of the nature and effects of these events. We here show that RNA editing is particularly common in behaviorally sophisticated coleoid cephalopods, with tens of thousands of evolutionarily conserved sites. Editing is enriched in the nervous system, affecting molecules pertinent for excitability and neuronal morphology. The genomic sequence flanking editing sites is highly conserved, suggesting that the process confers a selective advantage. Due to the large number of sites, the surrounding conservation greatly reduces the number of mutations and genomic polymorphisms in protein-coding regions. This trade-off between genome evolution and transcriptome plasticity highlights the importance of RNA recoding as a strategy for diversifying proteins, particularly those associated with neural function. Copyright © 2017 Elsevier Inc. All rights reserved.
The restricted metabolism of the obligate organohalide respiring bacterium Dehalobacter restrictus: lessons from tiered functional genomics

PubMed Central

Rupakula, Aamani; Kruse, Thomas; Boeren, Sjef; Holliger, Christof; Smidt, Hauke; Maillard, Julien

2013-01-01

Dehalobacter restrictus strain PER-K23 is an obligate organohalide respiring bacterium, which displays extremely narrow metabolic capabilities. It grows only via coupling energy conservation to anaerobic respiration of tetra- and trichloroethene with hydrogen as sole electron donor. Dehalobacter restrictus represents the paradigmatic member of the genus Dehalobacter, which in recent years has turned out to be a major player in the bioremediation of an increasing number of organohalides, both in situ and in laboratory studies. The recent elucidation of the D. restrictus genome revealed a rather elaborate genome with predicted pathways that were not suspected from its restricted metabolism, such as a complete corrinoid biosynthetic pathway, the Wood–Ljungdahl (WL) pathway for CO2 fixation, abundant transcriptional regulators and several types of hydrogenases. However, one important feature of the genome is the presence of 25 reductive dehalogenase genes, from which so far only one, pceA, has been characterized on genetic and biochemical levels. This study describes a multi-level functional genomics approach on D. restrictus across three different growth phases. A global proteomic analysis allowed consideration of general metabolic pathways relevant to organohalide respiration, whereas the dedicated genomic and transcriptomic analysis focused on the diversity, composition and expression of genes associated with reductive dehalogenases. PMID:23479754
"Omics" of Selenium Biology: A Prospective Study of Plasma Proteome Network Before and After Selenized-Yeast Supplementation in Healthy Men.

PubMed

Sinha, Indu; Karagoz, Kubra; Fogle, Rachel L; Hollenbeak, Christopher S; Zea, Arnold H; Arga, Kazim Y; Stanley, Anne E; Hawkes, Wayne C; Sinha, Raghu

2016-04-01

Low selenium levels have been linked to a higher incidence of cancer and other diseases, including Keshan, Chagas, and Kashin-Beck, and insulin resistance. Additionally, muscle and cardiovascular disorders, immune dysfunction, cancer, neurological disorders, and endocrine function have been associated with mutations in genes encoding selenoproteins. Selenium biology is complex, and a systems biology approach to study global metabolomics, genomics, and/or proteomics may provide important clues to examining selenium-responsive markers in circulation. In the current investigation, we applied a global proteomics approach to plasma samples collected from a previously conducted, double-blinded, placebo-controlled clinical study, where men were supplemented with selenized-yeast (Se-Yeast; 300 μg/day, 3.8 μmol/day) or placebo-yeast for 48 weeks. Proteomic analysis was performed by iTRAQ on 8 plasma samples from each arm at baseline and 48 weeks. A total of 161 plasma proteins were identified in both arms. Twenty-two proteins were significantly altered following Se-Yeast supplementation and thirteen proteins were significantly changed after placebo-yeast supplementation in healthy men. The differentially expressed proteins were involved in complement and coagulation pathways, immune functions, lipid metabolism, and insulin resistance. Reconstruction and analysis of a protein-protein interaction network around selected proteins revealed several hub proteins. One of the interactions suggested by our analysis, PHLD-APOA4, which is involved in insulin resistance, was subsequently validated by Western blot analysis. Our systems approach illustrates a viable platform for investigating responsive proteomic profiles under 'before and after' conditions following Se-Yeast supplementation. The nature of the proteins identified suggests that selenium may play an important role in complement and coagulation pathways, and insulin resistance.
Tools to covisualize and coanalyze proteomic data with genomes and transcriptomes: validation of genes and alternative mRNA splicing.

PubMed

Pang, Chi Nam Ignatius; Tay, Aidan P; Aya, Carlos; Twine, Natalie A; Harkness, Linda; Hart-Smith, Gene; Chia, Samantha Z; Chen, Zhiliang; Deshpande, Nandan P; Kaakoush, Nadeem O; Mitchell, Hazel M; Kassem, Moustapha; Wilkins, Marc R

2014-01-03

Direct links between proteomic and genomic/transcriptomic data are not frequently made, partly because of a lack of appropriate bioinformatics tools. To help address this, we have developed the PG Nexus pipeline. The PG Nexus allows users to covisualize peptides in the context of genomes or genomic contigs, along with RNA-seq reads. This is done in the Integrative Genomics Viewer (IGV). A Results Analyzer reports the precise base position where LC-MS/MS-derived peptides cover genes or gene isoforms, on the chromosomes or contigs where this occurs. In prokaryotes, the PG Nexus pipeline facilitates the validation of genes, where annotation or gene prediction is available, or the discovery of genes using a "virtual protein"-based unbiased approach. We illustrate this with a comprehensive proteogenomics analysis of two strains of Campylobacter concisus. For higher eukaryotes, the PG Nexus facilitates gene validation and supports the identification of mRNA splice junction boundaries and splice variants that are protein-coding. This is illustrated with an analysis of splice junctions covered by human phosphopeptides, and other examples of relevance to the Chromosome-Centric Human Proteome Project. The PG Nexus is open-source and available from https://github.com/IntersectAustralia/ap11_Samifier. It has been integrated into Galaxy and made available in the Galaxy tool shed.
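The core coordinate calculation behind placing a peptide on a genome, including across a splice junction, can be sketched as below. This is not the PG Nexus implementation; it is a simplified illustration that assumes a forward-strand gene, 0-based half-open coordinates, and exon intervals containing coding sequence only. The function name and the example coordinates are hypothetical.

```python
def peptide_to_genome(exons, pep_start_aa, pep_len_aa):
    """Map a peptide onto genomic intervals for a forward-strand gene.

    exons: list of (start, end) coding intervals, 0-based half-open,
           in transcript order.
    pep_start_aa: 0-based offset of the peptide's first residue in the protein.
    pep_len_aa: peptide length in residues.
    Returns the genomic intervals covered by the peptide.
    """
    cds_start = pep_start_aa * 3          # three bases per residue
    cds_end = cds_start + pep_len_aa * 3
    covered, offset = [], 0               # offset = CDS bases consumed so far
    for start, end in exons:
        length = end - start
        lo = max(cds_start, offset)
        hi = min(cds_end, offset + length)
        if lo < hi:                       # peptide overlaps this exon
            covered.append((start + lo - offset, start + hi - offset))
        offset += length
    return covered

# Hypothetical two-exon gene; the peptide spans the exon-exon junction.
exons = [(100, 130), (200, 260)]
segments = peptide_to_genome(exons, pep_start_aa=8, pep_len_aa=4)
```

A peptide that maps to two intervals, as here, is the kind of evidence that supports a splice junction at the protein level.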
Integrative Identification of Arabidopsis Mitochondrial Proteome and Its Function Exploitation through Protein Interaction Network

PubMed Central

Cui, Jian; Liu, Jinghua; Li, Yuhua; Shi, Tieliu

2011-01-01

Mitochondria are major players in the production of energy, and host several key reactions involved in basic metabolism and biosynthesis of essential molecules. Currently, the majority of nucleus-encoded mitochondrial proteins are unknown even for the model plant Arabidopsis. We report a computational framework for predicting Arabidopsis mitochondrial proteins based on a probabilistic model, called Naive Bayesian Network, which integrates disparate genomic data generated from eight bioinformatics tools, multiple orthologous mappings, protein domain properties and co-expression patterns using 1,027 microarray profiles. Through this approach, we predicted 2,311 candidate mitochondrial proteins with 84.67% accuracy and a 2.53% false positive rate (FPR). Together with the experimentally confirmed proteins, 2,585 mitochondrial proteins (named CoreMitoP) were identified. We explored those proteins with unknown functions based on a protein-protein interaction network (PIN) and annotated novel functions for 26.65% of CoreMitoP proteins. Moreover, we found newly predicted mitochondrial proteins embedded in particular subnetworks of the PIN, mainly functioning in response to diverse environmental stresses, such as salt, drought, cold, and wounding. Candidate mitochondrial proteins involved in those physiological activities provide useful targets for further investigation. Assigned functions also provide comprehensive information for the Arabidopsis mitochondrial proteome. PMID:21297957
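The naive Bayes integration described in this abstract can be illustrated with a minimal sketch. It assumes binary evidence features that are conditionally independent given the class; the feature names, likelihoods, and prior below are hypothetical placeholders, not values from the paper.

```python
import math

# Hypothetical evidence sources with assumed likelihoods:
# (P(feature present | mitochondrial), P(feature present | non-mitochondrial))
LIKELIHOODS = {
    "targeting_signal": (0.70, 0.05),
    "ortholog_in_mito_set": (0.60, 0.10),
    "coexpressed_with_mito": (0.50, 0.20),
}
PRIOR_MITO = 0.10  # assumed prior probability that a protein is mitochondrial


def posterior_mito(features):
    """Combine independent binary evidence into P(mitochondrial | features)."""
    # Start from the prior log-odds, then add one log-likelihood ratio per feature.
    log_odds = math.log(PRIOR_MITO / (1.0 - PRIOR_MITO))
    for name, (p_pos, p_neg) in LIKELIHOODS.items():
        if features[name]:
            log_odds += math.log(p_pos / p_neg)
        else:
            log_odds += math.log((1.0 - p_pos) / (1.0 - p_neg))
    # Convert log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


all_present = {name: True for name in LIKELIHOODS}
all_absent = {name: False for name in LIKELIHOODS}
print(posterior_mito(all_present), posterior_mito(all_absent))
```

Working in log-odds keeps each evidence source as a separate additive term, which makes it easy to see how much any single source shifts the prediction.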
The UniProtKB guide to the human proteome

PubMed Central

Breuza, Lionel; Poux, Sylvain; Estreicher, Anne; Famiglietti, Maria Livia; Magrane, Michele; Tognolli, Michael; Bridge, Alan; Baratin, Delphine; Redaschi, Nicole

2016-01-01

Advances in high-throughput and advanced technologies allow researchers to routinely perform whole genome and proteome analysis. For this purpose, they need high-quality resources providing comprehensive gene and protein sets for their organisms of interest. Using the example of the human proteome, we will describe the content of a complete proteome in the UniProt Knowledgebase (UniProtKB). We will show how manual expert curation of UniProtKB/Swiss-Prot is complemented by expert-driven automatic annotation to build a comprehensive, high-quality and traceable resource. We will also illustrate how the complexity of the human proteome is captured and structured in UniProtKB. Database URL: www.uniprot.org PMID:26896845
A large scale Plasmodium vivax- Saimiri boliviensis trophozoite-schizont transition proteome

PubMed Central

Lapp, Stacey A.; Barnwell, John W.; Galinski, Mary R.

2017-01-01

Plasmodium vivax is a complex protozoan parasite with over 6,500 genes and stage-specific differential expression. Much of the unique biology of this pathogen remains unknown, including how it modifies and restructures the host reticulocyte. Using a recently published P. vivax reference genome, we report the proteome from two biological replicates of infected Saimiri boliviensis host reticulocytes undergoing transition from the late trophozoite to early schizont stages. Using five database search engines, we identified a total of 2000 P. vivax and 3487 S. boliviensis proteins, making this the most comprehensive P. vivax proteome to date. PlasmoDB GO-term enrichment analysis of proteins identified at least twice by a search engine highlighted core metabolic processes and molecular functions such as glycolysis, translation and protein folding, cell components such as ribosomes, proteasomes and the Golgi apparatus, and a number of vesicle and trafficking related clusters. Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.8 enriched functional annotation clusters of S. boliviensis proteins highlighted vesicle and trafficking-related clusters, elements of the cytoskeleton, oxidative processes and response to oxidative stress, macromolecular complexes such as the proteasome and ribosome, metabolism, translation, and cell death. Host and parasite proteins potentially involved in cell adhesion were also identified. Over 25% of the P. vivax proteins have no functional annotation; this group includes 45 VIR members of the large PIR family. A number of host and pathogen proteins contained highly oxidized or nitrated residues, extending prior trophozoite-enriched stage observations from S. boliviensis infections, and supporting the possibility of oxidative stress in relation to the disease. This proteome significantly expands the size and complexity of the known P. vivax and Saimiri host iRBC proteomes, and provides in-depth data that will be valuable for ongoing research on this parasite’s biology and pathogenesis. PMID:28829774
Proteomic analysis of tardigrades: towards a better understanding of molecular mechanisms by anhydrobiotic organisms.

PubMed

Schokraie, Elham; Hotz-Wagenblatt, Agnes; Warnken, Uwe; Mali, Brahim; Frohme, Marcus; Förster, Frank; Dandekar, Thomas; Hengherr, Steffen; Schill, Ralph O; Schnölzer, Martina

2010-03-03

Tardigrades are small, multicellular invertebrates which are able to survive times of unfavourable environmental conditions using their well-known capability to undergo cryptobiosis at any stage of their life cycle. Milnesium tardigradum has become a powerful model system for the analysis of cryptobiosis. While some genetic information is already available for Milnesium tardigradum the proteome is still to be discovered. Here we present to the best of our knowledge the first comprehensive study of Milnesium tardigradum on the protein level. To establish a proteome reference map we developed optimized protocols for protein extraction from tardigrades in the active state and for separation of proteins by high resolution two-dimensional gel electrophoresis. Since only limited sequence information of M. tardigradum on the genome and gene expression level is available to date in public databases we initiated in parallel a tardigrade EST sequencing project to allow for protein identification by electrospray ionization tandem mass spectrometry. 271 out of 606 analyzed protein spots could be identified by searching against the publicly available NCBInr database as well as our newly established tardigrade protein database corresponding to 144 unique proteins. Another 150 spots could be identified in the tardigrade clustered EST database corresponding to 36 unique contigs and ESTs. Proteins with annotated function were further categorized in more detail by their molecular function, biological process and cellular component. For the proteins of unknown function more information could be obtained by performing a protein domain annotation analysis. Our results include proteins such as members of different heat shock protein families and LEA group 3, which might play important roles in surviving extreme conditions. 
The proteome reference map of Milnesium tardigradum provides the basis for further studies in order to identify and characterize the biochemical mechanisms of tolerance to extreme desiccation. The optimized proteomics workflow will enable application of sensitive quantification techniques to detect differences in protein expression, which are characteristic of the active and anhydrobiotic states of tardigrades.
Proteomic Analysis of Tardigrades: Towards a Better Understanding of Molecular Mechanisms by Anhydrobiotic Organisms

PubMed Central

Schokraie, Elham; Hotz-Wagenblatt, Agnes; Warnken, Uwe; Mali, Brahim; Frohme, Marcus; Förster, Frank; Dandekar, Thomas; Hengherr, Steffen; Schill, Ralph O.; Schnölzer, Martina

2010-01-01

Background Tardigrades are small, multicellular invertebrates which are able to survive times of unfavourable environmental conditions using their well-known capability to undergo cryptobiosis at any stage of their life cycle. Milnesium tardigradum has become a powerful model system for the analysis of cryptobiosis. While some genetic information is already available for Milnesium tardigradum the proteome is still to be discovered. Principal Findings Here we present to the best of our knowledge the first comprehensive study of Milnesium tardigradum on the protein level. To establish a proteome reference map we developed optimized protocols for protein extraction from tardigrades in the active state and for separation of proteins by high resolution two-dimensional gel electrophoresis. Since only limited sequence information of M. tardigradum on the genome and gene expression level is available to date in public databases we initiated in parallel a tardigrade EST sequencing project to allow for protein identification by electrospray ionization tandem mass spectrometry. 271 out of 606 analyzed protein spots could be identified by searching against the publicly available NCBInr database as well as our newly established tardigrade protein database corresponding to 144 unique proteins. Another 150 spots could be identified in the tardigrade clustered EST database corresponding to 36 unique contigs and ESTs. Proteins with annotated function were further categorized in more detail by their molecular function, biological process and cellular component. For the proteins of unknown function more information could be obtained by performing a protein domain annotation analysis. Our results include proteins such as members of different heat shock protein families and LEA group 3, which might play important roles in surviving extreme conditions. 
Conclusions The proteome reference map of Milnesium tardigradum provides the basis for further studies in order to identify and characterize the biochemical mechanisms of tolerance to extreme desiccation. The optimized proteomics workflow will enable application of sensitive quantification techniques to detect differences in protein expression, which are characteristic of the active and anhydrobiotic states of tardigrades. PMID:20224743
Proteomic Analysis of Propiconazole Responses in Mouse Liver-Comparison of Genomic and Proteomic Profiles

EPA Science Inventory

We have performed for the first time a comprehensive profiling of changes in protein expression of soluble proteins in livers from mice treated with the mouse liver tumorigen, propiconazole, to uncover the pathways and networks altered by this commonly used fungicide. Utilizing t...
Genomic identification of potential targets unique to Candida albicans for the discovery of antifungal agents.

PubMed

Tripathi, Himanshu; Luqman, Suaib; Meena, Abha; Khan, Feroz

2014-01-01

Despite modern antifungal therapy, the mortality rates of invasive infection with the human fungal pathogen Candida albicans are up to 40%. Studies suggest that drug resistance in the three most common species of human fungal pathogens, viz. C. albicans, Aspergillus fumigatus (causing mortality rates up to 90%) and Cryptococcus neoformans (causing mortality rates up to 70%), is due to mutations in the target enzymes or high expression of drug transporter genes. Drug resistance in human fungal pathogens has led to an imperative need for the identification of new targets unique to fungal pathogens. In the present study, we have used a comparative genomics approach to find potential target proteins unique to C. albicans, an opportunistic fungus responsible for severe infection in immunocompromised humans. Interestingly, many target proteins of existing antifungal agents showed orthologs in human cells. To identify unique proteins, we compared the proteome of C. albicans [SC5314], i.e., 14,633 total proteins retrieved from the RefSeq database of NCBI, USA, with the proteomes of human and the non-pathogenic yeast Saccharomyces cerevisiae. Results showed that 4,568 proteins were unique to C. albicans as compared to human; when these were then compared with the S. cerevisiae proteome, 2,161 proteins remained unique, and after removing repeats a total of 1,618 unique proteins (42 functionally known, 1,566 hypothetical and 10 unknown) were selected as potential antifungal drug targets unique to C. albicans.
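The successive filtering described in this abstract (remove proteins with a human ortholog, then those with a yeast ortholog, then repeats) can be sketched as plain set operations. This is an illustrative sketch only: the protein IDs and sequences are toy data, the ortholog hit sets are assumed to come from some upstream similarity search, and "removing repeats" is interpreted here as dropping identical sequences, which the abstract does not specify.

```python
def filter_unique(pathogen, human_hits, yeast_hits):
    """Return IDs surviving each filtering stage.

    pathogen:   dict mapping protein ID -> sequence (insertion-ordered)
    human_hits: set of pathogen IDs that have a human ortholog
    yeast_hits: set of pathogen IDs that have a yeast ortholog
    """
    # Stage 1: keep proteins without a human ortholog.
    no_human = {pid: seq for pid, seq in pathogen.items() if pid not in human_hits}
    # Stage 2: of those, keep proteins without a yeast ortholog.
    no_yeast = {pid: seq for pid, seq in no_human.items() if pid not in yeast_hits}
    # Stage 3 (assumed meaning of "repeats"): keep one ID per distinct sequence.
    seen = set()
    nonredundant = []
    for pid, seq in no_yeast.items():
        if seq not in seen:
            seen.add(seq)
            nonredundant.append(pid)
    return list(no_human), list(no_yeast), nonredundant


# Toy example with hypothetical IDs and sequences.
toy = {"p1": "MKT", "p2": "MAA", "p3": "MLL", "p4": "MLL", "p5": "MGG"}
stage1, stage2, stage3 = filter_unique(toy, human_hits={"p1"}, yeast_hits={"p2"})
print(stage1, stage2, stage3)

# Consistency check on the category counts reported in the abstract.
print(42 + 1566 + 10 == 1618)
```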
Analytical validation considerations of multiplex mass-spectrometry-based proteomic platforms for measuring protein biomarkers.

PubMed

Boja, Emily S; Fehniger, Thomas E; Baker, Mark S; Marko-Varga, György; Rodriguez, Henry

2014-12-05

Protein biomarker discovery and validation in the current omics era are vital for healthcare professionals to improve diagnosis, detect cancers at an early stage, identify the likelihood of cancer recurrence, stratify stages with differential survival outcomes, and monitor therapeutic responses. The success of such biomarkers would have a huge impact on how we improve the diagnosis and treatment of patients and alleviate the financial burden of healthcare systems. In the past, the genomics community (mostly through large-scale, deep genomic sequencing technologies) has been steadily improving our understanding of the molecular basis of disease, with a number of biomarker panels already authorized by the U.S. Food and Drug Administration (FDA) for clinical use (e.g., MammaPrint, two recently cleared devices using next-generation sequencing platforms to detect DNA changes in the cystic fibrosis transmembrane conductance regulator (CFTR) gene). Clinical proteomics, on the other hand, despite its ability to delineate the functional units of a cell that more likely drive the phenotypic differences of a disease (i.e., proteins and protein-protein interaction networks and signaling pathways underlying the disease), "staggers" to make a significant impact, with an average of only ∼1.5 protein biomarkers per year approved by the FDA over the past 15-20 years. This statistic itself raises the concern that major roadblocks have been impeding an efficient transition of protein marker candidates in biomarker development despite major technological advances in proteomics in recent years.
Integrative Analysis of Many RNA-Seq Datasets to Study Alternative Splicing

PubMed Central

Li, Wenyuan; Dai, Chao; Kang, Shuli; Zhou, Xianghong Jasmine

2014-01-01

Alternative splicing is an important gene regulatory mechanism that dramatically increases the complexity of the proteome. However, how alternative splicing is regulated and how transcription and splicing are coordinated are still poorly understood, and functions of transcript isoforms have been studied only in a few limited cases. Nowadays, RNA-seq technology provides an exceptional opportunity to study alternative splicing on genome-wide scales and in an unbiased manner. With the rapid accumulation of data in public repositories, new challenges arise from the urgent need to effectively integrate many different RNA-seq datasets to study alternative splicing. This paper discusses a set of advanced computational methods that can integrate and analyze many RNA-seq datasets to systematically identify splicing modules, unravel the coupling of transcription and splicing, and predict the functions of splicing isoforms on a genome-wide scale. PMID:24583115
A comprehensive survey of the Plasmodium life cycle by genomic, transcriptomic, and proteomic analyses.

PubMed

Hall, Neil; Karras, Marianna; Raine, J Dale; Carlton, Jane M; Kooij, Taco W A; Berriman, Matthew; Florens, Laurence; Janssen, Christoph S; Pain, Arnab; Christophides, Georges K; James, Keith; Rutherford, Kim; Harris, Barbara; Harris, David; Churcher, Carol; Quail, Michael A; Ormond, Doug; Doggett, Jon; Trueman, Holly E; Mendoza, Jacqui; Bidwell, Shelby L; Rajandream, Marie-Adele; Carucci, Daniel J; Yates, John R; Kafatos, Fotis C; Janse, Chris J; Barrell, Bart; Turner, C Michael R; Waters, Andrew P; Sinden, Robert E

2005-01-07

Plasmodium berghei and Plasmodium chabaudi are widely used model malaria species. Comparison of their genomes, integrated with proteomic and microarray data, with the genomes of Plasmodium falciparum and Plasmodium yoelii revealed a conserved core of 4500 Plasmodium genes in the central regions of the 14 chromosomes and highlighted genes evolving rapidly because of stage-specific selective pressures. Four strategies for gene expression are apparent during the parasites' life cycle: (i) housekeeping; (ii) host-related; (iii) strategy-specific related to invasion, asexual replication, and sexual development; and (iv) stage-specific. We observed posttranscriptional gene silencing through translational repression of messenger RNA during sexual development, and a 47-base 3' untranslated region motif is implicated in this process.

Investigation of Yersinia pestis laboratory adaptation through a combined genomics and proteomics approach

DOE PAGES

Leiser, Owen P.; Merkley, Eric D.; Clowers, Brian H.; ...

2015-11-24

Here, the bacterial pathogen Yersinia pestis, the cause of plague in humans and animals, normally has a sylvatic lifestyle, cycling between fleas and mammals. In contrast, laboratory-grown Y. pestis experiences a more constant environment and conditions that it would not normally encounter. The transition from the natural environment to the laboratory results in a vastly different set of selective pressures, and represents what could be considered domestication. Understanding the kinds of adaptations Y. pestis undergoes as it becomes domesticated will contribute to understanding the basic biology of this important pathogen. In this study, we performed a Parallel Serial Passage Experiment (PSPE) to explore the mechanisms by which Y. pestis adapts to laboratory conditions, hypothesizing that cells would undergo significant changes in virulence and nutrient acquisition systems. Two wild strains were serially passaged in 12 independent populations each for ~750 generations, after which each population was analyzed using whole-genome sequencing. We observed considerable parallel evolution in the endpoint populations, detecting multiple independent mutations in ail, pepA, and zwf, suggesting that specific selective pressures are shaping evolutionary responses. Complementary LC-MS-based proteomic data provide physiological context to the observed mutations, and reveal regulatory changes not necessarily associated with specific mutations, including changes in amino acid metabolism, envelope biogenesis, iron storage and acquisition, and a type VI secretion system. Proteomic data support hypotheses generated by genomic data in addition to suggesting future mechanistic studies, indicating that future whole-genome sequencing studies be designed to leverage proteomics as a critical complement.
Genomics and proteomics in liver fibrosis and cirrhosis

PubMed Central

2012-01-01

Genomics and proteomics have become increasingly important in biomedical science in the past decade, as they provide an opportunity for hypothesis-free experiments that can yield major insights not previously foreseen when scientific and clinical questions are based only on hypothesis-driven approaches. Use of these tools, therefore, opens new avenues for uncovering physiological and pathological pathways. Liver fibrosis is a complex disease provoked by a range of chronic injuries to the liver, among which are viral hepatitis, (non-) alcoholic steatohepatitis and autoimmune disorders. Some chronic liver patients will never develop fibrosis or cirrhosis, whereas others rapidly progress towards cirrhosis in a few years. This variety can be caused by disease-related factors (for example, viral genotype) or host-factors (genetic/epigenetic). It is vital to establish accurate tools to identify those patients at highest risk for disease severity or progression in order to determine who are in need of immediate therapies. Moreover, there is an urgent imperative to identify non-invasive markers that can accurately distinguish mild and intermediate stages of fibrosis. Ideally, biomarkers can be used to predict disease progression and treatment response, but these studies will take many years due to the requirement for lengthy follow-up periods to assess outcomes. Current genomic and proteomic research provides many candidate biomarkers, but independent validation of these biomarkers is lacking, and reproducibility is still a key concern. Thus, great opportunities and challenges lie ahead in the field of genomics and proteomics, which, if successful, could transform the diagnosis and treatment of chronic fibrosing liver diseases. PMID:22214245
Investigation of Yersinia pestis laboratory adaptation through a combined genomics and proteomics approach

DOE Office of Scientific and Technical Information (OSTI.GOV)

Leiser, Owen P.; Merkley, Eric D.; Clowers, Brian H.

Here, the bacterial pathogen Yersinia pestis, the cause of plague in humans and animals, normally has a sylvatic lifestyle, cycling between fleas and mammals. In contrast, laboratory-grown Y. pestis experiences a more constant environment and conditions that it would not normally encounter. The transition from the natural environment to the laboratory results in a vastly different set of selective pressures, and represents what could be considered domestication. Understanding the kinds of adaptations Y. pestis undergoes as it becomes domesticated will contribute to understanding the basic biology of this important pathogen. In this study, we performed a Parallel Serial Passage Experiment (PSPE) to explore the mechanisms by which Y. pestis adapts to laboratory conditions, hypothesizing that cells would undergo significant changes in virulence and nutrient acquisition systems. Two wild strains were serially passaged in 12 independent populations each for ~750 generations, after which each population was analyzed using whole-genome sequencing. We observed considerable parallel evolution in the endpoint populations, detecting multiple independent mutations in ail, pepA, and zwf, suggesting that specific selective pressures are shaping evolutionary responses. Complementary LC-MS-based proteomic data provide physiological context to the observed mutations, and reveal regulatory changes not necessarily associated with specific mutations, including changes in amino acid metabolism, envelope biogenesis, iron storage and acquisition, and a type VI secretion system. Proteomic data support hypotheses generated by genomic data in addition to suggesting future mechanistic studies, indicating that future whole-genome sequencing studies be designed to leverage proteomics as a critical complement.
Integrated genomics and proteomics of the Torpedo californica electric organ: concordance with the mammalian neuromuscular junction

PubMed Central

2011-01-01

Background During development, the branchial mesoderm of Torpedo californica transdifferentiates into an electric organ capable of generating high voltage discharges to stun fish. The organ contains a high density of cholinergic synapses and has served as a biochemical model for the membrane specialization of myofibers, the neuromuscular junction (NMJ). We studied the genome and proteome of the electric organ to gain insight into its composition, to determine if there is concordance with skeletal muscle and the NMJ, and to identify novel synaptic proteins. Results Of 435 proteins identified, 300 mapped to Torpedo cDNA sequences with ≥2 peptides. We identified 14 uncharacterized proteins in the electric organ that are known to play a role in acetylcholine receptor clustering or signal transduction. In addition, two human open reading frames, C1orf123 and C6orf130, showed high sequence similarity to electric organ proteins. Our profile lists several proteins that are highly expressed in skeletal muscle or are muscle specific. Synaptic proteins such as acetylcholinesterase, acetylcholine receptor subunits, and rapsyn were present in the electric organ proteome but absent in the skeletal muscle proteome. Conclusions Our integrated genomic and proteomic analysis supports research describing a muscle-like profile of the organ. We show that it is a repository of NMJ proteins but we present limitations on its use as a comprehensive model of the NMJ. Finally, we identified several proteins that may become candidates for signaling proteins not previously characterized as components of the NMJ. PMID:21798097
Viruses are a dominant driver of protein adaptation in mammals.

PubMed

Enard, David; Cai, Le; Gwennap, Carina; Petrov, Dmitri A

2016-05-17

Viruses interact with hundreds to thousands of proteins in mammals, yet adaptation against viruses has only been studied in a few proteins specialized in antiviral defense. Whether adaptation to viruses typically involves only specialized antiviral proteins or affects a broad array of virus-interacting proteins is unknown. Here, we analyze adaptation in ~1300 virus-interacting proteins manually curated from a set of 9900 proteins conserved in all sequenced mammalian genomes. We show that viruses (i) use the more evolutionarily constrained proteins within the cellular functions they interact with and that (ii) despite this high constraint, virus-interacting proteins account for a high proportion of all protein adaptation in humans and other mammals. Adaptation is elevated in virus-interacting proteins across all functional categories, including both immune and non-immune functions. We conservatively estimate that viruses have driven close to 30% of all adaptive amino acid changes in the part of the human proteome conserved within mammals. Our results suggest that viruses are one of the most dominant drivers of evolutionary change across mammalian and human proteomes.
Viruses are a dominant driver of protein adaptation in mammals

PubMed Central

Enard, David; Cai, Le; Gwennap, Carina; Petrov, Dmitri A

2016-01-01

Viruses interact with hundreds to thousands of proteins in mammals, yet adaptation against viruses has only been studied in a few proteins specialized in antiviral defense. Whether adaptation to viruses typically involves only specialized antiviral proteins or affects a broad array of virus-interacting proteins is unknown. Here, we analyze adaptation in ~1300 virus-interacting proteins manually curated from a set of 9900 proteins conserved in all sequenced mammalian genomes. We show that viruses (i) use the more evolutionarily constrained proteins within the cellular functions they interact with and that (ii) despite this high constraint, virus-interacting proteins account for a high proportion of all protein adaptation in humans and other mammals. Adaptation is elevated in virus-interacting proteins across all functional categories, including both immune and non-immune functions. We conservatively estimate that viruses have driven close to 30% of all adaptive amino acid changes in the part of the human proteome conserved within mammals. Our results suggest that viruses are one of the most dominant drivers of evolutionary change across mammalian and human proteomes. DOI: http://dx.doi.org/10.7554/eLife.12469.001 PMID:27187613
HaloTag Technology: A Versatile Platform for Biomedical Applications

PubMed Central

2015-01-01

Exploration of protein function and interaction is critical for discovering links among genomics, proteomics, and disease state; yet, the immense complexity of proteomics found in biological systems currently limits our investigational capacity. Although affinity and autofluorescent tags are widely employed for protein analysis, these methods have been met with limited success because they lack specificity and require multiple fusion tags and genetic constructs. As an alternative approach, the innovative HaloTag protein fusion platform allows protein function and interaction to be comprehensively analyzed using a single genetic construct with multiple capabilities. This is accomplished using a simplified process, in which a variable HaloTag ligand binds rapidly to the HaloTag protein (usually linked to the protein of interest) with high affinity and specificity. In this review, we examine all current applications of the HaloTag technology platform for biomedical applications, such as the study of protein isolation and purification, protein function, protein–protein and protein–DNA interactions, biological assays, in vitro cellular imaging, and in vivo molecular imaging. In addition, novel uses of the HaloTag platform are briefly discussed along with potential future applications. PMID:25974629
Matrix metalloproteinase proteomics: substrates, targets, and therapy.

PubMed

Morrison, Charlotte J; Butler, Georgina S; Rodríguez, David; Overall, Christopher M

2009-10-01

Proteomics encompasses powerful techniques termed 'degradomics' for unbiased high-throughput protease substrate discovery screens that have been applied to an important family of extracellular proteases, the matrix metalloproteinases (MMPs). Together with the data generated from genetic deletion and transgenic mouse models and genomic profiling, these screens can uncover the diverse range of MMP functions, reveal which MMPs and MMP-mediated pathways exacerbate pathology, and which are involved in protection and the resolution of disease. This information can be used to identify and validate candidate drug targets and antitargets, and is critical for the development of new inhibitors of MMP function. Such inhibitors may target either the MMP directly in a specific manner or pathways upstream and downstream of MMP activity that are mediating deleterious effects in disease. Since MMPs do not operate alone but are part of the 'protease web', it is necessary to use system-wide approaches to understand MMP proteolysis in vivo, to discover new biological roles and their potential for therapeutic modification.
Genic insights from integrated human proteomics in GeneCards.

PubMed

Fishilevich, Simon; Zimmerman, Shahar; Kohn, Asher; Iny Stein, Tsippi; Olender, Tsviya; Kolker, Eugene; Safran, Marilyn; Lancet, Doron

2016-01-01

GeneCards is a one-stop shop for searchable human gene annotations (http://www.genecards.org/). Data are automatically mined from ∼120 sources and presented in an integrated web card for every human gene. We report the application of recent advances in proteomics to enhance gene annotation and classification in GeneCards. First, we constructed the Human Integrated Protein Expression Database (HIPED), a unified database of protein abundance in human tissues, based on the publicly available mass spectrometry (MS)-based proteomics sources ProteomicsDB, Multi-Omics Profiling Expression Database, Protein Abundance Across Organisms and The MaxQuant DataBase. The integrated database, residing within GeneCards, compares favourably with its individual sources, covering nearly 90% of human protein-coding genes. For gene annotation and comparisons, we first defined a protein expression vector for each gene, based on normalized abundances in 69 normal human tissues. This vector is portrayed in the GeneCards expression section as a bar graph, allowing visual inspection and comparison. These data are juxtaposed with transcriptome bar graphs. Using the protein expression vectors, we further defined a pairwise metric that helps assess expression-based pairwise proximity. This new metric for finding functional partners complements eight others, including sharing of pathways, gene ontology (GO) terms and domains, implemented in the GeneCards Suite. In parallel, we calculated proteome-based differential expression, highlighting a subset of tissues that overexpress a gene and subserving gene classification. This textual annotation allows users of VarElect, the suite's next-generation phenotyper, to more effectively discover causative disease variants. Finally, we define the protein-RNA expression ratio and correlation as yet another attribute of every gene in each tissue, adding further annotative information. 
The results constitute a significant enhancement of several GeneCards sections and help promote and organize the genome-wide structural and functional knowledge of the human proteome. Database URL:http://www.genecards.org/. © The Author(s) 2016. Published by Oxford University Press.
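The abstract does not say how the pairwise proximity metric or the protein-RNA ratio is computed. As a minimal sketch only, the block below assumes Pearson correlation between two genes' tissue expression vectors and a plain per-tissue quotient for the protein-RNA ratio. The function names and the three-tissue toy values are hypothetical and are not HIPED data.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant vector has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


def protein_rna_ratio(protein, rna):
    """Per-tissue protein/RNA ratio; None where the RNA abundance is zero."""
    return [p / r if r else None for p, r in zip(protein, rna)]


# Toy normalized abundances in three tissues (illustrative only).
gene_a = [1.0, 2.0, 3.0]
gene_b = [2.0, 4.0, 6.0]   # same tissue profile as gene_a, scaled
gene_c = [3.0, 2.0, 1.0]   # opposite tissue profile
```

With these toy vectors, gene_a and gene_b are maximally similar under the assumed metric and gene_a and gene_c are maximally dissimilar. A real implementation would use the 69-tissue vectors and whatever normalization the authors chose.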
Molecular Diagnosis and Biomarker Identification on SELDI proteomics data by ADTBoost method.

PubMed

Wang, Lu-Yong; Chakraborty, Amit; Comaniciu, Dorin

2005-01-01

Clinical proteomics is an emerging field that will have great impact on molecular diagnosis, identification of disease biomarkers, drug discovery and clinical trials in the post-genomic era. Protein profiling in tissues and fluids in disease and pathological control and other proteomics techniques will play an important role in molecular diagnosis with therapeutics and personalized healthcare. We introduce a new robust diagnostic method based on the ADTboost algorithm, a novel algorithm in proteomics data analysis, to improve classification accuracy. It generates classification rules, which are often smaller and easier to interpret. This method often yields the most discriminative features, which can be used as biomarkers for diagnostic purposes. It also provides a measure of prediction confidence. We applied this method to amyotrophic lateral sclerosis (ALS) data acquired by surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF MS) experiments. Cross-validation, ROC analysis and a comparative study show that the method has outstanding prediction capacity. Our molecular diagnosis method provides an efficient way to distinguish ALS from neurological controls. The results are expressed in a simple and straightforward alternating decision tree format or conditional format. We identified the most discriminative peaks in the proteomic data, which can be used as biomarkers for diagnosis. The method will have broad application in molecular diagnosis through proteomics data analysis and personalized medicine in this post-genomic era.
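The abstract mentions results expressed as an alternating decision tree together with a prediction confidence. The sketch below shows how such a tree is generally scored: the prediction values of all reachable rules are summed, the sign gives the class and the magnitude serves as a confidence. It is not the authors' trained model; the peak names, thresholds and weights are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Sample = Dict[str, float]


@dataclass
class Rule:
    """One splitter node with its two prediction values."""
    precondition: Callable[[Sample], bool]  # path that must hold to reach the node
    condition: Callable[[Sample], bool]     # test applied at the node
    value_if_true: float
    value_if_false: float


def adtree_score(sample: Sample, root_value: float, rules: List[Rule]) -> float:
    """Sum the prediction values of every rule whose precondition holds."""
    score = root_value
    for rule in rules:
        if rule.precondition(sample):
            score += rule.value_if_true if rule.condition(sample) else rule.value_if_false
    return score


def classify(sample: Sample, root_value: float, rules: List[Rule]) -> Tuple[str, float]:
    """Return (label, confidence): sign of the score and its magnitude."""
    score = adtree_score(sample, root_value, rules)
    return ("disease" if score > 0 else "control"), abs(score)


# Hypothetical peaks, thresholds and weights (not taken from the paper).
RULES = [
    Rule(lambda s: True, lambda s: s["mz_4000"] > 1.5, 0.8, -0.4),
    # Nested rule: only evaluated when the first peak is above its threshold.
    Rule(lambda s: s["mz_4000"] > 1.5, lambda s: s["mz_6700"] < 0.7, 0.5, -0.6),
]
ROOT_VALUE = 0.1
```

Because every rule contributes an additive value, each peak's effect on the final call can be read directly from the tree, which is the interpretability the abstract refers to.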
Proteomic insights into floral biology.

PubMed

Li, Xiaobai; Jackson, Aaron; Xie, Ming; Wu, Dianxing; Tsai, Wen-Chieh; Zhang, Sheng

2016-08-01

The flower is the most important biological structure for ensuring angiosperm reproductive success. Not only does the flower contain critical reproductive organs, but the wide variation in morphology, color, and scent has evolved to entice specialized pollinators, and arguably mankind in many cases, to ensure the successful propagation of its species. Recent proteomic approaches have identified protein candidates related to these flower traits, which has shed light on a number of previously unknown mechanisms underlying these traits. This review article provides a comprehensive overview of the latest advances in proteomic research in floral biology according to the order of flower structure, from corolla to male and female reproductive organs. It summarizes mainstream proteomic methods for plant research and recent improvements on two dimensional gel electrophoresis and gel-free workflows for both peptide level and protein level analysis. The recent advances in sequencing technologies provide a new paradigm for the ever-increasing genome and transcriptome information on many organisms. It is now possible to integrate genomic and transcriptomic data with proteomic results for large-scale protein characterization, so that a global understanding of the complex molecular networks in flower biology can be readily achieved. This article is part of a Special Issue entitled: Plant Proteomics--a bridge between fundamental processes and crop production, edited by Dr. Hans-Peter Mock. Copyright © 2016 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Wrighton, Kelly C.; Castelle, Cindy; Wilkins, Michael J.

Fermentation-based metabolism is an important ecosystem function often associated with environments rich in organic carbon, such as wetlands, sewage sludge, and the mammalian gut. The diversity of microorganisms and pathways involved in carbon and hydrogen cycling in sediments and aquifers and the impacts of these processes on other biogeochemical cycles remain poorly understood. Here we used metagenomics and proteomics to characterize microbial communities sampled from an aquifer adjacent to the Colorado River at Rifle, Colorado, USA, and document interlinked microbial roles in geochemical cycling. The organic carbon content in the aquifer was elevated via two acetate-based biostimulation treatments. Samples were collected at three time points, with the objective of extensive genome recovery to enable metabolic reconstruction of the community. Fermentative community members include genomes from a new phylum (ACD20), phylogenetically novel members of the Chloroflexi and Bacteroidetes, as well as candidate phyla genomes (OD1, BD1-5, SR1, WWE3, ACD58, TM6, PER, and OP11). These organisms have the capacity to produce hydrogen, acetate, formate, ethanol, butyrate, and lactate, activities supported by proteomic data. The diversity and expression of hydrogenases suggests the importance of hydrogen currency in the subsurface. Our proteogenomic data further indicate the consumption of fermentation intermediates by Proteobacteria can be coupled to nitrate, sulfate, and iron reduction. Thus, fermentation carried out by previously unstudied members of sediment microbial communities may be an important driver of diverse subsurface biogeochemical cycles.
TCGA-assembler 2: software pipeline for retrieval and processing of TCGA/CPTAC data.

PubMed

Wei, Lin; Jin, Zhilin; Yang, Shengjie; Xu, Yanxun; Zhu, Yitan; Ji, Yuan

2018-05-01

The Cancer Genome Atlas (TCGA) program has produced huge amounts of cancer genomics data providing unprecedented opportunities for research. In 2014, we developed TCGA-Assembler, a software pipeline for retrieval and processing of public TCGA data. In 2016, TCGA data were transferred from the TCGA data portal to the Genomic Data Commons (GDC), which is supported by a different set of data storage and retrieval mechanisms. In addition, new proteomics data of TCGA samples have been generated by the Clinical Proteomic Tumor Analysis Consortium (CPTAC) program, which were not available for downloading through TCGA-Assembler. It is desirable to acquire and integrate data from both GDC and CPTAC. We develop TCGA-assembler 2 (TA2) to automatically download and integrate data from GDC and CPTAC. We make substantial improvements to the functionality of TA2 to enhance user experience and software performance. TA2 together with its previous version have helped more than 2000 researchers from 64 countries to access and utilize TCGA and CPTAC data in their research. Availability of TA2 will continue to allow existing and new users to conduct reproducible research based on TCGA and CPTAC data. Availability: http://www.compgenome.org/TCGA-Assembler/ or https://github.com/compgenome365/TCGA-Assembler-2. Contact: zhuyitan@gmail.com or koaeraser@gmail.com. Supplementary data are available at Bioinformatics online.
High-throughput molecular analysis in lung cancer: insights into biology and potential clinical applications.

PubMed

Ocak, S; Sos, M L; Thomas, R K; Massion, P P

2009-08-01

During the last decade, high-throughput technologies including genomic, epigenomic, transcriptomic and proteomic have been applied to further our understanding of the molecular pathogenesis of lung cancer, a heterogeneous disease, and to develop strategies that aim to improve the management of patients. Ultimately, these approaches should lead to sensitive, specific and noninvasive methods for early diagnosis, and facilitate the prediction of response to therapy and outcome, as well as the identification of potential novel therapeutic targets. Genomic studies were the first to move this field forward by providing novel insights into the molecular biology of lung cancer and by generating candidate biomarkers of disease progression. Lung carcinogenesis is driven by genetic and epigenetic alterations that cause aberrant gene function; however, the challenge remains to pinpoint the key regulatory control mechanisms and to distinguish driver from passenger alterations that may have a small but additive effect on cancer development. Epigenetic regulation by DNA methylation and histone modifications modulates chromatin structure and, in turn, either activates or silences gene expression. Proteomic approaches critically complement these molecular studies, as the phenotype of a cancer cell is determined by proteins and cannot be predicted by genomics or transcriptomics alone. The present article focuses on the technological platforms available and some proposed clinical applications. We illustrate herein how the "-omics" have revolutionised our approach to lung cancer biology and hold promise for personalised management of lung cancer.
Integrative genomic and proteomic profiling of human neuroblastoma SH-SY5Y cells reveals signatures of endosulfan exposure.

PubMed

Gandhi, Deepa; Tarale, Prashant; Naoghare, Pravin K; Bafana, Amit; Kannan, Krishnamurthi; Sivanesan, Saravanadevi

2016-01-01

Endosulfan, an organochlorine pesticide, is known to induce multiple disorders/abnormalities including neuro-degenerative disorders in many animal species. However, the molecular mechanism of endosulfan-induced neuronal alterations is still not well understood. In the present study, the effect of a sub-lethal concentration of endosulfan (3 μM) on human neuroblastoma cells (SH-SY5Y) was investigated using genomic and proteomic approaches. Microarray and 2D-PAGE followed by MALDI-TOF-MS analysis revealed differential expression of 831 transcripts and 16 proteins in exposed cells. A gene ontology enrichment analysis revealed that the differentially expressed genes and proteins were involved in a variety of cellular events such as neuronal developmental pathway, immune response, cell differentiation, apoptosis, transmission of nerve impulse, axonogenesis, etc. The present study attempted to explore the possible molecular mechanism of endosulfan-induced neuronal alterations in SH-SY5Y cells using an integrated genomic and proteomic approach. Based on the gene and protein profiles, possible mechanisms underlying endosulfan neurotoxicity were predicted. Copyright © 2015 Elsevier B.V. All rights reserved.
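The abstract does not state which test the gene ontology enrichment analysis used. A common choice, assumed here purely for illustration, is the one-sided hypergeometric test: given a background of genes, some of which carry a GO term, it asks how likely it is to see at least the observed number of term-carrying genes in the differentially expressed set. The function and parameter names are hypothetical.

```python
from math import comb  # available in Python 3.8+


def hypergeom_enrichment_p(population, annotated, selected, hits):
    """P(X >= hits) for X ~ Hypergeometric(population, annotated, selected).

    population: number of background genes
    annotated:  background genes carrying the GO term
    selected:   differentially expressed genes drawn from the background
    hits:       differentially expressed genes carrying the GO term
    """
    if not (0 <= annotated <= population and 0 <= selected <= population):
        raise ValueError("annotated and selected must lie within the population")
    if not 0 <= hits <= min(annotated, selected):
        raise ValueError("hits cannot exceed annotated or selected counts")
    total = comb(population, selected)
    upper = min(annotated, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(hits, upper + 1)
    )
    return tail / total
```

In practice the resulting p-values would also be corrected for multiple testing across GO terms, a step this sketch leaves out.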
SWI/SNF Chromatin-remodeling Factors: Multiscale Analyses and Diverse Functions*

PubMed Central

Euskirchen, Ghia; Auerbach, Raymond K.; Snyder, Michael

2012-01-01

Chromatin-remodeling enzymes play essential roles in many biological processes, including gene expression, DNA replication and repair, and cell division. Although one such complex, SWI/SNF, has been extensively studied, new discoveries are still being made. Here, we review SWI/SNF biochemistry; highlight recent genomic and proteomic advances; and address the role of SWI/SNF in human diseases, including cancer and viral infections. These studies have greatly increased our understanding of complex nuclear processes. PMID:22952240
Proteomic Analysis Reveals Resistance Mechanism Against Chlorpyrifos in Frankliniella occidentalis (Thysanoptera: Thripidae).

PubMed

Yan, Dan-Kan; Hu, Min; Tang, Yun-Xia; Fan, Jia-Qin

2015-08-01

The western flower thrips is an economically important worldwide pest of many crops, and chlorpyrifos has been used to control western flower thrips for many years. To develop a better resistance-management strategy, a chlorpyrifos-resistant strain of western flower thrips (WFT-chl) was selected in the laboratory. More than 39-fold resistance was achieved after selection with chlorpyrifos for 19 generations in comparison with the susceptible strain (WFT-S). The proteomes of western flower thrips (WFT-S and WFT-chl) were investigated using a quantitative proteomics approach with isobaric tag for relative and absolute quantification technique and liquid chromatography-tandem mass spectrometry technologies. According to the functional analysis, 773 proteins identified were grouped into 10 categories of molecular functions and 706 proteins were presented in 213 kinds of pathways. Comparing the proteome of WFT-chl with that of WFT-S, a total of eight proteins were found up-regulated and three down-regulated. The results from functional annotation and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses indicated that the differentially expressed protein functions in binding, catalyzing, transporting, and enzyme regulation were most important in resistance development. A list of proteins functioning in biological processes of metabolism, biological regulation, and response to stimulus was found in WFT-chl, suggesting that they are possibly the major components of the resistance mechanism to chlorpyrifos in western flower thrips. Notably, several novel potential resistance-related proteins were identified such as ribosomal protein, Vg (vitellogenin), and MACT (muscle actin), which can be used to improve our understanding of the resistance mechanisms in western flower thrips. This study provided the first comprehensive view of the complicated resistance mechanism employed by WFT-S and WFT-chl through the isobaric tag for relative and absolute quantification coupled with liquid chromatography-tandem mass spectrometry technologies. © The Authors 2015. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
The genome, transcriptome, and proteome of the nematode Steinernema carpocapsae: evolutionary signatures of a pathogenic lifestyle

PubMed Central

Rougon-Cardoso, Alejandra; Flores-Ponce, Mitzi; Ramos-Aboites, Hilda Eréndira; Martínez-Guerrero, Christian Eduardo; Hao, You-Jin; Cunha, Luis; Rodríguez-Martínez, Jonathan Alejandro; Ovando-Vázquez, Cesaré; Bermúdez-Barrientos, José Roberto; Abreu-Goodger, Cei; Chavarría-Hernández, Norberto; Simões, Nelson; Montiel, Rafael

2016-01-01

The entomopathogenic nematode Steinernema carpocapsae has been widely used for the biological control of insect pests. It shares a symbiotic relationship with the bacterium Xenorhabdus nematophila, and is emerging as a genetic model to study symbiosis and pathogenesis. We obtained a high-quality draft of the nematode’s genome comprising 84,613,633 bp in 347 scaffolds, with an N50 of 1.24 Mb. To improve annotation, we sequenced both short and long RNA and conducted shotgun proteomic analyses. S. carpocapsae shares orthologous genes with other parasitic nematodes that are absent in the free-living nematode C. elegans, it has ncRNA families that are enriched in parasites, and expresses proteins putatively associated with parasitism and pathogenesis, suggesting an active role for the nematode during the pathogenic process. Host and parasites might engage in a co-evolutionary arms-race dynamic with genes participating in their interaction showing signatures of positive selection. Our analyses indicate that the consequence of this arms race is better characterized by positive selection altering specific functions instead of just increasing the number of positively selected genes, adding a new perspective to these co-evolutionary theories. We identified a protein, ATAD-3, that suggests a relevant role for mitochondrial function in the evolution and mechanisms of nematode parasitism. PMID:27876851
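The assembly statistic quoted above, N50, is the length of the scaffold at which the running total of scaffold lengths, taken from longest to shortest, first reaches half of the assembly size. A minimal sketch of that calculation follows; the function name is illustrative and the example lengths in the usage note are toy values, not the S. carpocapsae scaffolds.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
```

For example, for toy lengths 2 through 10 (total 54), the three longest scaffolds (10, 9, 8) sum to 27, which is half the total, so the N50 is 8.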
Characterization of Breast Cancer Interstitial Fluids by TmT Labeling, LTQ-Orbitrap Velos Mass Spectrometry and Pathway Analysis

PubMed Central

Raso, Cinzia; Cosentino, Carlo; Gaspari, Marco; Malara, Natalia; Han, Xuemei; McClatchy, Daniel; Park, Sung Kyu; Renne, Maria; Vadalà, Nuria; Prati, Ubaldo; Cuda, Giovanni; Mollace, Vincenzo; Amato, Francesco; Yates, John R.

2012-01-01

Cancer is currently considered as the end point of numerous genomic and epigenomic mutations and as the result of the interaction of transformed cells within the stromal microenvironment. The present work focuses on breast cancer, one of the most common malignancies affecting the female population in industrialized countries. In this study we perform a proteomic analysis of bioptic samples from human breast cancer, namely interstitial fluids and primary cells, normal vs disease tissues, using Tandem mass Tags (TmT) quantitative mass spectrometry combined with the MudPIT technique. To the best of our knowledge this work, with over 1700 proteins identified, represents the most comprehensive characterization of the breast cancer interstitial fluid proteome to date. Network analysis was used to identify functionally active networks in the breast cancer associated samples. From the list of differentially expressed genes we have retrieved the associated functional interaction networks. Many different signaling pathways were found activated, strongly linked to invasion, metastasis development, proliferation and with a significant cross-talking rate. This pilot study presents evidence that the proposed quantitative proteomic approach can be applied to discriminate between normal and tumoral samples and for the discovery of yet unknown carcinogenesis mechanisms and therapeutic strategies. PMID:22563702
Analysis of the Human Prostate-Specific Proteome Defined by Transcriptomics and Antibody-Based Profiling Identifies TMEM79 and ACOXL as Two Putative, Diagnostic Markers in Prostate Cancer

PubMed Central

O'Hurley, Gillian; Busch, Christer; Fagerberg, Linn; Hallström, Björn M.; Stadler, Charlotte; Tolf, Anna; Lundberg, Emma; Schwenk, Jochen M.; Jirström, Karin; Bjartell, Anders; Gallagher, William M.; Uhlén, Mathias; Pontén, Fredrik

2015-01-01

To better understand prostate function and disease, it is important to define and explore the molecular constituents that signify the prostate gland. The aim of this study was to define the prostate specific transcriptome and proteome, in comparison to 26 other human tissues. Deep sequencing of mRNA (RNA-seq) and immunohistochemistry-based protein profiling were combined to identify prostate specific gene expression patterns and to explore tissue biomarkers for potential clinical use in prostate cancer diagnostics. We identified 203 genes with elevated expression in the prostate, 22 of which showed more than five-fold higher expression levels compared to all other tissue types. In addition to previously well-known proteins we identified two poorly characterized proteins, TMEM79 and ACOXL, with potential to differentiate between benign and cancerous prostatic glands in tissue biopsies. In conclusion, we have applied a genome-wide analysis to identify the prostate specific proteome using transcriptomics and antibody-based protein profiling to identify genes with elevated expression in the prostate. Our data provides a starting point for further functional studies to explore the molecular repertoire of normal and diseased prostate including potential prostate cancer markers such as TMEM79 and ACOXL. PMID:26237329
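The "more than five-fold higher expression levels compared to all other tissue types" criterion can be sketched as a simple comparison against the highest value among the remaining tissues. This is an illustration under stated assumptions, not the authors' pipeline: the function name and the toy expression values are hypothetical, and the expression units are assumed only to be consistent across tissues.

```python
def is_tissue_enriched(expression, tissue, fold=5.0):
    """True if `tissue` exceeds `fold` times the value in every other tissue.

    expression: mapping of tissue name to expression level for one gene.
    """
    if tissue not in expression:
        raise KeyError(f"no expression value for {tissue!r}")
    others = [level for name, level in expression.items() if name != tissue]
    if not others:
        raise ValueError("at least one other tissue is needed for comparison")
    return expression[tissue] > fold * max(others)


# Toy expression levels for two hypothetical genes (illustrative only).
gene_x = {"prostate": 120.0, "liver": 10.0, "kidney": 20.0}
gene_y = {"prostate": 60.0, "liver": 10.0, "kidney": 20.0}
```

Here gene_x passes the cut-off (120 exceeds 5 × 20) while gene_y does not (60 is below 100). Comparing against the maximum, rather than the mean, of the other tissues is what makes the criterion hold for "all other tissue types".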

iTRAQ-Based Proteomics Analysis and Network Integration for Kernel Tissue Development in Maize

PubMed Central

Dong, Yongbin; Wang, Qilei; Du, Chunguang; Xiong, Wenwei; Li, Xinyu; Zhu, Sailan; Li, Yuling

2017-01-01

Grain weight is one of the most important yield components in maize (Zea mays L.), and the grain is a developmentally complex structure comprising two major compartments (endosperm and pericarp); however, very little is known concerning the coordinated accumulation of the numerous proteins involved. Herein, we used an isobaric tags for relative and absolute quantitation (iTRAQ)-based comparative proteomic method to analyze the dynamic proteomes of endosperm and pericarp during grain development. In total, 9539 proteins were identified for both components at four development stages, among which 1401 proteins were non-redundant, 232 proteins were specific to pericarp and 153 proteins were specific to endosperm. A functional annotation of the identified proteins revealed the importance of metabolic and cellular processes, and binding and catalytic activities for the tissue development. Three and 76 proteins involved in 49 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were integrated for the specific endosperm and pericarp proteins, respectively, reflecting their complex metabolic interactions. In addition, four proteins with important functions and different expression levels were chosen for gene cloning and expression analysis. As in previous research, the concordance between mRNA level and protein abundance varied across proteins, stages, and tissues. These results could provide useful information for understanding the developmental mechanisms of grain development in maize. PMID:28837076
Identification of proteins capable of metal reduction from the proteome of the Gram-positive bacterium Desulfotomaculum reducens MI-1 using an NADH-based activity assay

DOE Office of Scientific and Technical Information (OSTI.GOV)

Otwell, Annie E.; Sherwood, Roberts; Zhang, Sheng

Metal reduction capability has been found in numerous species of environmentally abundant Gram-positive bacteria. However, understanding of microbial metal reduction is based almost solely on studies of Gram-negative organisms. In this study, we focus on Desulfotomaculum reducens MI-1, a Gram-positive metal reducer whose genome lacks genes with similarity to any characterized metal reductase. D. reducens has been shown to reduce not only Fe(III), but also the environmentally important contaminants U(VI) and Cr(VI). By extracting, separating, and analyzing the functional proteome of D. reducens, using a ferrozine-based assay in order to screen for chelated Fe(III)-NTA reduction with NADH as electron donor, we have identified proteins not previously characterized as iron reductases. Their function was confirmed by heterologous expression in E. coli. These are the protein NADH:flavin oxidoreductase (Dred_2421) and a protein complex composed of oxidoreductase FAD/NAD(P)-binding subunit (Dred_1685) and dihydroorotate dehydrogenase 1B (Dred_1686). Dred_2421 was identified in the soluble proteome and is predicted to be a cytoplasmic protein. Dred_1685 and Dred_1686 were identified in both the soluble as well as the insoluble (presumably membrane) protein fraction, suggesting a type of membrane-association, although PSORTb predicts both proteins are cytoplasmic. Furthermore, we show that these proteins have the capability to reduce soluble Cr(VI) and U(VI) with NADH as electron donor. This study is the first functional proteomic analysis of D. reducens, and one of the first analyses of metal and radionuclide reduction in an environmentally relevant Gram-positive bacterium.
Unexplored therapeutic opportunities in the human genome.

PubMed

Oprea, Tudor I; Bologa, Cristian G; Brunak, Søren; Campbell, Allen; Gan, Gregory N; Gaulton, Anna; Gomez, Shawn M; Guha, Rajarshi; Hersey, Anne; Holmes, Jayme; Jadhav, Ajit; Jensen, Lars Juhl; Johnson, Gary L; Karlson, Anneli; Leach, Andrew R; Ma'ayan, Avi; Malovannaya, Anna; Mani, Subramani; Mathias, Stephen L; McManus, Michael T; Meehan, Terrence F; von Mering, Christian; Muthas, Daniel; Nguyen, Dac-Trung; Overington, John P; Papadatos, George; Qin, Jun; Reich, Christian; Roth, Bryan L; Schürer, Stephan C; Simeonov, Anton; Sklar, Larry A; Southall, Noel; Tomita, Susumu; Tudose, Ilinca; Ursu, Oleg; Vidovic, Dušica; Waller, Anna; Westergaard, David; Yang, Jeremy J; Zahoránszky-Köhalmi, Gergely

2018-05-01

A large proportion of biomedical research and the development of therapeutics is focused on a small fraction of the human genome. In a strategic effort to map the knowledge gaps around proteins encoded by the human genome and to promote the exploration of currently understudied, but potentially druggable, proteins, the US National Institutes of Health launched the Illuminating the Druggable Genome (IDG) initiative in 2014. In this article, we discuss how the systematic collection and processing of a wide array of genomic, proteomic, chemical and disease-related resource data by the IDG Knowledge Management Center have enabled the development of evidence-based criteria for tracking the target development level (TDL) of human proteins, which indicates a substantial knowledge deficit for approximately one out of three proteins in the human proteome. We then present spotlights on the TDL categories as well as key drug target classes, including G protein-coupled receptors, protein kinases and ion channels, which illustrate the nature of the unexplored opportunities for biomedical research and therapeutic development.
Phylogenomics of plant genomes: a methodology for genome-wide searches for orthologs in plants

PubMed Central

Conte, Matthieu G; Gaillard, Sylvain; Droc, Gaetan; Perin, Christophe

2008-01-01

Background: Gene ortholog identification is now a major objective for mining the increasing amount of sequence data generated by complete or partial genome sequencing projects. Comparative and functional genomics urgently need a method for ortholog detection to reduce gene function inference and to aid in the identification of conserved or divergent genetic pathways between several species. As gene functions change during evolution, reconstructing the evolutionary history of genes should be a more accurate way to differentiate orthologs from paralogs. Phylogenomics takes into account phylogenetic information from high-throughput genome annotation and is the most straightforward way to infer orthologs. However, procedures for automatic detection of orthologs are still scarce and suffer from several limitations. Results: We developed a procedure for ortholog prediction between Oryza sativa and Arabidopsis thaliana. Firstly, we established an efficient method to cluster A. thaliana and O. sativa full proteomes into gene families. Then, we developed an optimized phylogenomics pipeline for ortholog inference. We validated the full procedure using test sets of orthologs and paralogs to demonstrate that our method outperforms pairwise methods for ortholog predictions. Conclusion: Our procedure achieved a high level of accuracy in predicting ortholog and paralog relationships. Phylogenomic predictions for all validated gene families in both species were easily achieved and we can conclude that our methodology outperforms similarly based methods. PMID:18426584
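The abstract compares the phylogenomic procedure against "pairwise methods" without naming one. A widely used pairwise approach is the reciprocal best hit (RBH), sketched below as an assumed example of that baseline; it is not the authors' pipeline. The gene identifiers and similarity scores are hypothetical.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring subject gene.

    scores: mapping of (query, subject) pairs to a similarity score.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical similarity scores between two A. thaliana and two O. sativa genes.
at_vs_os = {("At1", "Os1"): 200, ("At1", "Os2"): 150, ("At2", "Os2"): 90}
os_vs_at = {("Os1", "At1"): 210, ("Os2", "At1"): 160, ("Os2", "At2"): 80}
```

In this toy case only (At1, Os1) is returned: At2's best hit is Os2, but Os2's best hit is At1, so the pair is dropped. Cases like this, where gene duplication blurs one-to-one relationships, are where tree-based inference is expected to help.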
Protein Charge and Mass Contribute to the Spatio-temporal Dynamics of Protein-Protein Interactions in a Minimal Proteome

PubMed Central

Xu, Yu; Wang, Hong; Nussinov, Ruth; Ma, Buyong

2013-01-01

We constructed and simulated a ‘minimal proteome’ model using Langevin dynamics. It contains 206 essential protein types which were compiled from the literature. For comparison, we generated six proteomes with randomized concentrations. We found that the net charges and molecular weights of the proteins in the minimal genome are not random. The net charge of a protein decreases linearly with molecular weight, with small proteins being mostly positively charged and large proteins negatively charged. The protein copy numbers in the minimal genome have the tendency to maximize the number of protein-protein interactions in the network. Negatively charged proteins, which tend to have larger sizes, can provide a large collision cross-section allowing them to interact with other proteins; on the other hand, the smaller positively charged proteins could have higher diffusion speed and are more likely to collide with other proteins. Proteomes with random charge/mass populations form less stable clusters than those with experimental protein copy numbers. Our study suggests that ‘proper’ populations of negatively and positively charged proteins are important for maintaining a protein-protein interaction network in a proteome. It is interesting to note that the minimal genome model based on the charge and mass of E. coli may have a larger protein-protein interaction network than that based on the lower organism M. pneumoniae. PMID:23420643
EFICAz2.5: application of a high-precision enzyme function predictor to 396 proteomes.

PubMed

Kumar, Narendra; Skolnick, Jeffrey

2012-10-15

High-quality enzyme function annotation is essential for understanding the biochemistry, metabolism and disease processes of organisms. Previously, we developed a multi-component high-precision enzyme function predictor, EFICAz(2) (enzyme function inference by a combined approach). Here, we present an updated improved version, EFICAz(2.5), that is trained on a significantly larger data set of enzyme sequences and PROSITE patterns. We also present the results of the application of EFICAz(2.5) to the enzyme reannotation of 396 genomes cataloged in the ENSEMBL database. The EFICAz(2.5) server and database are freely available with a user-friendly interface at http://cssb.biology.gatech.edu/EFICAz2.5.
Metalloproteomics: Forward and Reverse Approaches in Metalloprotein Structural and Functional Characterization

PubMed Central

Shi, Wuxian; Chance, Mark R.

2010-01-01

About one-third of all proteins are associated with a metal. Metalloproteomics is defined as the structural and functional characterization of metalloproteins on a genome-wide scale. The methodologies utilized in metalloproteomics, including both forward (bottom-up) and reverse (top-down) technologies, to provide information on the identity, quantity and function of metalloproteins are discussed. Important techniques frequently employed in metalloproteomics include classical proteomics tools such as mass spectrometry and 2-D gels, immobilized-metal affinity chromatography, bioinformatics sequence analysis and homology modeling, X-ray absorption spectroscopy and other synchrotron radiation based tools. Combinative applications of these techniques provide a powerful approach to understand the function of metalloproteins. PMID:21130021
Common bean proteomics: Present status and future strategies.

PubMed

Zargar, Sajad Majeed; Mahajan, Reetika; Nazir, Muslima; Nagar, Preeti; Kim, Sun Tae; Rai, Vandna; Masi, Antonio; Ahmad, Syed Mudasir; Shah, Riaz Ahmad; Ganai, Nazir Ahmad; Agrawal, Ganesh K; Rakwal, Randeep

2017-10-03

Common bean (Phaseolus vulgaris L.) is a legume of appreciable importance and usefulness worldwide to the human population providing food and feed. It is rich in high-quality protein, energy, fiber and micronutrients especially iron, zinc, and pro-vitamin A; and possesses potentially disease-preventing and health-promoting compounds. The recently published genome sequence of common bean is an important landmark in common bean research, opening new avenues for understanding its genetics in depth. This legume crop is affected by diverse biotic and abiotic stresses severely limiting its productivity. Looking at the trend of increasing world population and the need for food crops best suited to the health of humankind, the legumes will be in great demand, including the common bean mostly for its nutritive values. Hence the need for new research in understanding the biology of this crop brings us to utilize and apply high-throughput omics approaches. In this mini-review our focus will be on the need for proteomics studies in common bean, potential of proteomics for understanding genetic regulation under abiotic and biotic stresses and how proteogenomics will lead to nutritional improvement. We will also discuss future proteomics-based strategies that must be adopted to mine new genomic resources by identifying molecular switches regulating various biological processes. Common bean is regarded as "grain of hope" for the poor, being rich in high-quality protein, energy, fiber and micronutrients (iron, zinc, pro-vitamin A); and possesses potentially disease-preventing and health-promoting compounds. Increasing world population and the need for food crops best suited to the health of humankind, puts legumes into great demand, which includes the common bean mostly. An important landmark in common bean research was the recent publication of its genome sequence, opening new avenues for understanding its genetics in depth. 
This legume crop is affected by diverse biotic and abiotic stresses severely limiting its productivity. Therefore, the need for new research in understanding the biology of this crop brings us to utilize and apply high-throughput omics approaches. Proteomics can be used to track all the candidate proteins/genes responsible for a biological process under specific conditions in a particular tissue. The potential of proteomics will not only help in determining the functions of a large number of genes in a single experiment but will also be a useful tool to mine new genes that can provide solution to various problems (abiotic stress, biotic stress, nutritional improvement, etc). We believe that a combined approach including breeding along with omics tools will lead towards attaining sustainability in legumes, including common bean. Copyright © 2017 Elsevier B.V. All rights reserved.
GAP Final Technical Report 12-14-04

DOE Office of Scientific and Technical Information (OSTI.GOV)

Andrew J. Bordner, PhD, Senior Research Scientist

2004-12-14

The Genomics Annotation Platform (GAP) was designed to develop new tools for high throughput functional annotation and characterization of protein sequences and structures resulting from genomics and structural proteomics, and to benchmark and apply those tools. Furthermore, this platform integrated the genomic scale sequence and structural analysis and prediction tools with the advanced structure prediction and bioinformatics environment of ICM. The development of GAP was primarily oriented towards the annotation of new biomolecular structures using both structural and sequence data. Even though the amount of protein X-ray crystal data is growing exponentially, the volume of sequence data is growing even more rapidly. This trend was exploited by leveraging the wealth of sequence data to provide functional annotation for protein structures. The additional information provided by GAP is expected to assist the majority of the commercial users of ICM, who are involved in drug discovery, in identifying promising drug targets as well as in devising strategies for the rational design of therapeutics directed at the protein of interest. The GAP also provided valuable tools for biochemistry education and structural genomics centers. In addition, GAP incorporates many novel prediction and analysis methods not available in other molecular modeling packages. This development led to signing the first Molsoft agreement in the structural genomics annotation area with the University of Oxford Structural Genomics Center. This commercial agreement validated the Molsoft efforts under the GAP project and provided the basis for further development of the large scale functional annotation platform.
A genomic view of food-related and probiotic Enterococcus strains

PubMed Central

Suárez, Nadia; Hormigo, Ricardo; Fadda, Silvina; Saavedra, Lucila

2017-01-01

Abstract The study of enterococcal genomes has grown considerably in recent years. While special attention is paid to comparative genomic analysis among clinically relevant isolates, in this study we performed an exhaustive comparative analysis of enterococcal genomes of food origin and/or with potential to be used as probiotics. Beyond common genetic features, we especially aimed to identify those that are specific to enterococcal strains isolated from a certain food-related source as well as features present in a species-specific manner. Thus, the genome sequences of 25 Enterococcus strains, from 7 different species, were examined and compared. Their phylogenetic relationship was reconstructed based on orthologous proteins and whole genomes. Likewise, markers associated with a successful colonization (bacteriocin genes and genomic islands) and genome plasticity (phages and clustered regularly interspaced short palindromic repeats) were investigated for lifestyle specific genetic features. At the same time, a search for antibiotic resistance genes was carried out, since they are of great concern in the food industry. Finally, it was possible to locate 1617 FIGfam families as a core proteome universally present among the genera and to determine that most of the accessory genes code for hypothetical proteins, providing reasonable hints to support their functional characterization. PMID:27773878
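The core/accessory split mentioned in the abstract above can be illustrated with a minimal Python sketch. It assumes each genome has already been reduced to a set of protein-family identifiers; the strain names and family IDs are hypothetical and not taken from the study, and the study's actual pipeline is not described here.

```python
from functools import reduce

def core_and_accessory(families_by_strain):
    """Split protein families into core (present in every strain)
    and accessory (present in at least one strain, but not all)."""
    sets = [set(fams) for fams in families_by_strain.values()]
    if not sets:
        return set(), set()
    core = reduce(set.intersection, sets)   # families shared by all strains
    pan = reduce(set.union, sets)           # families seen in any strain
    return core, pan - core

# Hypothetical toy input: three strains, five made-up family IDs.
strains = {
    "strain_A": {"FAM001", "FAM002", "FAM003"},
    "strain_B": {"FAM001", "FAM002", "FAM004"},
    "strain_C": {"FAM001", "FAM002", "FAM005"},
}
core, accessory = core_and_accessory(strains)
```

In this toy case the core holds the two families common to all three strains, and each strain contributes one accessory family.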
Rodriguez and Pennington Address Proteogenomics and Data Sharing in the Journal Cell | Office of Cancer Clinical Proteomics Research

Cancer.gov

Precision medicine is an approach that allows doctors to understand how a patient's genetic profile may cause cancer to grow and spread, leading to a more personalized treatment strategy based on molecular characterization of a person's tumor. However, precision medicine as a genomics-based approach does not yet apply to all patients because genetic mutations do not always lead to changes of the corresponding proteins. Therefore, integrating genomics and proteomics data, or proteogenomics, presents as a new approach that may help make precision medicine a more effective treatment option for patients.
GFam: a platform for automatic annotation of gene families.

PubMed

Sasidharan, Rajkumar; Nepusz, Tamás; Swarbreck, David; Huala, Eva; Paccanaro, Alberto

2012-10-01

We have developed GFam, a platform for automatic annotation of gene/protein families. GFam provides a framework for genome initiatives and model organism resources to build domain-based families, derive meaningful functional labels and offers a seamless approach to propagate functional annotation across periodic genome updates. GFam is a hybrid approach that uses a greedy algorithm to chain component domains from InterPro annotation provided by its 12 member resources followed by a sequence-based connected component analysis of un-annotated sequence regions to derive consensus domain architecture for each sequence and subsequently generate families based on common architectures. Our integrated approach increases sequence coverage by 7.2 percentage points and residue coverage by 14.6 percentage points relative to the best single-constituent database within InterPro for the proteome of Arabidopsis. The true power of GFam lies in maximizing annotation provided by the different InterPro data sources that offer resource-specific coverage for different regions of a sequence. GFam's capability to capture higher sequence and residue coverage can be useful for genome annotation, comparative genomics and functional studies. GFam is general-purpose software and can be used for any collection of protein sequences. The software is open source and can be obtained from http://www.paccanarolab.org/software/gfam/.
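To make the idea of greedily chaining domain hits into an architecture concrete, here is a small Python sketch. It is not GFam's implementation: the abstract does not state the ranking criterion or how overlaps are resolved, so ranking by hit length and rejecting any overlap are assumptions, and the domain names are hypothetical.

```python
from typing import NamedTuple

class Hit(NamedTuple):
    domain: str  # hypothetical domain label
    start: int   # 1-based, inclusive
    end: int     # inclusive

def greedy_architecture(hits):
    """Accept hits longest-first, skipping any hit that overlaps one
    already accepted; return the accepted hits ordered along the sequence."""
    chosen = []
    for hit in sorted(hits, key=lambda h: (-(h.end - h.start + 1), h.start)):
        if all(hit.end < c.start or hit.start > c.end for c in chosen):
            chosen.append(hit)
    return sorted(chosen, key=lambda h: h.start)

def residue_coverage(chosen, seq_len):
    """Fraction of residues covered by the (non-overlapping) chosen hits."""
    return sum(h.end - h.start + 1 for h in chosen) / seq_len

# Hypothetical hits on a 250-residue protein.
hits = [Hit("DOM_A", 10, 120), Hit("DOM_B", 100, 150), Hit("DOM_C", 130, 200)]
arch = greedy_architecture(hits)
coverage = residue_coverage(arch, 250)
```

Here DOM_B is dropped because it overlaps the longer DOM_A, which leaves the architecture DOM_A, DOM_C. Sequences sharing the same ordered list of domains would then fall into one family.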
A genome-wide structure-based survey of nucleotide binding proteins in M. tuberculosis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bhagavat, Raghu; Kim, Heung -Bok; Kim, Chang -Yub

Nucleoside tri-phosphates (NTP) form an important class of small molecule ligands that participate in, and are essential to a large number of biological processes. Here, we seek to identify the NTP binding proteome (NTPome) in M. tuberculosis (M.tb), a deadly pathogen. Identifying the NTPome is useful not only for gaining functional insights of the individual proteins but also for identifying useful drug targets. From an earlier study, we had structural models of M.tb at a proteome scale from which a set of 13,858 small molecule binding pockets were identified. We use a set of NTP binding sub-structural motifs derived from a previous study and scan the M.tb pocketome, and find that 1,768 proteins or 43% of the proteome can theoretically bind NTP ligands. Using an experimental proteomics approach involving dye-ligand affinity chromatography, we confirm NTP binding to 47 different proteins, of which 4 are hypothetical proteins. Our analysis also provides the precise list of binding site residues in each case, and the probable ligand binding pose. In conclusion, as the list includes a number of known and potential drug targets, the identification of NTP binding can directly facilitate structure-based drug design of these targets.
Proteomic profiling of developing cotton fibers from wild and domesticated Gossypium barbadense.

PubMed

Hu, Guanjing; Koh, Jin; Yoo, Mi-Jeong; Grupp, Kara; Chen, Sixue; Wendel, Jonathan F

2013-10-01

Pima cotton (Gossypium barbadense) is widely cultivated because of its long, strong seed trichomes ('fibers') used for premium textiles. These agronomically advanced fibers were derived following domestication and thousands of years of human-mediated crop improvement. To gain an insight into fiber development and evolution, we conducted comparative proteomic and transcriptomic profiling of developing fiber from an elite cultivar and a wild accession. Analyses using isobaric tag for relative and absolute quantification (iTRAQ) LC-MS/MS technology identified 1317 proteins in fiber. Of these, 205 were differentially expressed across developmental stages, and 190 showed differential expression between wild and cultivated forms, 14.4% of the proteome sampled. Human selection may have shifted the timing of developmental modules, such that some occur earlier in domesticated than in wild cotton. A novel approach was used to detect possible biased expression of homoeologous copies of proteins. Results indicate a significant partitioning of duplicate gene expression at the protein level, but an approximately equal degree of bias for each of the two constituent genomes of allopolyploid cotton. Our results demonstrate the power of complementary transcriptomic and proteomic approaches for the study of the domestication process. They also provide a rich database for mining for functional analyses of cotton improvement or evolution. © 2013 The Authors. New Phytologist © 2013 New Phytologist Trust.
A genome-wide structure-based survey of nucleotide binding proteins in M. tuberculosis

DOE PAGES

Bhagavat, Raghu; Kim, Heung -Bok; Kim, Chang -Yub; ...

2017-10-02

Nucleoside tri-phosphates (NTP) form an important class of small molecule ligands that participate in, and are essential to a large number of biological processes. Here, we seek to identify the NTP binding proteome (NTPome) in M. tuberculosis (M.tb), a deadly pathogen. Identifying the NTPome is useful not only for gaining functional insights of the individual proteins but also for identifying useful drug targets. From an earlier study, we had structural models of M.tb at a proteome scale from which a set of 13,858 small molecule binding pockets were identified. We use a set of NTP binding sub-structural motifs derived from a previous study and scan the M.tb pocketome, and find that 1,768 proteins or 43% of the proteome can theoretically bind NTP ligands. Using an experimental proteomics approach involving dye-ligand affinity chromatography, we confirm NTP binding to 47 different proteins, of which 4 are hypothetical proteins. Our analysis also provides the precise list of binding site residues in each case, and the probable ligand binding pose. In conclusion, as the list includes a number of known and potential drug targets, the identification of NTP binding can directly facilitate structure-based drug design of these targets.
Identification of virulence determinants of the human pathogenic fungi Aspergillus fumigatus and Candida albicans by proteomics.

PubMed

Kniemeyer, Olaf; Schmidt, André D; Vödisch, Martin; Wartenberg, Dirk; Brakhage, Axel A

2011-06-01

Both fungi Candida albicans and Aspergillus fumigatus can cause a number of life-threatening systemic infections in humans. The commensal yeast C. albicans is one of the main causes of nosocomial fungal infectious diseases, whereas the filamentous fungus A. fumigatus has become one of the most prevalent airborne fungal pathogens. Early diagnosis of these fungal infections is challenging, only a limited number of antifungals for treatment are available, and the molecular details of pathogenicity are hardly understood. The completion of both the A. fumigatus and C. albicans genome sequence provides the opportunity to improve diagnosis, to define new drug targets, to understand the functions of many uncharacterised proteins, and to study protein regulation on a global scale. With the application of proteomic tools, particularly two-dimensional gel electrophoresis and LC/MS-based methods, a comprehensive overview about the proteins of A. fumigatus and C. albicans present or induced during environmental changes and stress conditions has been obtained in the past 5 years. However, for the discovery of further putative virulence determinants, more sensitive and targeted proteomic methods have to be applied. Here, we review the recent proteome data generated for A. fumigatus and C. albicans that are related to factors required for pathogenicity. Copyright © 2011 Elsevier GmbH. All rights reserved.
Proteomics to assess the role of phenotypic plasticity in aquatic organisms exposed to pollution and global warming.

PubMed

Silvestre, Frédéric; Gillardin, Virginie; Dorts, Jennifer

2012-11-01

Nowadays, the unprecedented rates of anthropogenic changes in ecosystems suggest that organisms have to migrate to new distributional ranges or to adapt commensurately quickly to new conditions to avoid becoming extinct. Pollution and global warming are two of the most important threats aquatic organisms will have to face in the near future. If genetic changes in a population in response to natural selection are extensively studied, the role of acclimation through phenotypic plasticity (the property of a given genotype to produce different phenotypes in response to particular environmental conditions) in a species to deal with new environmental conditions remains largely unknown. Proteomics is the extensive study of the protein complement of a genome. It is dynamic and depends on the specific tissue, developmental stage, and environmental conditions. As the final product of gene expression, it is subjected to several regulatory steps from gene transcription to the functional protein. Consequently, there is a discrepancy between the abundance of mRNA and the abundance of the corresponding protein. Moreover, proteomics is closer to physiology and gives a more functional knowledge of the regulation of gene expression than does transcriptomics. The study of protein-expression profiles, however, gives a better portrayal of the cellular phenotype and is considered as a key link between the genotype and the organismal phenotype. Under new environmental conditions, we can observe a shift of the protein-expression pattern defining a new cellular phenotype that can possibly improve the fitness of the organism. It is now necessary to define a proteomic norm of reaction for organisms acclimating to environmental stressors. Its link to fitness will give new insights into how organisms can evolve in a changing environment. 
The proteomic literature bearing on chronic exposure to pollutants and on acclimation to heat stress in aquatic organisms, as well as potential application of proteomics in evolutionary issues, are outlined. While the transcriptome responses are commonly investigated, proteomics approaches now need to be intensified, with the new perspective of integrating the cellular phenotype with the organismal phenotype and with the mechanisms of the regulation of gene expression, such as epigenetics.
P-MartCancer: A New Online Platform to Access CPTAC Datasets and Enable New Analyses | Office of Cancer Clinical Proteomics Research

Cancer.gov

The November 1, 2017 issue of Cancer Research is dedicated to a collection of computational resource papers in genomics, proteomics, animal models, imaging, and clinical subjects for non-bioinformaticists looking to incorporate computing tools into their work. Scientists at Pacific Northwest National Laboratory have developed P-MartCancer, an open, web-based interactive software tool that enables statistical analyses of peptide or protein data generated from mass-spectrometry (MS)-based global proteomics experiments.
VIDEO: Dr. Henry Rodriguez - Proteogenomics in Cancer Medicine | Office of Cancer Clinical Proteomics Research

Cancer.gov

Dr. Henry Rodriguez, director of the Office of Cancer Clinical Proteomics Research (OCCPR) at NCI, speaks with ecancer television at WIN 2017 about the translation of the proteins expressed in a patient's tumor into a map for druggable targets. By combining genomic and proteomic information (proteogenomics), leading scientists are gaining new insights into ways to detect and treat cancer due to a more complete and unified understanding of complex biological processes.
A reference map of the Arabidopsis thaliana mature pollen proteome

DOE Office of Scientific and Technical Information (OSTI.GOV)

Noir, Sandra; Braeutigam, Anne; Colby, Thomas

The male gametophyte (or pollen) plays an obligatory role during sexual reproduction of higher plants. The extremely reduced complexity of this organ renders pollen a valuable experimental system for studying fundamental aspects of plant biology such as cell fate determination, cell-cell interactions, cell polarity, and tip-growth. Here, we present the first reference map of the mature pollen proteome of the dicotyledonous model plant species, Arabidopsis thaliana. Based on two-dimensional gel electrophoresis, matrix-assisted laser desorption/ionization time-of-flight, and electrospray quadrupole time-of-flight mass spectrometry, we reproducibly identified 121 different proteins in 145 individual spots. The presence, subcellular localization, and functional classification of the identified proteins are discussed in relation to the pollen transcriptome and the full protein complement encoded by the nuclear Arabidopsis genome.

fusionDB: assessing microbial diversity and environmental preferences via functional similarity networks

PubMed Central

Zhu, Chengsheng; Miller, Maximilian

2018-01-01

Abstract Microbial functional diversification is driven by environmental factors, i.e. microorganisms inhabiting the same environmental niche tend to be more functionally similar than those from different environments. In some cases, even closely phylogenetically related microbes differ more across environments than across taxa. While microbial similarities are often reported in terms of taxonomic relationships, no existing databases directly link microbial functions to the environment. We previously developed a method for comparing microbial functional similarities on the basis of proteins translated from their sequenced genomes. Here, we describe fusionDB, a novel database that uses our functional data to represent 1374 taxonomically distinct bacteria annotated with available metadata: habitat/niche, preferred temperature, and oxygen use. Each microbe is encoded as a set of functions represented by its proteome and individual microbes are connected via common functions. Users can search fusionDB via combinations of organism names and metadata. Moreover, the web interface allows mapping new microbial genomes to the functional spectrum of reference bacteria, rendering interactive similarity networks that highlight shared functionality. fusionDB provides a fast means of comparing microbes, identifying potential horizontal gene transfer events, and highlighting key environment-specific functionality. PMID:29112720
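The abstract describes each microbe as a set of functions, with microbes connected through the functions they share. That idea can be sketched in Python as follows. The Jaccard index, the threshold parameter, and the microbe and function labels are illustrative assumptions; the abstract does not specify which similarity measure fusionDB uses.

```python
from itertools import combinations

def shared_function_edges(functions_by_microbe, min_jaccard=0.0):
    """Return (microbe_a, microbe_b, n_shared, jaccard) for every pair
    of microbes that shares at least one function."""
    edges = []
    for a, b in combinations(sorted(functions_by_microbe), 2):
        fa, fb = functions_by_microbe[a], functions_by_microbe[b]
        shared = fa & fb
        if not shared:
            continue  # no common function, so no edge in the network
        jaccard = len(shared) / len(fa | fb)
        if jaccard >= min_jaccard:
            edges.append((a, b, len(shared), jaccard))
    return edges

# Hypothetical microbes and function labels.
microbes = {
    "microbe_1": {"f1", "f2", "f3"},
    "microbe_2": {"f2", "f3", "f4"},
    "microbe_3": {"f5"},
}
edges = shared_function_edges(microbes)
```

Only microbe_1 and microbe_2 are linked in this toy network, since microbe_3 shares no function with either.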
Investigating Gene Function in Cereal Rust Fungi by Plant-Mediated Virus-Induced Gene Silencing.

PubMed

Panwar, Vinay; Bakkeren, Guus

2017-01-01

Cereal rust fungi are destructive pathogens, threatening grain production worldwide. Targeted breeding for resistance utilizing host resistance genes has been effective. However, breakdown of resistance occurs frequently and continued efforts are needed to understand how these fungi overcome resistance and to expand the range of available resistance genes. Whole genome sequencing, transcriptomic and proteomic studies followed by genome-wide computational and comparative analyses have identified a large repertoire of genes in rust fungi, among which are candidates predicted to code for pathogenicity and virulence factors. Some of these genes represent defence-triggering avirulence effectors. However, the functions of most genes still need to be assessed to understand the biology of these obligate biotrophic pathogens. Since genetic manipulations such as gene deletion and genetic transformation are not yet feasible in rust fungi, performing functional gene studies is challenging. Recently, host-induced gene silencing (HIGS) has emerged as a useful tool to characterize gene function in rust fungi while infecting and growing in host plants. We utilized Barley stripe mosaic virus-mediated virus-induced gene silencing (BSMV-VIGS) to induce HIGS of candidate rust fungal genes in the wheat host to determine their role in plant-fungal interactions. Here, we describe the methods for using BSMV-VIGS in wheat for functional genomics study in cereal rust fungi.
Proteomics: a new approach to the study of disease.

PubMed

Chambers, G; Lawrie, L; Cash, P; Murray, G I

2000-11-01

The global analysis of cellular proteins has recently been termed proteomics and is a key area of research that is developing in the post-genome era. Proteomics uses a combination of sophisticated techniques including two-dimensional (2D) gel electrophoresis, image analysis, mass spectrometry, amino acid sequencing, and bio-informatics to resolve comprehensively, to quantify, and to characterize proteins. The application of proteomics provides major opportunities to elucidate disease mechanisms and to identify new diagnostic markers and therapeutic targets. This review aims to explain briefly the background to proteomics and then to outline proteomic techniques. Applications to the study of human disease conditions ranging from cancer to infectious diseases are reviewed. Finally, possible future advances are briefly considered, especially those which may lead to faster sample throughput and increased sensitivity for the detection of individual proteins. Copyright 2000 John Wiley & Sons, Ltd.
A chromosome-centric human proteome project (C-HPP) to characterize the sets of proteins encoded in chromosome 17.

PubMed

Liu, Suli; Im, Hogune; Bairoch, Amos; Cristofanilli, Massimo; Chen, Rui; Deutsch, Eric W; Dalton, Stephen; Fenyo, David; Fanayan, Susan; Gates, Chris; Gaudet, Pascale; Hincapie, Marina; Hanash, Samir; Kim, Hoguen; Jeong, Seul-Ki; Lundberg, Emma; Mias, George; Menon, Rajasree; Mu, Zhaomei; Nice, Edouard; Paik, Young-Ki; Uhlen, Mathias; Wells, Lance; Wu, Shiaw-Lin; Yan, Fangfei; Zhang, Fan; Zhang, Yue; Snyder, Michael; Omenn, Gilbert S; Beavis, Ronald C; Hancock, William S

2013-01-04

We report progress assembling the parts list for chromosome 17 and illustrate the various processes that we have developed to integrate available data from diverse genomic and proteomic knowledge bases. As primary resources, we have used GPMDB, neXtProt, PeptideAtlas, Human Protein Atlas (HPA), and GeneCards. All sites share the common resource of Ensembl for the genome modeling information. We have defined the chromosome 17 parts list with the following information: 1169 protein-coding genes, the numbers of proteins confidently identified by various experimental approaches as documented in GPMDB, neXtProt, PeptideAtlas, and HPA, examples of typical data sets obtained by RNASeq and proteomic studies of epithelial derived tumor cell lines (disease proteome) and a normal proteome (peripheral mononuclear cells), reported evidence of post-translational modifications, and examples of alternative splice variants (ASVs). We have constructed a list of the 59 "missing" proteins as well as 201 proteins that have inconclusive mass spectrometric (MS) identifications. In this report we have defined a process to establish a baseline for the incorporation of new evidence on protein identification and characterization as well as related information from transcriptome analyses. This initial list of "missing" proteins will guide the selection of appropriate samples for discovery studies as well as antibody reagents. We have also illustrated the significant diversity of protein variants (including post-translational modifications, PTMs) using regions on chromosome 17 that contain important oncogenes. We emphasize the need for mandated deposition of proteomics data in public databases, the further development of improved PTM, ASV, and single nucleotide variant (SNV) databases, and the construction of Web sites that can integrate and regularly update such information. In addition, we describe the distribution of both clustered and scattered sets of protein families on the chromosome. Since chromosome 17 is rich in cancer-associated genes, we have focused on the clustering of cancer-associated genes in such genomic regions and have used the ERBB2 amplicon as an example of the value of a proteogenomic approach in which one integrates transcriptomic with proteomic information and captures evidence of coexpression through coordinated regulation.
Proteomic Dissection of the Mitochondrial DNA Metabolism Apparatus in Arabidopsis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sally A. Mackenzie

2004-01-06

This study involves the investigation of nuclear genetic components that regulate mitochondrial genome behavior in higher plants. The approach utilizes the advanced plant model system of Arabidopsis thaliana to identify and functionally characterize multiple components of the mitochondrial DNA replication, recombination and mismatch repair system and their interaction partners. The rationale for the research stems from the central importance of mitochondria to overall cellular metabolism and the essential nature of the mitochondrial genome to mitochondrial function. Relatively little is understood about mitochondrial DNA maintenance and transmission in higher eukaryotes, and the higher plant mitochondrial genome displays unique properties and behavior. This investigation has revealed at least three important properties of plant mitochondrial DNA metabolism components. (1) Many are dual targeted to mitochondria and chloroplasts by novel mechanisms, suggesting that the mitochondria and chloroplasts share their genome maintenance apparatus. (2) The MSH1 gene, originating as a component of mismatch repair, has evolved uniquely in plants to participate in differential replication of the mitochondrial genome. (3) This mitochondrial differential replication process, termed substoichiometric shifting and also involving a RecA-related gene, appears to represent an adaptive mechanism to expand plant reproductive capacity and is likely present throughout the plant kingdom.
Functional genomics of corrinoid starvation in the organohalide-respiring bacterium Dehalobacter restrictus strain PER-K23

PubMed Central

Rupakula, Aamani; Lu, Yue; Kruse, Thomas; Boeren, Sjef; Holliger, Christof; Smidt, Hauke; Maillard, Julien

2015-01-01

De novo corrinoid biosynthesis represents one of the most complicated metabolic pathways in nature. Organohalide-respiring bacteria (OHRB) have developed different strategies to deal with their need of corrinoid, as it is an essential cofactor of reductive dehalogenases, the key enzymes in OHR metabolism. In contrast to Dehalococcoides mccartyi, the genome of Dehalobacter restrictus strain PER-K23 contains a complete set of corrinoid biosynthetic genes, of which cbiH appears to be truncated and therefore non-functional, possibly explaining the corrinoid auxotrophy of this obligate OHRB. Comparative genomics within Dehalobacter spp. revealed that one (operon-2) of the five distinct corrinoid biosynthesis associated operons present in the genome of D. restrictus appeared to be present only in that particular strain, which encodes multiple members of corrinoid transporters and salvaging enzymes. Operon-2 was highly up-regulated upon corrinoid starvation both at the transcriptional (346-fold) and proteomic level (46-fold on average), in line with the presence of an upstream cobalamin riboswitch. Together, these data highlight the importance of this operon in corrinoid homeostasis in D. restrictus and the augmented salvaging strategy this bacterium adopted to cope with the need for this essential cofactor. PMID:25610435
A glimpse into the proteome of phototrophic bacterium Rhodobacter capsulatus.

PubMed

Onder, Ozlem; Aygun-Sunar, Semra; Selamoglu, Nur; Daldal, Fevzi

2010-01-01

A first glimpse into the proteome of Rhodobacter capsulatus revealed more than 450 (with over 210 cytoplasmic and 185 extracytoplasmic known as well as 55 unknown) proteins that are identified with high degree of confidence using nLC-MS/MS analyses. The accumulated data provide a solid platform for ongoing efforts to establish the proteome of this species and the cellular locations of its constituents. They also indicate that at least 40 of the identified proteins, which were annotated in genome databases as unknown hypothetical proteins, correspond to predicted translation products that are indeed present in cells under the growth conditions used in this work. In addition, matching the identification labels of the proteins reported between the two available R. capsulatus genome databases (ERGO-light with RRCxxxxx and NT05 with NT05RCxxxx numbers) indicated that 11 such proteins are listed only in the latter database.
An orthology-based analysis of pathogenic protozoa impacting global health: an improved comparative genomics approach with prokaryotes and model eukaryote orthologs.

PubMed

Cuadrat, Rafael R C; da Serra Cruz, Sérgio Manuel; Tschoeke, Diogo Antônio; Silva, Edno; Tosta, Frederico; Jucá, Henrique; Jardim, Rodrigo; Campos, Maria Luiza M; Mattoso, Marta; Dávila, Alberto M R

2014-08-01

A key focus in 21st century integrative biology and drug discovery for neglected tropical and other diseases has been the use of BLAST-based computational methods for identification of orthologous groups in pathogenic organisms to discern orthologs, with a view to evaluate similarities and differences among species, and thus allow the transfer of annotation from known/curated proteins to new/non-annotated ones. We used here a profile-based sensitive methodology to identify distant homologs, coupled to the NCBI's COG (Unicellular orthologs) and KOG (Eukaryote orthologs), permitting us to perform comparative genomics analyses on five protozoan genomes. OrthoSearch was used in five protozoan proteomes showing that 3901 and 7473 orthologs can be identified by comparison with COG and KOG proteomes, respectively. The core protozoa proteome inferred was 418 Protozoa-COG orthologous groups and 704 Protozoa-KOG orthologous groups: (i) 31.58% (132/418) belongs to the category J (translation, ribosomal structure, and biogenesis), and 9.81% (41/418) to the category O (post-translational modification, protein turnover, chaperones) using COG; (ii) 21.45% (151/704) belongs to the categories J, and 13.92% (98/704) to the O using KOG. The phylogenomic analysis showed four well-supported clades for Eukarya, discriminating Multicellular [(i) human, fly, plant and worm] and Unicellular [(ii) yeast, (iii) fungi, and (iv) protozoa] species. These encouraging results attest to the usefulness of the profile-based methodology for comparative genomics to accelerate semi-automatic re-annotation, especially of the protozoan proteomes. This approach may also lend itself for applications in global health, for example, in the case of novel drug target discovery against pathogenic organisms previously considered difficult to research with traditional drug discovery tools.
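The category shares quoted in the abstract above follow directly from the reported counts, and a short Python check recomputes them. The counts and percentages are taken from the abstract; the helper name and dictionary layout are illustrative.

```python
def pct(part, total):
    """Percentage of `total` represented by `part`, rounded to two decimals."""
    return round(100 * part / total, 2)

# (count in category, size of core set) -> percentage reported in the abstract
reported = {
    "COG category J": ((132, 418), 31.58),
    "COG category O": ((41, 418), 9.81),
    "KOG category J": ((151, 704), 21.45),
    "KOG category O": ((98, 704), 13.92),
}

recomputed = {name: pct(*counts) for name, (counts, _) in reported.items()}
mismatches = {
    name: (recomputed[name], stated)
    for name, (_, stated) in reported.items()
    if recomputed[name] != stated
}
```

An empty `mismatches` dictionary means all four reported percentages agree with their counts.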
An Orthology-Based Analysis of Pathogenic Protozoa Impacting Global Health: An Improved Comparative Genomics Approach with Prokaryotes and Model Eukaryote Orthologs

PubMed Central

Cuadrat, Rafael R. C.; da Serra Cruz, Sérgio Manuel; Tschoeke, Diogo Antônio; Silva, Edno; Tosta, Frederico; Jucá, Henrique; Jardim, Rodrigo; Campos, Maria Luiza M.; Mattoso, Marta

2014-01-01

Abstract A key focus in 21st century integrative biology and drug discovery for neglected tropical and other diseases has been the use of BLAST-based computational methods for identification of orthologous groups in pathogenic organisms to discern orthologs, with a view to evaluate similarities and differences among species, and thus allow the transfer of annotation from known/curated proteins to new/non-annotated ones. We used here a profile-based sensitive methodology to identify distant homologs, coupled to the NCBI's COG (Unicellular orthologs) and KOG (Eukaryote orthologs), permitting us to perform comparative genomics analyses on five protozoan genomes. OrthoSearch was used in five protozoan proteomes showing that 3901 and 7473 orthologs can be identified by comparison with COG and KOG proteomes, respectively. The core protozoa proteome inferred was 418 Protozoa-COG orthologous groups and 704 Protozoa-KOG orthologous groups: (i) 31.58% (132/418) belongs to the category J (translation, ribosomal structure, and biogenesis), and 9.81% (41/418) to the category O (post-translational modification, protein turnover, chaperones) using COG; (ii) 21.45% (151/704) belongs to the categories J, and 13.92% (98/704) to the O using KOG. The phylogenomic analysis showed four well-supported clades for Eukarya, discriminating Multicellular [(i) human, fly, plant and worm] and Unicellular [(ii) yeast, (iii) fungi, and (iv) protozoa] species. These encouraging results attest to the usefulness of the profile-based methodology for comparative genomics to accelerate semi-automatic re-annotation, especially of the protozoan proteomes. This approach may also lend itself for applications in global health, for example, in the case of novel drug target discovery against pathogenic organisms previously considered difficult to research with traditional drug discovery tools. PMID:24960463
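The category shares quoted in this abstract can be re-derived from the counts it gives. The snippet below is only an arithmetic check, not part of the authors' OrthoSearch workflow; the labels and the helper name `share` are illustrative assumptions.

```python
# Counts quoted in the abstract: (label, groups in category, core groups in total).
# The labels are illustrative shorthand, not identifiers from the paper.
counts = [
    ("COG J", 132, 418),
    ("COG O", 41, 418),
    ("KOG J", 151, 704),
    ("KOG O", 98, 704),
]


def share(part, total):
    """Return part/total as a percentage rounded to two decimals."""
    return round(100 * part / total, 2)


shares = {label: share(part, total) for label, part, total in counts}
for label, pct in shares.items():
    print(f"{label}: {pct:.2f}%")
```

The four values printed agree with the percentages reported in the abstract (31.58, 9.81, 21.45 and 13.92).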
Quantitative proteomics in teleost fish: insights and challenges for neuroendocrine and neurotoxicology research.

PubMed

Martyniuk, Christopher J; Popesku, Jason T; Chown, Brittany; Denslow, Nancy D; Trudeau, Vance L

2012-05-01

Neuroendocrine systems integrate both extrinsic and intrinsic signals to regulate virtually all aspects of an animal's physiology. In aquatic toxicology, studies have shown that pollutants are capable of disrupting the neuroendocrine system of teleost fish, and many chemicals found in the environment can also have a neurotoxic mode of action. Omics approaches are now used to better understand cell signaling cascades underlying fish neurophysiology and the control of pituitary hormone release, in addition to identifying adverse effects of pollutants in the teleostean central nervous system. For example, both high throughput genomics and proteomic investigations of molecular signaling cascades for both neurotransmitter and nuclear receptor agonists/antagonists have been reported. This review highlights recent studies that have utilized quantitative proteomics methods such as 2D differential in-gel electrophoresis (DIGE) and isobaric tagging for relative and absolute quantitation (iTRAQ) in neuroendocrine regions and uses these examples to demonstrate the challenges of using proteomics in neuroendocrinology and neurotoxicology research. To begin to characterize the teleost neuroproteome, we functionally annotated 623 unique proteins found in the fish hypothalamus and telencephalon. These proteins have roles in biological processes that include synaptic transmission, ATP production, receptor activity, cell structure and integrity, and stress responses. The biological processes most represented by proteins detected in the teleost neuroendocrine brain included transport (8.4%), metabolic process (5.5%), and glycolysis (4.8%). We provide an example of using sub-network enrichment analysis (SNEA) to identify protein networks in the fish hypothalamus in response to dopamine receptor signaling. 
Dopamine signaling altered the abundance of proteins that are binding partners of microfilaments, integrins, and intermediate filaments, consistent with data suggesting dopaminergic regulation of neuronal stability and structure. Lastly, for fish neuroendocrine studies using both high-throughput genomics and proteomics, we compare gene and protein relationships in the hypothalamus and demonstrate that correlation is often poor for single time point experiments. These studies highlight the need for additional time course analyses to better understand gene-protein relationships and adverse outcome pathways. This is important if both transcriptomics and proteomics are to be used together to investigate neuroendocrine signaling pathways or as bio-monitoring tools in ecotoxicology. Copyright © 2011 Elsevier Inc. All rights reserved.
A Genomic Approach: The Effects of Bisphenol A on Zebrafish

EPA Science Inventory

Genomics, proteomics, and metabolomics are emerging technologies used to analyze the effects of the increasing level of environmental pollutants that are affecting aquatic organisms. Some of these toxins are considered endocrine-disrupting chemicals (EDC) due to their interferenc...
Global analysis of the Brucella melitensis proteome: Identification of proteins expressed in laboratory-grown culture.

PubMed

Wagner, Mary Ann; Eschenbrenner, Michel; Horn, Troy A; Kraycer, Jo Ann; Mujer, Cesar V; Hagius, Sue; Elzer, Philip; DelVecchio, Vito G

2002-08-01

Brucella melitensis is a facultative intracellular bacterial pathogen that causes brucellosis, a zoonotic disease primarily infecting sheep and goats, characterized by undulant fever, arthritic pain and other neurological disorders in humans. A comprehensive proteomic study of strain 16M was conducted to identify and characterize the proteins expressed in laboratory-grown culture. Using overlapping narrow range immobilized pH gradient strips for two-dimensional gel electrophoresis, 883 protein spots were detected between pH 3.5 and 11. The average isoelectric point and molecular weight values of the detected spots were 5.22 and 46.5 kDa, respectively. Of the 883 observed protein spots, 440 have been identified by matrix-assisted laser desorption/ionization-mass spectrometry. These proteins represent 187 discrete open reading frames (ORFs) or 6% of the predicted 3197 ORFs contained in the genome. The corresponding ORFs of the identified proteins are distributed evenly between each of the two circular B. melitensis chromosomes, indicating that both replicons are functionally active. The presented proteome map lists those protein spots identified to date in this study. This map may serve as a baseline reference for future proteomic studies aimed at the definition of biochemical pathways associated with stress responses, host specificity, pathogenicity and virulence. It will also assist in characterization of global proteomic effects in gene-knockout mutants. Ultimately, it may aid in our overall understanding of the cell biology of B. melitensis, an important bacterial pathogen.
Shotgun proteomics approach to characterizing the embryonic proteome of the silkworm, Bombyx mori, at labrum appearance stage.

PubMed

Li, J-Y; Chen, X; Hosseini Moghaddam, S H; Chen, M; Wei, H; Zhong, B-X

2009-10-01

The shotgun approach has gained considerable acknowledgement in recent years as a dominant strategy in proteomics. We observed a dramatic increase of specific protein spots in two-dimensional electrophoresis (2-DE) gels of the silkworm (Bombyx mori) embryo at labrum appearance, a characteristic stage during embryonic development of silkworm which is involved with temperature increase by silkworm raiser. We employed shotgun liquid chromatography tandem mass spectrometry (LC-MS/MS) technology to analyse the proteome of B. mori embryos at this stage. A total of 2168 proteins were identified with an in-house database. Approximately 47% of them had isoelectric point (pI) values distributed theoretically in the range pI 5-7 and approximately 60% of them had molecular weights of 15-45 kDa. Furthermore, 111 proteins had a pI greater than 10 and were difficult to separate by 2-DE. Many important functional proteins related to embryonic development, stress response, DNA transcription/translation, cell growth, proliferation and differentiation, organogenesis and reproduction were identified. Among them, proteins related to nervous system development were noticeable. All known heat shock proteins (HSPs) were detected in this developmental stage of B. mori embryo. In addition, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis showed energetic metabolism at this stage. These results were expected to provide more information for proteomic monitoring of the insect embryo and better understanding of the spatiotemporal expression of genes during embryonic developmental processes.
ATRX Directs Binding of PRC2 to Xist RNA and Polycomb Targets

PubMed Central

Sarma, Kavitha; Cifuentes-Rojas, Catherine; Ergun, Ayla; del Rosario, Amanda; Jeon, Yesu; White, Forest; Sadreyev, Ruslan; Lee, Jeannie T.

2015-01-01

SUMMARY X chromosome inactivation (XCI) depends on the long noncoding RNA Xist and its recruitment of Polycomb Repressive Complex 2 (PRC2). PRC2 is also targeted to other sites throughout the genome to effect transcriptional repression. Using XCI as a model, we apply an unbiased proteomics approach to isolate Xist and PRC2 regulators and identified ATRX. ATRX unexpectedly functions as a high-affinity RNA-binding protein that directly interacts with RepA/Xist RNA to promote loading of PRC2 in vivo. Without ATRX, PRC2 cannot load onto Xist RNA nor spread in cis along the X chromosome. Moreover, epigenomic profiling reveals that genome-wide targeting of PRC2 depends on ATRX, as loss of ATRX leads to spatial redistribution of PRC2 and derepression of Polycomb responsive genes. Thus, ATRX is a required specificity determinant for PRC2 targeting and function. PMID:25417162
Proteomic analysis of the venom from the scorpion Mesobuthus martensii.

PubMed

Xu, Xiaobo; Duan, Zhigui; Di, Zhiyong; He, Yawen; Li, Jianglin; Li, Zhongjie; Xie, Chunliang; Zeng, Xiongzhi; Cao, Zhijian; Wu, Yingliang; Liang, Songping; Li, Wenxin

2014-06-25

The scorpion Mesobuthus martensii is the most populous species in eastern Asian countries, and several toxic components have been identified from their venoms. Nevertheless, a complete proteomic profile of the venom of M. martensii is still not available. In this study, the venom of M. martensii was analyzed by comprehensive proteomic approaches. 153 fractions were isolated from the M. martensii venom by 2-DE, SDS-PAGE and RP-HPLC. The ESI-Q-TOF MS results of all fractions were used to search the scorpion genomic and transcriptomic databases. Totally, 227 non-redundant protein sequences were unambiguously identified, composed of 134 previously known and 93 previously unknown proteins. Among 134 previously known proteins, 115 proteins were firstly confirmed from the M. martensii crude venom and 19 toxins were confirmed once again, involving 43 typical toxins, 7 atypical toxins, 12 venom enzymes and 72 cell associated proteins. In typical toxins, 7 novel-toxin sequences were identified, including 3 Na(+)-channel toxins, 3 K(+)-channel toxins and 1 no-annotation toxin. These results increased 230% (115/50) venom components compared with previous studies from the M. martensii venom, especially 50% (24/48) typical toxins. Additionally, a mass fingerprint obtained by MALDI-TOF MS indicated that the scorpion venom contained more than 200 different molecular mass components. This work firstly gave a systematic investigation of the M. martensii venom by combined proteomics strategy coupled with genomics and transcriptomics. A large number of protein components were unambiguously identified from the venom of M. martensii, most of which were confirmed for the first time. We also contributed 7 novel-toxin sequences and 93 protein sequences previously unknown to be part of the venom, for which we assigned potential biological functions. Besides, we obtained a mass fingerprint of the M. martensii venom.
Together, our study not only provides the most comprehensive catalog of the molecular diversity of the M. martensii venom at the proteomic level, but also enriches the composition information of scorpion venom. Copyright © 2014 Elsevier B.V. All rights reserved.
Microgravity-driven remodeling of the proteome reveals insights into molecular mechanisms and signal networks involved in response to the space flight environment.

PubMed

Rea, Giuseppina; Cristofaro, Francesco; Pani, Giuseppe; Pascucci, Barbara; Ghuge, Sandip A; Corsetto, Paola Antonia; Imbriani, Marcello; Visai, Livia; Rizzo, Angela M

2016-03-30

Space is a hostile environment characterized by high vacuum, extreme temperatures, meteoroids, space debris, ionospheric plasma, microgravity and space radiation, which all represent risks for human health. A deep understanding of the biological consequences of exposure to the space environment is required to design efficient countermeasures to minimize their negative impact on human health. Recently, proteomic approaches have received a significant amount of attention in the effort to further study microgravity-induced physiological changes. In this review, we summarize the current knowledge about the effects of microgravity on microorganisms (in particular Cupriavidus metallidurans CH34, Bacillus cereus and Rhodospirillum rubrum S1H), plants (whole plants, organs, and cell cultures), mammalian cells (endothelial cells, bone cells, chondrocytes, muscle cells, thyroid cancer cells, immune system cells) and animals (invertebrates, vertebrates and mammals). Herein, we describe their proteome's response to microgravity, focusing on proteomic discoveries and their future potential applications in space research. Space experiments and operational flight experience have identified detrimental effects on human health and performance because of exposure to weightlessness, even when currently available countermeasures are implemented. Many experimental tools and methods have been developed to study microgravity-induced physiological changes. Recently, genomic and proteomic approaches have received a significant amount of attention. This review summarizes the recent research studies of the proteome response to microgravity in microorganisms, plants, mammalian cells and animals. Current proteomic tools allow large-scale, high-throughput analyses for the detection, identification, and functional investigation of all proteomes.
Understanding gene and/or protein expression is the key to unlocking the mechanisms behind microgravity-induced problems and to finding effective countermeasures to spaceflight-induced alterations but also for the study of diseases on earth. Future perspectives are also highlighted. Copyright © 2015 Elsevier B.V. All rights reserved.
Phylogenetic Tracings of Proteome Size Support the Gradual Accretion of Protein Structural Domains and the Early Origin of Viruses from Primordial Cells

PubMed Central

Nasir, Arshan; Kim, Kyung Mo; Caetano-Anollés, Gustavo

2017-01-01

Untangling the origin and evolution of viruses remains a challenging proposition. We recently studied the global distribution of protein domain structures in thousands of completely sequenced viral and cellular proteomes with comparative genomics, phylogenomics, and multidimensional scaling methods. A tree of life describing the evolution of proteomes revealed viruses emerging from the base of the tree as a fourth supergroup of life. A tree of domains indicated an early origin of modern viral lineages from ancient cells that co-existed with the cellular ancestors. However, it was recently argued that the rooting of our trees and the basal placement of viruses was artifactually induced by small genome (proteome) size. Here we show that these claims arise from misunderstanding and misinterpretations of cladistic methodology. Trees are reconstructed unrooted, and thus, their topologies cannot be distorted a posteriori by the rooting methodology. Tracing proteome size in trees and multidimensional views of evolutionary relationships as well as tests of leaf stability and exclusion/inclusion of taxa demonstrated that the smallest proteomes were neither attracted toward the root nor caused any topological distortions of the trees. Simulations confirmed that taxa clustering patterns were independent of proteome size and were determined by the presence of known evolutionary relatives in data matrices, highlighting the need for broader taxon sampling in phylogeny reconstruction. Instead, phylogenetic tracings of proteome size revealed a slowdown in innovation of the structural domain vocabulary and four regimes of allometric scaling that reflected a Heaps law. These regimes explained increasing economies of scale in the evolutionary growth and accretion of kernel proteome repertoires of viruses and cellular organisms that resemble growth of human languages with limited vocabulary sizes. 
Results reconcile dynamic and static views of domain frequency distributions that are consistent with the axiom of spatiotemporal continuity that is a tenet of evolutionary thinking. PMID:28690608
Architecture of the human interactome defines protein communities and disease networks

PubMed Central

Huttlin, Edward L.; Bruckner, Raphael J.; Paulo, Joao A.; Cannon, Joe R.; Ting, Lily; Baltier, Kurt; Colby, Greg; Gebreab, Fana; Gygi, Melanie P.; Parzen, Hannah; Szpyt, John; Tam, Stanley; Zarraga, Gabriela; Pontano-Vaites, Laura; Swarup, Sharan; White, Anne E.; Schweppe, Devin K.; Rad, Ramin; Erickson, Brian K.; Obar, Robert A.; Guruharsha, K.G.; Li, Kejie; Artavanis-Tsakonas, Spyros; Gygi, Steven P.; Harper, J. Wade

2017-01-01

The physiology of a cell can be viewed as the product of thousands of proteins acting in concert to shape the cellular response. Coordination is achieved in part through networks of protein-protein interactions that assemble functionally related proteins into complexes, organelles, and signal transduction pathways. Understanding the architecture of the human proteome has the potential to inform cellular, structural, and evolutionary mechanisms and is critical to elucidation of how genome variation contributes to disease [1–3]. Here, we present BioPlex 2.0 (Biophysical Interactions of ORFEOME-derived complexes), which employs robust affinity purification-mass spectrometry (AP-MS) methodology [4] to elucidate protein interaction networks and co-complexes nucleated by more than 25% of protein coding genes from the human genome, and constitutes the largest such network to date. With >56,000 candidate interactions, BioPlex 2.0 contains >29,000 previously unknown co-associations and provides functional insights into hundreds of poorly characterized proteins while enhancing network-based analyses of domain associations, subcellular localization, and co-complex formation. Unsupervised Markov clustering (MCL) [5] of interacting proteins identified more than 1300 protein communities representing diverse cellular activities. Genes essential for cell fitness [6,7] are enriched within 53 communities representing central cellular functions. Moreover, we identified 442 communities associated with more than 2000 disease annotations, placing numerous candidate disease genes into a cellular framework. BioPlex 2.0 exceeds previous experimentally derived interaction networks in depth and breadth, and will be a valuable resource for exploring the biology of incompletely characterized proteins and for elucidating larger-scale patterns of proteome organization. PMID:28514442
Evidence-based green algal genomics reveals marine diversity and ancestral characteristics of land plants.

PubMed

van Baren, Marijke J; Bachy, Charles; Reistetter, Emily Nahas; Purvine, Samuel O; Grimwood, Jane; Sudek, Sebastian; Yu, Hang; Poirier, Camille; Deerinck, Thomas J; Kuo, Alan; Grigoriev, Igor V; Wong, Chee-Hong; Smith, Richard D; Callister, Stephen J; Wei, Chia-Lin; Schmutz, Jeremy; Worden, Alexandra Z

2016-03-31

Prasinophytes are widespread marine green algae that are related to plants. Cellular abundance of the prasinophyte Micromonas has reportedly increased in the Arctic due to climate-induced changes. Thus, studies of these unicellular eukaryotes are important for marine ecology and for understanding Viridiplantae evolution and diversification. We generated evidence-based Micromonas gene models using proteomics and RNA-Seq to improve prasinophyte genomic resources. First, sequences of four chromosomes in the 22 Mb Micromonas pusilla (CCMP1545) genome were finished. Comparison with the finished 21 Mb genome of Micromonas commoda (RCC299; named herein) shows they share ≤8,141 of ~10,000 protein-encoding genes, depending on the analysis method. Unlike RCC299 and other sequenced eukaryotes, CCMP1545 has two abundant repetitive intron types and a high percent (26 %) GC splice donors. Micromonas has more genus-specific protein families (19 %) than other genome sequenced prasinophytes (11 %). Comparative analyses using predicted proteomes from other prasinophytes reveal proteins likely related to scale formation and ancestral photosynthesis. Our studies also indicate that peptidoglycan (PG) biosynthesis enzymes have been lost in multiple independent events in select prasinophytes and plants. However, CCMP1545, polar Micromonas CCMP2099 and prasinophytes from other classes retain the entire PG pathway, like moss and glaucophyte algae. Surprisingly, multiple vascular plants also have the PG pathway, except the Penicillin-Binding Protein, and share a unique bi-domain protein potentially associated with the pathway. Alongside Micromonas experiments using antibiotics that halt bacterial PG biosynthesis, the findings highlight unrecognized phylogenetic complexity in PG-pathway retention and implicate a role in chloroplast structure or division in several extant Viridiplantae lineages. 
Extensive differences in gene loss and architecture between related prasinophytes underscore their divergence. PG biosynthesis genes from the cyanobacterial endosymbiont that became the plastid, have been selectively retained in multiple plants and algae, implying a biological function. Our studies provide robust genomic resources for emerging model algae, advancing knowledge of marine phytoplankton and plant evolution.
KEGG orthology-based annotation of the predicted proteome of Acropora digitifera: ZoophyteBase - an open access and searchable database of a coral genome

PubMed Central

2013-01-01

Background Contemporary coral reef research has firmly established that a genomic approach is urgently needed to better understand the effects of anthropogenic environmental stress and global climate change on coral holobiont interactions. Here we present KEGG orthology-based annotation of the complete genome sequence of the scleractinian coral Acropora digitifera and provide the first comprehensive view of the genome of a reef-building coral by applying advanced bioinformatics. Description Sequences from the KEGG database of protein function were used to construct hidden Markov models. These models were used to search the predicted proteome of A. digitifera to establish complete genomic annotation. The annotated dataset is published in ZoophyteBase, an open access format with different options for searching the data. A particularly useful feature is the ability to use a Google-like search engine that links query words to protein attributes. We present features of the annotation that underpin the molecular structure of key processes of coral physiology that include (1) regulatory proteins of symbiosis, (2) planula and early developmental proteins, (3) neural messengers, receptors and sensory proteins, (4) calcification and Ca2+-signalling proteins, (5) plant-derived proteins, (6) proteins of nitrogen metabolism, (7) DNA repair proteins, (8) stress response proteins, (9) antioxidant and redox-protective proteins, (10) proteins of cellular apoptosis, (11) microbial symbioses and pathogenicity proteins, (12) proteins of viral pathogenicity, (13) toxins and venom, (14) proteins of the chemical defensome and (15) coral epigenetics. 
Conclusions We advocate that providing annotation in an open-access searchable database available to the public domain will give an unprecedented foundation to interrogate the fundamental molecular structure and interactions of coral symbiosis and allow critical questions to be addressed at the genomic level based on combined aspects of evolutionary, developmental, metabolic, and environmental perspectives. PMID:23889801

The influence of iron on the proteomic profile of Chromobacterium violaceum.

PubMed

Lima, Daniel C; Duarte, Fábio T; Medeiros, Viviane K S; Lima, Diogo B; Carvalho, Paulo C; Bonatto, Diego; Batistuzzo de Medeiros, Silvia R

2014-10-20

Chromobacterium violaceum is a bacterium commonly found in tropical and subtropical regions and is associated with important pharmacological and industrial attributes such as producing substances with therapeutic properties and synthesizing biodegradable polymers. Its genome has been sequenced; however, approximately 40% of its genes still have unknown functions. Although C. violaceum is known for its versatile capacity to live in a wide range of environments, little is known about how it achieves such success. Here, we investigated the proteomic profile of C. violaceum cultivated in the absence and presence of high iron concentration, describing some proteins of unknown function that might play an important role in iron homeostasis, amongst others. Briefly, C. violaceum was cultivated in the absence and in the presence of 9 mM of iron for four hours. Total proteins were identified by LC-MS and through the PatternLab pipeline. Our proteomic analysis indicates major changes in the energetic metabolism, and alterations in the synthesis of key transport and stress proteins. In addition, it may suggest the presence of a yet unidentified operon that could be related to oxidative stress, together with a set of other proteins with unknown function. The protein-protein interaction network also pinpointed the importance of energetic metabolism proteins to the acclimation of C. violaceum to high iron concentration. This is the first proteomic analysis of the opportunistic pathogen C. violaceum in the presence of high iron concentration. Our data allowed us to identify a yet undescribed operon that might have a role in oxidative stress defense.
Our work provides new data that will contribute to understanding how this bacterium achieves its capacity to survive in harsh conditions, and opens a way to explore its still little-exploited biotechnological characteristics through further study of the proteins of unknown function that we showed to be up-regulated at high iron concentration.
New Funding Opportunity: Tissue Purchase Order Acquisitions | Office of Cancer Clinical Proteomics Research

Cancer.gov

The National Cancer Institute (NCI) is expanding its basic and translational research programs that rely heavily on sufficient availability of high quality, well annotated biospecimens suitable for use in genomic and proteomic studies. The NCI’s overarching goal with such programs is to improve the ability to diagnose, treat, and prevent cancer.
Floral gene resources from basal angiosperms for comparative genomics research

PubMed Central

Albert, Victor A; Soltis, Douglas E; Carlson, John E; Farmerie, William G; Wall, P Kerr; Ilut, Daniel C; Solow, Teri M; Mueller, Lukas A; Landherr, Lena L; Hu, Yi; Buzgo, Matyas; Kim, Sangtae; Yoo, Mi-Jeong; Frohlich, Michael W; Perl-Treves, Rafael; Schlarbaum, Scott E; Bliss, Barbara J; Zhang, Xiaohong; Tanksley, Steven D; Oppenheimer, David G; Soltis, Pamela S; Ma, Hong; dePamphilis, Claude W; Leebens-Mack, James H

2005-01-01

Background The Floral Genome Project was initiated to bridge the genomic gap between the most broadly studied plant model systems. Arabidopsis and rice, although now completely sequenced and under intensive comparative genomic investigation, are separated by at least 125 million years of evolutionary time, and cannot in isolation provide a comprehensive perspective on structural and functional aspects of flowering plant genome dynamics. Here we discuss new genomic resources available to the scientific community, comprising cDNA libraries and Expressed Sequence Tag (EST) sequences for a suite of phylogenetically basal angiosperms specifically selected to bridge the evolutionary gaps between model plants and provide insights into gene content and genome structure in the earliest flowering plants. Results Random sequencing of cDNAs from representatives of phylogenetically important eudicot, non-grass monocot, and gymnosperm lineages has so far (as of 12/1/04) generated 70,514 ESTs and 48,170 assembled unigenes. Efficient sorting of EST sequences into putative gene families based on whole Arabidopsis/rice proteome comparison has permitted ready identification of cDNA clones for finished sequencing. Preliminarily, (i) proportions of functional categories among sequenced floral genes seem representative of the entire Arabidopsis transcriptome, (ii) many known floral gene homologues have been captured, and (iii) phylogenetic analyses of ESTs are providing new insights into the process of gene family evolution in relation to the origin and diversification of the angiosperms. Conclusion Initial comparisons illustrate the utility of the EST data sets toward discovery of the basic floral transcriptome. 
These first findings also afford the opportunity to address a number of conspicuous evolutionary genomic questions, including reproductive organ transcriptome overlap between angiosperms and gymnosperms, genome-wide duplication history, lineage-specific gene duplication and functional divergence, and analyses of adaptive molecular evolution. Since not all genes in the floral transcriptome will be associated with flowering, these EST resources will also be of interest to plant scientists working on other functions, such as photosynthesis, signal transduction, and metabolic pathways. PMID:15799777
TabSQL: a MySQL tool to facilitate mapping user data to public databases.

PubMed

Xia, Xiao-Qin; McClelland, Michael; Wang, Yipeng

2010-06-23

With advances in high-throughput genomics and proteomics, it is challenging for biologists to deal with large data files and to map their data to annotations in public databases. We developed TabSQL, a MySQL-based application tool, for viewing, filtering and querying data files with large numbers of rows. TabSQL provides functions for downloading and installing table files from public databases including the Gene Ontology database (GO), the Ensembl databases, and genome databases from the UCSC genome bioinformatics site. Any other database that provides tab-delimited flat files can also be imported. The downloaded gene annotation tables can be queried together with users' data in TabSQL using either a graphic interface or command line. TabSQL allows queries across the user's data and public databases without programming. It is a convenient tool for biologists to annotate and enrich their data.
TabSQL: a MySQL tool to facilitate mapping user data to public databases

PubMed Central

2010-01-01

Background With advances in high-throughput genomics and proteomics, it is challenging for biologists to deal with large data files and to map their data to annotations in public databases. Results We developed TabSQL, a MySQL-based application tool, for viewing, filtering and querying data files with large numbers of rows. TabSQL provides functions for downloading and installing table files from public databases including the Gene Ontology database (GO), the Ensembl databases, and genome databases from the UCSC genome bioinformatics site. Any other database that provides tab-delimited flat files can also be imported. The downloaded gene annotation tables can be queried together with users' data in TabSQL using either a graphic interface or command line. Conclusions TabSQL allows queries across the user's data and public databases without programming. It is a convenient tool for biologists to annotate and enrich their data. PMID:20573251
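The kind of mapping this abstract describes, loading tab-delimited tables and querying user data together with an annotation table, can be sketched in a few lines. This is a minimal illustration, not TabSQL itself: it substitutes Python's built-in sqlite3 for MySQL, and the table names, column names, gene IDs and GO terms are all made-up assumptions.

```python
import csv
import io
import sqlite3

# Hypothetical tab-delimited inputs; gene IDs, values and GO terms are made up.
USER_TSV = "gene_id\tfold_change\ngeneA\t2.5\ngeneB\t0.4\ngeneC\t1.1\n"
ANNOT_TSV = "gene_id\tgo_term\ngeneA\tGO:0006412\ngeneB\tGO:0006096\n"


def load_tsv(conn, table, text):
    """Create `table` from tab-delimited text whose first row is the header."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    header, body = rows[0], rows[1:]
    cols = ", ".join(f'"{name}"' for name in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', body)


conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL in this sketch
load_tsv(conn, "user_data", USER_TSV)
load_tsv(conn, "annotation", ANNOT_TSV)

# LEFT JOIN keeps user rows that have no annotation (go_term comes back as None).
annotated = conn.execute(
    """
    SELECT u.gene_id, u.fold_change, a.go_term
    FROM user_data AS u
    LEFT JOIN annotation AS a ON a.gene_id = u.gene_id
    ORDER BY u.gene_id
    """
).fetchall()
print(annotated)
```

A left join is used so that user rows without a matching annotation are still reported, which is usually what is wanted when checking how much of a data set could be mapped. Values are stored as text here because the columns are created without types.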
Omics-based interpretation of synergism in a soil-derived cellulose-degrading microbial community

PubMed Central

Zhou, Yizhuang; Pope, Phillip B.; Li, Shaochun; Wen, Bo; Tan, Fengji; Cheng, Shu; Chen, Jing; Yang, Jinlong; Liu, Feng; Lei, Xuejing; Su, Qingqing; Zhou, Chengran; Zhao, Jiao; Dong, Xiuzhu; Jin, Tao; Zhou, Xin; Yang, Shuang; Zhang, Gengyun; Yang, Huangming; Wang, Jian; Yang, Ruifu; Eijsink, Vincent G. H.; Wang, Jun

2014-01-01

Reaching a comprehensive understanding of how nature solves the problem of degrading recalcitrant biomass may eventually allow development of more efficient biorefining processes. Here we interpret genomic and proteomic information generated from a cellulolytic microbial consortium (termed F1RT) enriched from soil. Analyses of reconstructed bacterial draft genomes from all seven uncultured phylotypes in F1RT indicate that its constituent microbes cooperate in both cellulose-degrading and other important metabolic processes. Support for cellulolytic inter-species cooperation came from the discovery of F1RT microbes that encode and express complementary enzymatic inventories that include both extracellular cellulosomes and secreted free-enzyme systems. Metabolic reconstruction of the seven F1RT phylotypes predicted a wider genomic rationale as to how this particular community functions as well as possible reasons as to why biomass conversion in nature relies on a structured and cooperative microbial community. PMID:24924356
Proteogenomic insights into salt tolerance by a halotolerant alpha-proteobacterium isolated from an Andean saline spring.

PubMed

Rubiano-Labrador, Carolina; Bland, Céline; Miotello, Guylaine; Guérin, Philippe; Pible, Olivier; Baena, Sandra; Armengaud, Jean

2014-01-31

Tistlia consotensis is a halotolerant Rhodospirillaceae that was isolated from a saline spring located in the Colombian Andes with a salt concentration close to seawater (4.5% w/vol). We cultivated this microorganism in three NaCl concentrations, i.e. optimal (0.5%), without (0.0%) and high (4.0%) salt concentration, and analyzed its cellular proteome. For assigning tandem mass spectrometry data, we first sequenced its genome and constructed a six reading frame ORF database from the draft sequence. We annotated only the genes whose products (872) were detected. We compared the quantitative proteome data sets recorded for the three different growth conditions. At low salinity, general stress proteins (chaperones, proteases and proteins associated with oxidative stress protection) were detected in higher amounts, probably linked to difficulties for proper protein folding and metabolism. Proteogenomics and comparative genomics pointed at the CrgA transcriptional regulator as a key factor for the proteome remodeling upon low osmolarity. In hyper-osmotic condition, T. consotensis produced in larger amounts proteins involved in the sensing of changes in salt concentration, as well as a wide panel of transport systems for the transport of organic compatible solutes such as glutamate. We have described here a straightforward procedure in making a new environmental isolate quickly amenable to proteomics. The bacterium Tistlia consotensis was isolated from a saline spring in the Colombian Andes and represents an interesting environmental model to be compared with extremophiles or other moderate organisms. To explore the halotolerance molecular mechanisms of the bacterium T. consotensis, we developed an innovative proteogenomic strategy consisting of i) genome sequencing, ii) quick annotation of the genes whose products were detected by mass spectrometry, and iii) comparative proteomics of cells grown in three salt conditions. We highlighted in this manuscript how efficient such an approach can be compared to time-consuming genome annotation when pointing at the key proteins of a given biological question. We documented a large number of proteins found produced in greater amounts when cells are cultivated in either hypo-osmotic or hyper-osmotic conditions. This article is part of a Special Issue entitled: Trends in Microbial Proteomics. Copyright © 2013 Elsevier B.V. All rights reserved.
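The idea of a six reading frame ORF database mentioned above can be sketched as follows: translate both strands of the draft sequence in all three frames and keep the stretches between stop codons. This is a minimal sketch, not the authors' pipeline. It assumes the standard genetic code, a stop-to-stop ORF definition, and a hypothetical min_length cutoff; the abstract specifies none of these.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_orfs(dna, min_length=30):
    """Collect stop-to-stop peptide segments from all six reading frames.

    min_length (in amino acids) is a hypothetical cutoff for this sketch.
    Returns (strand, frame, peptide) tuples.
    """
    dna = dna.upper()
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            protein = translate(seq[frame:])
            for segment in protein.split("*"):
                if len(segment) >= min_length:
                    orfs.append((strand, frame, segment))
    return orfs


example_orfs = six_frame_orfs("ATGGCCAAATAA", min_length=3)
```

The resulting peptide segments would then serve as the search space for assigning tandem mass spectra, so that only ORFs with detected products need to be annotated.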
Recombinant organisms for production of industrial products

PubMed Central

Adrio, Jose-Luis

2010-01-01

A revolution in industrial microbiology was sparked by the discoveries of the double-stranded structure of DNA and the development of recombinant DNA technology. Traditional industrial microbiology was merged with molecular biology to yield improved recombinant processes for the industrial production of primary and secondary metabolites, protein biopharmaceuticals and industrial enzymes. Novel genetic techniques such as metabolic engineering, combinatorial biosynthesis and molecular breeding techniques and their modifications are contributing greatly to the development of improved industrial processes. In addition, functional genomics, proteomics and metabolomics are being exploited for the discovery of novel valuable small molecules for medicine as well as enzymes for catalysis. The sequencing of industrial microbial genomes is being carried out, which bodes well for future process improvement and discovery of new industrial products. PMID:21326937
Comparative proteomic analysis of Desulfotomaculum reducens MI-1: Insights into the metabolic versatility of a gram-positive sulfate- and metal-reducing bacterium

DOE PAGES

Otwell, Anne E.; Callister, Stephen J.; Zink, Erika M.; ...

2016-02-19

In this study, the proteomes of the metabolically versatile and poorly characterized Gram-positive bacterium Desulfotomaculum reducens MI-1 were compared across four cultivation conditions including sulfate reduction, soluble Fe(III) reduction, insoluble Fe(III) reduction, and pyruvate fermentation. Collectively across conditions, we observed at high confidence ~38% of genome-encoded proteins. Here, we focus on proteins that display significant differential abundance on conditions tested. To the best of our knowledge, this is the first full-proteome study focused on a Gram-positive organism cultivated either on sulfate or metal-reducing conditions. Several proteins with uncharacterized function encoded within heterodisulfide reductase (hdr)-containing loci were upregulated on either sulfate (Dred_0633-4, Dred_0689-90, and Dred_1325-30) or Fe(III)-citrate-reducing conditions (Dred_0432-3 and Dred_1778-84). Two of these hdr-containing loci display homology to recently described flavin-based electron bifurcation (FBEB) pathways (Dred_1325-30 and Dred_1778-84). Additionally, we propose that a cluster of proteins, which is homologous to a described FBEB lactate dehydrogenase (LDH) complex, is performing lactate oxidation in D. reducens (Dred_0367-9). Analysis of the putative sulfate reduction machinery in D. reducens revealed that most of these proteins are constitutively expressed across cultivation conditions tested. In addition, peptides from the single multiheme c-type cytochrome (MHC) in the genome were exclusively observed on the insoluble Fe(III) condition, suggesting that this MHC may play a role in reduction of insoluble metals.
Comparative proteomic analysis of Desulfotomaculum reducens MI-1: Insights into the metabolic versatility of a gram-positive sulfate- and metal-reducing bacterium

DOE Office of Scientific and Technical Information (OSTI.GOV)

Otwell, Anne E.; Callister, Stephen J.; Zink, Erika M.

In this study, the proteomes of the metabolically versatile and poorly characterized Gram-positive bacterium Desulfotomaculum reducens MI-1 were compared across four cultivation conditions including sulfate reduction, soluble Fe(III) reduction, insoluble Fe(III) reduction, and pyruvate fermentation. Collectively across conditions, we observed at high confidence ~38% of genome-encoded proteins. Here, we focus on proteins that display significant differential abundance on conditions tested. To the best of our knowledge, this is the first full-proteome study focused on a Gram-positive organism cultivated either on sulfate or metal-reducing conditions. Several proteins with uncharacterized function encoded within heterodisulfide reductase (hdr)-containing loci were upregulated on either sulfate (Dred_0633-4, Dred_0689-90, and Dred_1325-30) or Fe(III)-citrate-reducing conditions (Dred_0432-3 and Dred_1778-84). Two of these hdr-containing loci display homology to recently described flavin-based electron bifurcation (FBEB) pathways (Dred_1325-30 and Dred_1778-84). Additionally, we propose that a cluster of proteins, which is homologous to a described FBEB lactate dehydrogenase (LDH) complex, is performing lactate oxidation in D. reducens (Dred_0367-9). Analysis of the putative sulfate reduction machinery in D. reducens revealed that most of these proteins are constitutively expressed across cultivation conditions tested. In addition, peptides from the single multiheme c-type cytochrome (MHC) in the genome were exclusively observed on the insoluble Fe(III) condition, suggesting that this MHC may play a role in reduction of insoluble metals.
Systems Biology and Mode of Action Based Risk Assessment

EPA Science Inventory

The application of systems biology has increased in the past decade largely as a consequence of the human genome project and technological advances in genomics and proteomics. Systems approaches have been used in the medical & pharmaceutical realm for diagnostic purposes and targ...
Systems Biology and Mode of Action Based Risk Assessment.

EPA Science Inventory

The application of systems biology approaches has greatly increased in the past decade largely as a consequence of the human genome project and technological advances in genomics and proteomics. Systems approaches have been used in the medical & pharmaceutical realm for diagnost...
Chemical Proteomic Approaches Targeting Cancer Stem Cells: A Review of Current Literature.

PubMed

Jung, Hye Jin

2017-01-01

Cancer stem cells (CSCs) have been proposed as central drivers of tumor initiation, progression, recurrence, and therapeutic resistance. Therefore, identifying stem-like cells within cancers and understanding their properties is crucial for the development of effective anticancer therapies. Recently, chemical proteomics has become a powerful tool to efficiently determine protein networks responsible for CSC pathophysiology and comprehensively elucidate molecular mechanisms of drug action against CSCs. This review provides an overview of major methodologies utilized in chemical proteomic approaches. In addition, recent successful chemical proteomic applications targeting CSCs are highlighted. Future directions for CSC research that integrate chemical genomic and proteomic data obtained from a single biological sample of CSCs are also suggested in this review. Copyright© 2017, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved.
Integration of biological networks and gene expression data using Cytoscape

PubMed Central

Cline, Melissa S; Smoot, Michael; Cerami, Ethan; Kuchinsky, Allan; Landys, Nerius; Workman, Chris; Christmas, Rowan; Avila-Campilo, Iliana; Creech, Michael; Gross, Benjamin; Hanspers, Kristina; Isserlin, Ruth; Kelley, Ryan; Killcoyne, Sarah; Lotia, Samad; Maere, Steven; Morris, John; Ono, Keiichiro; Pavlovic, Vuk; Pico, Alexander R; Vailaya, Aditya; Wang, Peng-Liang; Adler, Annette; Conklin, Bruce R; Hood, Leroy; Kuiper, Martin; Sander, Chris; Schmulevich, Ilya; Schwikowski, Benno; Warner, Guy J; Ideker, Trey; Bader, Gary D

2013-01-01

Cytoscape is a free software package for visualizing, modeling and analyzing molecular and genetic interaction networks. This protocol explains how to use Cytoscape to analyze the results of mRNA expression profiling, and other functional genomics and proteomics experiments, in the context of an interaction network obtained for genes of interest. Five major steps are described: (i) obtaining a gene or protein network, (ii) displaying the network using layout algorithms, (iii) integrating with gene expression and other functional attributes, (iv) identifying putative complexes and functional modules and (v) identifying enriched Gene Ontology annotations in the network. These steps provide a broad sample of the types of analyses performed by Cytoscape. PMID:17947979
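Step (v) above, identifying enriched Gene Ontology annotations, asks whether a term appears in a gene set more often than chance would predict. One common statistic for that question is the hypergeometric tail probability, sketched below. The abstract does not say which test the protocol uses, so treat this as an assumption; the function name and arguments are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Probability of seeing at least `hits` annotated genes by chance.

    population: total number of genes considered
    annotated:  genes in the population carrying the GO term
    sample:     genes in the network or module of interest
    hits:       genes in the sample carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers: 10 genes, 3 annotated, a module of 2 genes, 1 annotated hit.
p_value = hypergeom_enrichment_p(10, 3, 2, 1)
```

In practice many terms are tested at once, so the resulting p-values would normally need a multiple-testing correction before being interpreted.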
Contemporary Network Proteomics and Its Requirements

PubMed Central

Goh, Wilson Wen Bin; Wong, Limsoon; Sng, Judy Chia Ghee

2013-01-01

The integration of networks with genomics (network genomics) is a familiar field. Conventional network analysis takes advantage of the larger coverage and relative stability of gene expression measurements. Network proteomics on the other hand has to develop further on two critical factors: (1) expanded data coverage and consistency, and (2) suitable reference network libraries, and data mining from them. Concerning (1) we discuss several contemporary themes that can improve data quality, which in turn will boost the outcome of downstream network analysis. For (2), we focus on network analysis developments, specifically, the need for context-specific networks and essential considerations for localized network analysis. PMID:24833333
Jatropha curcas, a biofuel crop: functional genomics for understanding metabolic pathways and genetic improvement.

PubMed

Maghuly, Fatemeh; Laimer, Margit

2013-10-01

Jatropha curcas is currently attracting much attention as an oilseed crop for biofuel, as Jatropha can grow under climate and soil conditions that are unsuitable for food production. However, little is known about Jatropha, and there are a number of challenges to be overcome. In fact, Jatropha has not really been domesticated; most of the Jatropha accessions are toxic, which renders the seedcake unsuitable for use as animal feed. The seeds of Jatropha contain high levels of polyunsaturated fatty acids, which negatively impact the biofuel quality. Fruiting of Jatropha is fairly continuous, thus increasing costs of harvesting. Therefore, before starting any improvement program using conventional or molecular breeding techniques, understanding gene function and the genome scale of Jatropha are prerequisites. This review presents currently available and relevant information on the latest technologies (genomics, transcriptomics, proteomics and metabolomics) to decipher important metabolic pathways within Jatropha, such as oil and toxin synthesis. Further, it discusses future directions for biotechnological approaches in Jatropha breeding and improvement. © 2013 The Authors. Biotechnology Journal published by Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Genomic, proteomic and biochemical analysis of the chitinolytic machinery of Serratia marcescens BJL200.

PubMed

Tuveng, Tina R; Hagen, Live Heldal; Mekasha, Sophanit; Frank, Jeremy; Arntzen, Magnus Øverlie; Vaaje-Kolstad, Gustav; Eijsink, Vincent G H

2017-04-01

The chitinolytic machinery of Serratia marcescens BJL200 has been studied in detail over the last couple of decades; however, the proteome secreted by this Gram-negative bacterium during growth on chitin has not been studied in depth. In addition, the genome of this most studied chitinolytic Serratia strain has, until now, not been sequenced. We report a draft genome sequence for S. marcescens BJL200. Using label-free quantification (LFQ) proteomics and a recently developed plate-method for assessing secretomes during growth on solid substrates, we find that, as expected, the chitin-active enzymes (ChiA, B, C, and CBP21) are produced in high amounts when the bacterium grows on chitin. Other proteins produced in high amounts after bacterial growth on chitin provide interesting targets for further exploration of the proteins involved in degradation of chitin-rich biomasses. The genome encodes a fourth chitinase (ChiD), which is produced in low amounts during growth on chitin. Studies of chitin degradation with mixtures of recombinantly produced chitin-degrading enzymes showed that ChiD does not contribute to the overall efficiency of the process. ChiD is capable of converting N,N'-diacetyl chitobiose to N-acetyl glucosamine, but is less efficient than another enzyme produced for this purpose, the chitobiase. Thus, the role of ChiD in chitin degradation, if any, remains unclear. Copyright © 2017 Elsevier B.V. All rights reserved.
Reevaluation of the Coding Potential and Proteomic Analysis of the BAC Derived Rhesus Cytomegalovirus Strain 68-1

DOE Office of Scientific and Technical Information (OSTI.GOV)

Malouli, Daniel; Nakayasu, Ernesto S.; Viswanathan, Kasinath

2012-09-01

Cytomegaloviruses are highly host restricted resulting in co-speciation with their hosts. As a natural pathogen of rhesus macaques (RM), Rhesus Cytomegalovirus (RhCMV) has therefore emerged as a highly relevant experimental model for pathogenesis and vaccine development due to its close evolutionary relationship to human CMV (HCMV). To date, most in vivo experiments performed with RhCMV employed strain 68-1 cloned as bacterial artificial chromosome (BAC). However, the complete genome sequence of the 68-1 BAC has not been determined. Furthermore, the gene content of the RhCMV genome is unknown and previous open reading frame (ORF) predictions relied solely on uninterrupted ORFs with an arbitrary cutoff of 300 bp. To obtain a more precise picture of the actual proteins encoded by the most commonly used molecular clone of RhCMV we re-evaluated the RhCMV 68-1 BAC-genome by whole genome shotgun sequencing and determined the protein content of the resulting RhCMV virions by proteomics. By additionally comparing the RhCMV genome to that of several closely related Old World Monkey (OWM) CMVs we were able to filter out many unlikely ORFs and obtain a simplified map of the RhCMV genome. This comparative genomics analysis eliminated many genes previously characterized as RhCMV-specific while consolidating a high conservation of ORFs among OWM-CMVs and between RhCMV and HCMV. Moreover, virion proteomics independently validated the revised ORF predictions since only proteins encoded by predicted ORFs could be detected. Taken together these data suggest a much higher conservation of genome and virion structure between CMVs of humans, apes and OWMs than previously assumed. Remarkably, BAC-derived RhCMV is able to establish and maintain persistent infection despite the lack of multiple genes homologous to HCMV genes involved in tissue tropism.
Scientific Advances with Aspergillus Species that Are Used for Food and Biotech Applications.

PubMed

Biesebeke, Rob Te; Record, Erik

2008-01-01

Yeast and filamentous fungi have been used for centuries in diverse biotechnological processes. Fungal fermentation technology is traditionally used in relation to food production, such as for bread, beer, cheese, sake and soy sauce. Last century, the industrial application of yeast and filamentous fungi expanded rapidly, with excellent examples such as purified enzymes and secondary metabolites (e.g. antibiotics), which are used in a wide range of food as well as non-food industries. Research on protein and/or metabolite secretion by fungal species has focused on identifying bottlenecks in (post-) transcriptional regulation of protein production, metabolic rerouting, morphology and the transit of proteins through the secretion pathway. In past years, genome sequencing of some fungi (e.g. Aspergillus oryzae, Aspergillus niger) has been completed. The available genome sequences have enabled identification of genes and functionally important regions of the genome. This has directed research to focus on a post-genomics era in which transcriptomics, proteomics and metabolomics methodologies will help to explore the scientific relevance and industrial application of fungal genome sequences.
A comprehensive proteomics and genomics analysis reveals novel transmembrane proteins in human platelets and mouse megakaryocytes including G6b-B, a novel ITIM protein

PubMed Central

Senis, Yotis A.; Tomlinson, Michael G.; García, Ángel; Dumon, Stephanie; Heath, Victoria L.; Herbert, John; Cobbold, Stephen P.; Spalton, Jennifer C.; Ayman, Sinem; Antrobus, Robin; Zitzmann, Nicole; Bicknell, Roy; Frampton, Jon; Authi, Kalwant; Martin, Ashley; Wakelam, Michael J.O.; Watson, Stephen P.

2007-01-01

Summary: The platelet surface is poorly characterized due to the low abundance of many membrane proteins and the lack of specialist tools for their investigation. In this study we have identified novel human platelet and mouse megakaryocyte membrane proteins using specialist proteomic and genomic approaches. Three separate methods were used to enrich platelet surface proteins prior to identification by liquid chromatography and tandem mass spectrometry: lectin affinity chromatography; biotin/NeutrAvidin affinity chromatography; and free flow electrophoresis. Many known, abundant platelet surface transmembrane proteins and several novel proteins were identified using each receptor enrichment strategy. In total, two or more unique peptides were identified for 46, 68 and 22 surface membrane, intracellular membrane and membrane proteins of unknown sub-cellular localization, respectively. The majority of these were single transmembrane proteins. To complement the proteomic studies, we analysed the transcriptome of a highly purified preparation of mature primary mouse megakaryocytes using serial analysis of gene expression in view of the increasing importance of mutant mouse models in establishing protein function in platelets. This approach identified all of the major classes of platelet transmembrane receptors, including multi-transmembrane proteins. Strikingly, 17 of the 25 most megakaryocyte-specific genes (relative to 30 other SAGE libraries) were transmembrane proteins, illustrating the unique nature of the megakaryocyte/platelet surface. The list of novel plasma membrane proteins identified using proteomics includes the immunoglobulin superfamily member G6b, which undergoes extensive alternate splicing. Specific antibodies were used to demonstrate expression of the G6b-B isoform, which contains an immunoreceptor tyrosine-based inhibition motif. G6b-B undergoes tyrosine phosphorylation and association with the SH2-containing phosphatase, SHP-1, in stimulated platelets suggesting that it may play a novel role in limiting platelet activation. PMID:17186946
