protein expression database: Topics by Science.gov

Sample records for protein expression database

Identification of Fur, Aconitase, and Other Proteins Expressed by Mycobacterium tuberculosis under Conditions of Low and High Concentrations of Iron by Combined Two-Dimensional Gel Electrophoresis and Mass Spectrometry

PubMed Central

Wong, Diane K.; Lee, Bai-Yu; Horwitz, Marcus A.; Gibson, Bradford W.

1999-01-01

Iron plays a critical role in the pathophysiology of Mycobacterium tuberculosis. To gain a better understanding of iron regulation by this organism, we have used two-dimensional (2-D) gel electrophoresis, mass spectrometry, and database searching to study protein expression in M. tuberculosis under conditions of high and low iron concentration. Proteins in cellular extracts from M. tuberculosis Erdman strain grown under low-iron (1 μM) and high-iron (70 μM) conditions were separated by 2-D polyacrylamide gel electrophoresis, which allowed high-resolution separation of several hundred proteins, as visualized by Coomassie staining. The expression of at least 15 proteins was induced, and the expression of at least 12 proteins was decreased under low-iron conditions. In-gel trypsin digestion was performed on these differentially expressed proteins, and the digestion mixtures were analyzed by matrix-assisted laser desorption ionization time-of-flight mass spectrometry to determine the molecular masses of the resulting tryptic peptides. Partial sequence data on some of the peptides were obtained by using post-source decay and/or collision-induced dissociation. The fragmentation data were used to search computerized peptide mass and protein sequence databases for known proteins. Ten iron-regulated proteins were identified, including Fur and aconitase proteins, both of which are known to be regulated by iron in other bacterial systems. Our study shows that, where large protein sequence databases are available from genomic studies, the combined use of 2-D gel electrophoresis, mass spectrometry, and database searching to analyze proteins expressed under defined environmental conditions is a powerful tool for identifying expressed proteins and their physiologic relevance. PMID:9864233
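The peptide-mass matching step described in this abstract can be illustrated with a minimal sketch. It is not the authors' software: the function names, the 0.2 Da tolerance, and the no-missed-cleavage rule are assumptions chosen for illustration. It digests a candidate sequence in silico with trypsin rules, computes monoisotopic peptide masses, and counts how many observed masses a candidate explains.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the residue sum for a free peptide


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        at_end = i == len(sequence) - 1
        if at_end or (aa in "KR" and sequence[i + 1] != "P"):
            peptides.append(sequence[start:i + 1])
            start = i + 1
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def match_score(observed_masses, candidate_sequence, tol=0.2):
    """Count observed masses within `tol` Da of any theoretical tryptic peptide.

    `tol` is a hypothetical tolerance; real searches tune it to the instrument.
    """
    theoretical = [peptide_mass(p) for p in tryptic_digest(candidate_sequence)]
    return sum(
        any(abs(obs - theo) <= tol for theo in theoretical)
        for obs in observed_masses
    )
```

In a database search of this kind, each protein sequence in the database would be scored this way and the highest-scoring candidates inspected; production search engines additionally model missed cleavages, modifications, and statistical significance.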
Genic insights from integrated human proteomics in GeneCards.

PubMed

Fishilevich, Simon; Zimmerman, Shahar; Kohn, Asher; Iny Stein, Tsippi; Olender, Tsviya; Kolker, Eugene; Safran, Marilyn; Lancet, Doron

2016-01-01

GeneCards is a one-stop shop for searchable human gene annotations (http://www.genecards.org/). Data are automatically mined from ∼120 sources and presented in an integrated web card for every human gene. We report the application of recent advances in proteomics to enhance gene annotation and classification in GeneCards. First, we constructed the Human Integrated Protein Expression Database (HIPED), a unified database of protein abundance in human tissues, based on the publicly available mass spectrometry (MS)-based proteomics sources ProteomicsDB, Multi-Omics Profiling Expression Database, Protein Abundance Across Organisms and The MaxQuant DataBase. The integrated database, residing within GeneCards, compares favourably with its individual sources, covering nearly 90% of human protein-coding genes. For gene annotation and comparisons, we first defined a protein expression vector for each gene, based on normalized abundances in 69 normal human tissues. This vector is portrayed in the GeneCards expression section as a bar graph, allowing visual inspection and comparison. These data are juxtaposed with transcriptome bar graphs. Using the protein expression vectors, we further defined a pairwise metric that helps assess expression-based pairwise proximity. This new metric for finding functional partners complements eight others, including sharing of pathways, gene ontology (GO) terms and domains, implemented in the GeneCards Suite. In parallel, we calculated proteome-based differential expression, highlighting a subset of tissues that overexpress a gene and subserving gene classification. This textual annotation allows users of VarElect, the suite's next-generation phenotyper, to more effectively discover causative disease variants. Finally, we define the protein-RNA expression ratio and correlation as yet another attribute of every gene in each tissue, adding further annotative information. 
The results constitute a significant enhancement of several GeneCards sections and help promote and organize the genome-wide structural and functional knowledge of the human proteome. Database URL:http://www.genecards.org/. © The Author(s) 2016. Published by Oxford University Press.
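The abstract above does not state how the expression vectors are normalized or which pairwise metric is used, so the following is only a sketch under stated assumptions: vectors are scaled to sum to 1, proximity is taken to be the Pearson correlation between two genes' vectors, and the protein-RNA ratio is a plain per-tissue quotient. All function names are hypothetical.

```python
import math


def normalize(abundances):
    """Scale a per-tissue abundance vector so its entries sum to 1."""
    total = sum(abundances)
    return [a / total for a in abundances] if total else list(abundances)


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def protein_rna_ratio(protein, rna):
    """Per-tissue protein/RNA ratio; None where RNA abundance is zero."""
    return [p / r if r else None for p, r in zip(protein, rna)]
```

For example, `pearson(normalize(gene_a), normalize(gene_b))` would give one assumed proximity score for two genes measured over the same ordered list of tissues.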
Metagenomic Taxonomy-Guided Database-Searching Strategy for Improving Metaproteomic Analysis.

PubMed

Xiao, Jinqiu; Tanca, Alessandro; Jia, Ben; Yang, Runqing; Wang, Bo; Zhang, Yu; Li, Jing

2018-04-06

Metaproteomics provides a direct measure of the functional information by investigating all proteins expressed by a microbiota. However, due to the complexity and heterogeneity of microbial communities, it is very hard to construct a sequence database suitable for a metaproteomic study. Using a public database, researchers might not be able to identify proteins from poorly characterized microbial species, while a sequencing-based metagenomic database may not provide adequate coverage for all potentially expressed protein sequences. To address this challenge, we propose a metagenomic taxonomy-guided database-search strategy (MT), in which a merged database is employed, consisting of both taxonomy-guided reference protein sequences from public databases and proteins from metagenome assembly. By applying our MT strategy to a mock microbial mixture, about two times as many peptides were detected as with the metagenomic database only. According to the evaluation of the reliability of taxonomic attribution, the rate of misassignments was comparable to that obtained using an a priori matched database. We also evaluated the MT strategy with a human gut microbial sample, and we found 1.7 times as many peptides as using a standard metagenomic database. In conclusion, our MT strategy allows the construction of databases able to provide high sensitivity and precision in peptide identification in metaproteomic studies, enabling the detection of proteins from poorly characterized species within the microbiota.
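The merged-database idea in this abstract can be sketched as follows. This is an illustration rather than the published MT pipeline: it assumes both inputs are already available as FASTA text, and the choice to de-duplicate on exact sequence identity (keeping the first header seen) is an assumption.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def merge_databases(reference, metagenome):
    """Concatenate two record lists, dropping exact duplicate sequences.

    Reference entries come first, so their headers win when a sequence
    appears in both sources (an assumed tie-breaking rule).
    """
    merged, seen = [], set()
    for header, sequence in reference + metagenome:
        if sequence not in seen:
            seen.add(sequence)
            merged.append((header, sequence))
    return merged
```

Here `reference` would hold the taxonomy-guided sequences pulled from a public database and `metagenome` the proteins predicted from the metagenome assembly; the merged list is what a search engine would then be pointed at.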
HypoxiaDB: a database of hypoxia-regulated proteins

PubMed Central

Khurana, Pankaj; Sugadev, Ragumani; Jain, Jaspreet; Singh, Shashi Bala

2013-01-01

There has been intense interest in the cellular response to hypoxia, and a large number of differentially expressed proteins have been identified through various high-throughput experiments. These valuable data are scattered, and there have been no systematic attempts to document the various proteins regulated by hypoxia. Compilation, curation and annotation of these data are important in deciphering their role in hypoxia and hypoxia-related disorders. Therefore, we have compiled HypoxiaDB, a database of hypoxia-regulated proteins. It is a comprehensive, manually-curated, non-redundant catalog of proteins whose expressions are shown experimentally to be altered at different levels and durations of hypoxia. The database currently contains 72 000 manually curated entries taken on 3500 proteins extracted from 73 peer-reviewed publications selected from PubMed. HypoxiaDB is distinctive from other generalized databases: (i) it compiles tissue-specific protein expression changes under different levels and duration of hypoxia. Also, it provides manually curated literature references to support the inclusion of the protein in the database and establish its association with hypoxia. (ii) For each protein, HypoxiaDB integrates data on gene ontology, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway, protein–protein interactions, protein family (Pfam), OMIM (Online Mendelian Inheritance in Man), PDB (Protein Data Bank) structures and homology to other sequenced genomes. (iii) It also provides pre-compiled information on hypoxia-proteins, which otherwise requires tedious computational analysis. This includes information like chromosomal location, identifiers like Entrez, HGNC, Unigene, Uniprot, Ensembl, Vega, GI numbers and Genbank accession numbers associated with the protein. These are further cross-linked to respective public databases augmenting HypoxiaDB to the external repositories. 
(iv) In addition, HypoxiaDB provides an online sequence-similarity search tool for users to compare their protein sequences with the HypoxiaDB protein database. We hope that HypoxiaDB will enrich our knowledge about hypoxia-related biology and eventually will lead to the development of novel hypotheses and advancements in diagnostic and therapeutic activities. HypoxiaDB is freely accessible for academic and non-profit users via http://www.hypoxiadb.com. Database URL: http://www.hypoxiadb.com PMID:24178989
GExplore: a web server for integrated queries of protein domains, gene expression and mutant phenotypes

PubMed Central

2009-01-01

Background: The majority of the genes even in well-studied multi-cellular model organisms have not been functionally characterized yet. Mining the numerous genome wide data sets related to protein function to retrieve potential candidate genes for a particular biological process remains a challenge. Description: GExplore has been developed to provide a user-friendly database interface for data mining at the gene expression/protein function level to help in hypothesis development and experiment design. It supports combinatorial searches for proteins with certain domains, tissue- or developmental stage-specific expression patterns, and mutant phenotypes. GExplore operates on a stand-alone database and has fast response times, which is essential for exploratory searches. The interface is not only user-friendly, but also modular so that it accommodates additional data sets in the future. Conclusion: GExplore is an online database for quick mining of data related to gene and protein function, providing a multi-gene display of data sets related to the domain composition of proteins as well as expression and phenotype data. GExplore is publicly available at: http://genome.sfu.ca/gexplore/ PMID:19917126
GenomewidePDB 2.0: A Newly Upgraded Versatile Proteogenomic Database for the Chromosome-Centric Human Proteome Project.

PubMed

Jeong, Seul-Ki; Hancock, William S; Paik, Young-Ki

2015-09-04

Since the launch of the Chromosome-centric Human Proteome Project (C-HPP) in 2012, the number of "missing" proteins has fallen to 2932, down from ∼5932 when the number was first counted in 2011. We compared the characteristics of missing proteins with those of already annotated proteins with respect to transcriptional expression pattern and the time periods in which newly identified proteins were annotated. We learned that missing proteins commonly exhibit lower levels of transcriptional expression and less tissue-specific expression compared with already annotated proteins. This makes it more difficult to identify missing proteins as time goes on. One of the C-HPP goals is to identify alternatively spliced products of proteins (ASPs), which are usually difficult to find by shot-gun proteomic methods due to their sequence similarities with the representative proteins. To resolve this problem, it may be necessary to use a targeted proteomics approach (e.g., selected and multiple reaction monitoring [S/MRM] assays) and an innovative bioinformatics platform that enables the selection of target peptides for rarely expressed missing proteins or ASPs. Given that the success of efforts to identify missing proteins may rely on more informative public databases, it was necessary to upgrade the available integrative databases. To this end, we attempted to improve the features and utility of GenomewidePDB by integrating transcriptomic information (e.g., alternatively spliced transcripts), annotated peptide information, and an advanced search interface that can find proteins of interest when applying a targeted proteomics strategy. This upgraded version of the database, GenomewidePDB 2.0, may not only expedite identification of the remaining missing proteins but also enhance the exchange of information among the proteome community. GenomewidePDB 2.0 is available publicly at http://genomewidepdb.proteomix.org/.
Identification and Validation of Human Missing Proteins and Peptides in Public Proteome Databases: Data Mining Strategy.

PubMed

Elguoshy, Amr; Hirao, Yoshitoshi; Xu, Bo; Saito, Suguru; Quadery, Ali F; Yamamoto, Keiko; Mitsui, Toshiaki; Yamamoto, Tadashi

2017-12-01

In an attempt to complete the Human Proteome Project (HPP), the Chromosome-Centric Human Proteome Project (C-HPP) launched the investigation of missing proteins (MPs) in 2012. However, 2579 and 572 protein entries in neXtProt (2017-1) are still considered missing and uncertain proteins, respectively. Thus, in this study, we proposed a pipeline to analyze, identify, and validate human missing and uncertain proteins in open-access transcriptomics and proteomics databases. Analysis of the RNA expression pattern for missing proteins in the Human Protein Atlas showed that 28% of them, such as olfactory receptor 1I1 (O60431), had no RNA expression, suggesting the necessity to consider uncommon tissues for transcriptomic and proteomic studies. Interestingly, 21% had elevated expression level in a particular tissue (tissue-enriched proteins), indicating the importance of targeting such proteins in their elevated tissues. Additionally, the analysis of RNA expression level for missing proteins showed that 95% had no or low expression level (0-10 transcripts per million), indicating that low abundance is one of the major obstacles facing the detection of missing proteins. Moreover, missing proteins are predicted to generate fewer unique tryptic peptides than the identified proteins. Searching for these predicted unique tryptic peptides that correspond to missing and uncertain proteins in the experimental peptide list of open-access MS-based databases (PA, GPM) resulted in the detection of 402 missing and 19 uncertain proteins with at least two unique peptides (≥9 aa) at <(5 × 10⁻⁴)% FDR. Finally, matching the native spectra for the experimentally detected peptides with their SRMAtlas synthetic counterparts at three transition sources (QQQ, QTOF, QTRAP) gave us an opportunity to validate 41 missing proteins by ≥2 proteotypic peptides.
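The search for predicted unique tryptic peptides of at least 9 residues, with detection requiring at least two of them, can be sketched as below. This is an assumed simplification of the pipeline in the abstract: the digest ignores missed cleavages, uniqueness is judged only within the supplied proteome dictionary, FDR control is not modeled, and all names are hypothetical.

```python
from collections import defaultdict


def tryptic_peptides(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        at_end = i == len(sequence) - 1
        if at_end or (aa in "KR" and sequence[i + 1] != "P"):
            peptides.append(sequence[start:i + 1])
            start = i + 1
    return peptides


def unique_peptides(proteome, min_len=9):
    """Map each protein ID to its tryptic peptides found in no other protein."""
    owners = defaultdict(set)
    for protein_id, sequence in proteome.items():
        for peptide in tryptic_peptides(sequence):
            if len(peptide) >= min_len:
                owners[peptide].add(protein_id)
    unique = defaultdict(list)
    for peptide, ids in owners.items():
        if len(ids) == 1:
            unique[next(iter(ids))].append(peptide)
    return dict(unique)


def detected_proteins(unique, observed_peptides, min_peptides=2):
    """Proteins supported by at least `min_peptides` observed unique peptides."""
    observed = set(observed_peptides)
    return sorted(
        protein_id
        for protein_id, peptides in unique.items()
        if len(observed.intersection(peptides)) >= min_peptides
    )
```

In this sketch `observed_peptides` stands in for the experimental peptide lists drawn from open-access repositories.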
Proteomics: Protein Identification Using Online Databases

ERIC Educational Resources Information Center

Eurich, Chris; Fields, Peter A.; Rice, Elizabeth

2012-01-01

Proteomics is an emerging area of systems biology that allows simultaneous study of thousands of proteins expressed in cells, tissues, or whole organisms. We have developed this activity to enable high school or college students to explore proteomic databases using mass spectrometry data files generated from yeast proteins in a college laboratory…
GRBase, a new gene regulation data base available by anonymous ftp.

PubMed Central

Collier, B; Danielsen, M

1994-01-01

The Gene Regulation Database (GRBase) is a compendium of information on the structure and function of proteins involved in the control of gene expression in eukaryotes. These proteins include transcription factors, proteins involved in signal transduction, and receptors. The database can be obtained by FTP in Filemaker Pro, text, and postscript formats. The database will be expanded in the coming year to include reviews on families of proteins involved in gene regulation and to allow online searching. PMID:7937071
IDPT: Insights into potential intrinsically disordered proteins through transcriptomic analysis of genes for prostate carcinoma epigenetic data.

PubMed

Mallik, Saurav; Sen, Sagnik; Maulik, Ujjwal

2016-07-15

Involvement of intrinsically disordered proteins (IDPs) with various dreadful diseases like cancer is an interesting research topic. In order to gain novel insights into the regulation of IDPs, in this article, we perform a transcriptomic analysis of mRNAs (genes) for transcripts encoding IDPs on a human multi-omics prostate carcinoma dataset having both gene expression and methylation data. In this regard, firstly the genes that consist of both the expression and methylation data, and that are corresponding to the cancer-related prostate-tissue-specific disordered proteins of MobiDb database, are selected. We apply standard t-test for determining differentially expressed genes as well as differentially methylated genes. A network having these genes and their targeter miRNAs from Diana Tarbase v7.0 database and corresponding Transcription Factors from TRANSFAC and ITFP databases, is then built. Thereafter, we perform literature search, and KEGG pathway and Gene Ontology analyses using DAVID database. Finally, we report several significant potential gene-markers (with the corresponding IDPs) that have inverse relationship between differential expression and methylation patterns, and that are hub genes of the TF-miRNA-gene network. Copyright © 2016 Elsevier B.V. All rights reserved.
MOPED 2.5—An Integrated Multi-Omics Resource: Multi-Omics Profiling Expression Database Now Includes Transcriptomics Data

PubMed Central

Montague, Elizabeth; Stanberry, Larissa; Higdon, Roger; Janko, Imre; Lee, Elaine; Anderson, Nathaniel; Choiniere, John; Stewart, Elizabeth; Yandl, Gregory; Broomall, William; Kolker, Natali

2014-01-01

Multi-omics data-driven scientific discovery crucially rests on high-throughput technologies and data sharing. Currently, data are scattered across single omics repositories, stored in varying raw and processed formats, and are often accompanied by limited or no metadata. The Multi-Omics Profiling Expression Database (MOPED, http://moped.proteinspire.org) version 2.5 is a freely accessible multi-omics expression database. Continual improvement and expansion of MOPED is driven by feedback from the Life Sciences Community. In order to meet the emergent need for an integrated multi-omics data resource, MOPED 2.5 now includes gene relative expression data in addition to protein absolute and relative expression data from over 250 large-scale experiments. To facilitate accurate integration of experiments and increase reproducibility, MOPED provides extensive metadata through the Data-Enabled Life Sciences Alliance (DELSA Global, http://delsaglobal.org) metadata checklist. MOPED 2.5 has greatly increased the number of proteomics absolute and relative expression records to over 500,000, in addition to adding more than four million transcriptomics relative expression records. MOPED has an intuitive user interface with tabs for querying different types of omics expression data and new tools for data visualization. Summary information including expression data, pathway mappings, and direct connection between proteins and genes can be viewed on Protein and Gene Details pages. These connections in MOPED provide a context for multi-omics expression data exploration. Researchers are encouraged to submit omics data which will be consistently processed into expression summaries. MOPED as a multi-omics data resource is a pivotal public database, interdisciplinary knowledge resource, and platform for multi-omics understanding. PMID:24910945
SGDB: a database of synthetic genes re-designed for optimizing protein over-expression.

PubMed

Wu, Gang; Zheng, Yuanpu; Qureshi, Imran; Zin, Htar Thant; Beck, Tyler; Bulka, Blazej; Freeland, Stephen J

2007-01-01

Here we present the Synthetic Gene Database (SGDB): a relational database that houses sequences and associated experimental information on synthetic (artificially engineered) genes from all peer-reviewed studies published to date. At present, the database comprises information from more than 200 published experiments. This resource not only provides reference material to guide experimentalists in designing new genes that improve protein expression, but also offers a dataset for analysis by bioinformaticians who seek to test ideas regarding the underlying factors that influence gene expression. The SGDB was built on the MySQL database management system. We also offer an XML schema for standardized data description of synthetic genes. Users can access the database at http://www.evolvingcode.net/codon/sgdb/index.php, or batch download all information as XML files. Moreover, users may visually compare the coding sequences of a synthetic gene and its natural counterpart with an integrated web tool at http://www.evolvingcode.net/codon/sgdb/aligner.php, and discuss questions, findings and related information on an associated e-forum at http://www.evolvingcode.net/forum/viewforum.php?f=27.
Analysis of high accuracy, quantitative proteomics data in the MaxQB database.

PubMed

Schaab, Christoph; Geiger, Tamar; Stoehr, Gabriele; Cox, Juergen; Mann, Matthias

2012-03-01

MS-based proteomics generates rapidly increasing amounts of precise and quantitative information. Analysis of individual proteomic experiments has made great strides, but the crucial ability to compare and store information across different proteome measurements still presents many challenges. For example, it has been difficult to avoid contamination of databases with low quality peptide identifications, to control for the inflation in false positive identifications when combining data sets, and to integrate quantitative data. Although, for example, the contamination with low quality identifications has been addressed by joint analysis of deposited raw data in some public repositories, we reasoned that there should be a role for a database specifically designed for high resolution and quantitative data. Here we describe a novel database termed MaxQB that stores and displays collections of large proteomics projects and allows joint analysis and comparison. We demonstrate the analysis tools of MaxQB using proteome data of 11 different human cell lines and 28 mouse tissues. The database-wide false discovery rate is controlled by adjusting the project specific cutoff scores for the combined data sets. The 11 cell line proteomes together identify proteins expressed from more than half of all human genes. For each protein of interest, expression levels estimated by label-free quantification can be visualized across the cell lines. Similarly, the expression rank order and estimated amount of each protein within each proteome are plotted. We used MaxQB to calculate the signal reproducibility of the detected peptides for the same proteins across different proteomes. Spearman rank correlation between peptide intensity and detection probability of identified proteins was greater than 0.8 for 64% of the proteome, whereas a minority of proteins have negative correlation. 
This information can be used to pinpoint false protein identifications, independently of peptide database scores. The information contained in MaxQB, including high resolution fragment spectra, is accessible to the community via a user-friendly web interface at http://www.biochem.mpg.de/maxqb.
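The Spearman rank correlation mentioned in this abstract is the Pearson correlation computed on ranks. The sketch below is a generic, dependency-free version with average ranks for ties; it is not MaxQB code, and how MaxQB pairs peptide intensities with detection probabilities is not specified in the abstract.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation; NaN when either input has no rank variation."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    if sd_x == 0 or sd_y == 0:
        return float("nan")
    return cov / (sd_x * sd_y)
```

Because only ranks enter the calculation, the statistic is insensitive to the large dynamic range typical of peptide intensities.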
Complementary DNA sequencing and identification of mRNAs from the venomous gland of Agkistrodon piscivorus leucostoma.

PubMed

Jia, Ying; Cantu, Bruno A; Sánchez, Elda E; Pérez, John C

2008-06-15

To advance our knowledge of snake venom composition and of the transcripts expressed in the venom gland at the molecular level, we constructed a cDNA library from the venom gland of Agkistrodon piscivorus leucostoma to generate an expressed sequence tag (EST) database. From the randomly sequenced 2112 independent clones, we have obtained ESTs for 1309 (62%) cDNAs, which showed significant deduced amino acid sequence similarity (scores >80) to previously characterized proteins in the National Center for Biotechnology Information (NCBI) database. Ribosomal proteins make up 47 clones (2%) and the remaining 756 (36%) cDNAs either are of unknown identity or show BLASTX sequence identity scores of <80 with known GenBank accessions. The most highly expressed gene, encoding phospholipase A(2) (PLA(2)) and accounting for 35% of A. p. leucostoma venom gland cDNAs, was identified and further confirmed by applying crude venom to sodium dodecyl sulfate/polyacrylamide gel electrophoresis (SDS-PAGE) and protein sequencing. A total of 180 representative genes were obtained from the sequence assemblies and deposited in the EST database. Clones showing sequence identity to disintegrins, thrombin-like enzymes, hemorrhagic toxins, fibrinogen clotting inhibitors and plasminogen activators were also identified in our EST database. These data can be used to develop a research program that will help us identify genes encoding proteins that are of medical importance or proteins involved in the mechanisms of the toxin venom.
Analysis of differential protein expression in normal and neoplastic human breast epithelial cell lines

DOE Office of Scientific and Technical Information (OSTI.GOV)

Williams, K.; Chubb, C.; Huberman, E.

High resolution two dimensional gel electrophoresis (2DE) and database analysis were used to establish protein expression patterns for cultured normal human mammary epithelial cells and thirteen breast cancer cell lines. The Human Breast Epithelial Cell database contains the 2DE protein patterns, including relative protein abundances, for each cell line, plus a composite pattern that contains all the common and specifically expressed proteins from all the cell lines. Significant differences in protein expression, both qualitative and quantitative, were observed not only between normal cells and tumor cells, but also among the tumor cell lines. Eight percent of the consistently detected proteins were found in significantly (P < 0.001) variable levels among the cell lines. Using a combination of immunostaining, comigration with purified protein, subcellular fractionation, and amino-terminal protein sequencing, we identified a subset of the differentially expressed proteins. These identified proteins include the cytoskeletal proteins actin, tubulin, vimentin, and cytokeratins. The cell lines can be classified into four distinct groups based on their intermediate filament protein profile. We also identified heat shock proteins; hsp27, hsp60, and hsp70 varied in abundance and in some cases in the relative phosphorylation levels among the cell lines. Finally, we identified IMP dehydrogenase in each of the cell lines, and found the levels of this enzyme in the tumor cell lines elevated 2- to 20-fold relative to the levels in normal cells.
PhosphoregDB: The tissue and sub-cellular distribution of mammalian protein kinases and phosphatases

PubMed Central

Forrest, Alistair RR; Taylor, Darrin F; Fink, J Lynn; Gongora, M Milena; Flegg, Cameron; Teasdale, Rohan D; Suzuki, Harukazu; Kanamori, Mutsumi; Kai, Chikatoshi; Hayashizaki, Yoshihide; Grimmond, Sean M

2006-01-01

Background: Protein kinases and protein phosphatases are the fundamental components of phosphorylation dependent protein regulatory systems. We have created a database for the protein kinase-like and phosphatase-like loci of mouse that integrates protein sequence, interaction, classification and pathway information with the results of a systematic screen of their sub-cellular localization and tissue specific expression data mined from the GNF tissue atlas of mouse. Results: The database lets users query where a specific kinase or phosphatase is expressed at both the tissue and sub-cellular levels. Similarly the interface allows the user to query by tissue, pathway or sub-cellular localization, to reveal which components are co-expressed or co-localized. A review of their expression reveals 30% of these components are detected in all tissues tested while 70% show some level of tissue restriction. Hierarchical clustering of the expression data reveals that expression of these genes can be used to separate the samples into tissues of related lineage, including 3 larger clusters of nervous tissue, developing embryo and cells of the immune system. By overlaying the expression, sub-cellular localization and classification data we examine correlations between class, specificity and tissue restriction and show that tyrosine kinases are more generally expressed in fewer tissues than serine/threonine kinases. Conclusion: Together these data demonstrate that cell type specific systems exist to regulate protein phosphorylation and that for accurate modelling and for determination of enzyme substrate relationships the co-location of components needs to be considered. PMID:16504016
Gene discovery in Eimeria tenella by immunoscreening cDNA expression libraries of sporozoites and schizonts with chicken intestinal antibodies.

PubMed

Réfega, Susana; Girard-Misguich, Fabienne; Bourdieu, Christiane; Péry, Pierre; Labbé, Marie

2003-04-02

Specific antibodies were produced ex vivo from intestinal culture of Eimeria tenella infected chickens. The specificity of these intestinal antibodies was tested against different parasite stages. These antibodies were used to immunoscreen first generation schizont and sporozoite cDNA libraries permitting the identification of new E. tenella antigens. We obtained a total of 119 cDNA clones which were subjected to sequence analysis. The sequences coding for the proteins inducing local immune responses were compared with nucleotide or protein databases and with expressed sequence tags (ESTs) databases. We identified new Eimeria genes coding for heat shock proteins, a ribosomal protein, a pyruvate kinase and a pyridoxine kinase. Specific features of other sequences are discussed.
Differential protein expression in alligator leukocytes in response to bacterial lipopolysaccharide injection.

PubMed

Merchant, Mark; Kinney, Clint; Sanders, Paige

2009-12-01

Blood was collected from three juvenile alligators (Alligator mississippiensis) before, and again 24 h after, injection with bacterial lipopolysaccharide (LPS). The leukocytes were collected from both samples, and the proteins were extracted. Each group of proteins was labeled with a different fluorescent dye and the differences in protein expression were analyzed by two-dimensional difference gel electrophoresis (2D-DIGE). The proteins which appeared to be increased or decreased by treatment with LPS were selected and analyzed by MALDI-TOF to determine mass and LC-MS/MS to acquire the partial protein sequences. The peptide sequences were compared to the NCBI protein sequence database to determine homology with other sequences from other species. Several proteins of interest appeared to be increased upon LPS stimulation. Proteins with homology to human transgelin-2, fish glucose-6-phosphate dehydrogenase, amphibian α-enolase, alligator lactate dehydrogenase, fish ubiquitin-activating enzyme, and fungal β-tubulin were also increased after LPS injection. Proteins with homology to fish vimentin 4, murine heterogeneous nuclear ribonucleoprotein A3, and avian calreticulin were found to be decreased in response to LPS. In addition, five proteins, four of which were up-regulated (827, 560, 512, and 650%) and one that exhibited repressed expression (307%), did not show homology to any protein in the database, and thus may represent newly discovered proteins. We are using this biochemical approach to isolate and characterize alligator proteins with potential relevant immune function.
VisANT 3.0: new modules for pathway visualization, editing, prediction and construction.

PubMed

Hu, Zhenjun; Ng, David M; Yamada, Takuji; Chen, Chunnuan; Kawashima, Shuichi; Mellor, Joe; Linghu, Bolan; Kanehisa, Minoru; Stuart, Joshua M; DeLisi, Charles

2007-07-01

With the integration of the KEGG and Predictome databases as well as two search engines for coexpressed genes/proteins using data sets obtained from the Stanford Microarray Database (SMD) and Gene Expression Omnibus (GEO) database, VisANT 3.0 supports exploratory pathway analysis, which includes multi-scale visualization of multiple pathways, editing and annotating pathways using a KEGG compatible visual notation and visualization of expression data in the context of pathways. Expression levels are represented either by color intensity or by nodes with an embedded expression profile. Multiple experiments can be navigated or animated. Known KEGG pathways can be enriched by querying either coexpressed components of known pathway members or proteins with known physical interactions. Predicted pathways for genes/proteins with unknown functions can be inferred from coexpression or physical interaction data. Pathways produced in VisANT can be saved as computer-readable XML format (VisML), graphic images or high-resolution Scalable Vector Graphics (SVG). Pathways in the format of VisML can be securely shared within an interested group or published online using a simple Web link. VisANT is freely available at http://visant.bu.edu.
Identification of Anhydrobiosis-related Genes from an Expressed Sequence Tag Database in the Cryptobiotic Midge Polypedilum vanderplanki (Diptera; Chironomidae)*

PubMed Central

Cornette, Richard; Kanamori, Yasushi; Watanabe, Masahiko; Nakahara, Yuichi; Gusev, Oleg; Mitsumasu, Kanako; Kadono-Okuda, Keiko; Shimomura, Michihiko; Mita, Kazuei; Kikawada, Takahiro; Okuda, Takashi

2010-01-01

Some organisms are able to survive the loss of almost all their body water content, entering a latent state known as anhydrobiosis. The sleeping chironomid (Polypedilum vanderplanki) lives in the semi-arid regions of Africa, and its larvae can survive desiccation in an anhydrobiotic form during the dry season. To unveil the molecular mechanisms of this resistance to desiccation, an anhydrobiosis-related Expressed Sequence Tag (EST) database was obtained from the sequences of three cDNA libraries constructed from P. vanderplanki larvae after 0, 12, and 36 h of desiccation. The database contained 15,056 ESTs distributed into 4,807 UniGene clusters. ESTs were classified according to gene ontology categories, and putative expression patterns were deduced for all clusters on the basis of the number of clones in each library; expression patterns were confirmed by real-time PCR for selected genes. Among up-regulated genes, antioxidants, late embryogenesis abundant (LEA) proteins, and heat shock proteins (Hsps) were identified as important groups for anhydrobiosis. Genes related to trehalose metabolism and various transporters were also strongly induced by desiccation. Those results suggest that the oxidative stress response plays a central role in successful anhydrobiosis. Similarly, protein denaturation and aggregation may be prevented by marked up-regulation of Hsps and the anhydrobiosis-specific LEA proteins. A third major feature is the predicted increase in trehalose synthesis and in the expression of various transporter proteins allowing the distribution of trehalose and other solutes to all tissues. PMID:20833722
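The idea of deducing putative expression patterns from the number of clones per library can be sketched as follows. This is a minimal illustration, not the authors' procedure: the library sizes, the pseudocount, the 2-fold threshold and the function names are all assumptions introduced here.

```python
from typing import Dict

# Hypothetical total number of ESTs sequenced per desiccation time point.
LIBRARY_SIZES: Dict[str, int] = {"0h": 5000, "12h": 5000, "36h": 5000}


def clones_per_thousand(counts: Dict[str, int]) -> Dict[str, float]:
    """Normalize raw clone counts to clones per 1,000 ESTs in each library."""
    return {
        lib: 1000.0 * counts.get(lib, 0) / size
        for lib, size in LIBRARY_SIZES.items()
    }


def putative_pattern(counts: Dict[str, int], min_ratio: float = 2.0) -> str:
    """Label a cluster by comparing the 0 h library with the desiccated ones.

    min_ratio is an illustrative threshold, not a value from the study.
    """
    norm = clones_per_thousand(counts)
    baseline = norm["0h"] + 0.1                  # pseudocount avoids division by zero
    peak = max(norm["12h"], norm["36h"]) + 0.1   # highest desiccated abundance
    if peak / baseline >= min_ratio:
        return "up-regulated"
    if baseline / peak >= min_ratio:             # both desiccated libraries are low
        return "down-regulated"
    return "unchanged"


# Hypothetical cluster with 1, 8 and 12 clones in the three libraries.
print(putative_pattern({"0h": 1, "12h": 8, "36h": 12}))
```

Clone counts of this kind are noisy for low-abundance transcripts, which is consistent with the abstract's note that patterns were confirmed by real-time PCR for selected genes.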

ImmunemiR - A Database of Prioritized Immune miRNA Disease Associations and its Interactome.

PubMed

Prabahar, Archana; Natarajan, Jeyakumar

2017-01-01

MicroRNAs are the key regulators of gene expression and their abnormal expression in the immune system may be associated with several human diseases such as inflammation, cancer and autoimmune diseases. Elucidation of miRNA disease association through the interactome will deepen the understanding of its disease mechanisms. A specialized database for immune miRNAs is highly desirable to demonstrate the immune miRNA disease associations in the interactome. miRNAs specific to immune related diseases were retrieved from curated databases such as HMDD, miR2disease and PubMed literature based on MeSH classification of immune system diseases. The additional data such as miRNA target genes, genes coding protein-protein interaction information were compiled from related resources. Further, miRNAs were prioritized to specific immune diseases using a random walk ranking algorithm. In total 245 immune miRNAs associated with 92 OMIM disease categories were identified from external databases. The resultant data were compiled as ImmunemiR, a database of prioritized immune miRNA disease associations. This database provides both text based annotation information and network visualization of its interactome. To our knowledge, ImmunemiR is the first available database to provide a comprehensive repository of human immune disease associated miRNAs with network visualization options of its target genes, protein-protein interactions (PPI) and its disease associations. It is freely available at http://www.biominingbu.org/immunemir/. Copyright© Bentham Science Publishers.
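The abstract does not say which random walk variant was used for prioritization. As a rough illustration only, the sketch below uses random walk with restart, one common choice for network-based ranking; the toy network, the node names and the restart probability are hypothetical, and every node is assumed to have at least one neighbour.

```python
from typing import Dict, List


def random_walk_with_restart(
    graph: Dict[str, List[str]],
    seeds: List[str],
    restart: float = 0.3,
    tol: float = 1e-9,
    max_iter: int = 1000,
) -> Dict[str, float]:
    """Return steady-state visiting probabilities on an undirected graph.

    graph maps each node to its neighbours (every node needs at least one).
    """
    nodes = sorted(graph)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed node.
        nxt = {n: restart * p0[n] for n in nodes}
        # Otherwise it moves to a uniformly chosen neighbour.
        for n in nodes:
            share = (1.0 - restart) * p[n] / len(graph[n])
            for nb in graph[n]:
                nxt[nb] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical disease-miRNA-gene network (undirected adjacency lists).
toy_graph = {
    "disease": ["miR-A", "miR-B"],
    "miR-A": ["disease", "gene1"],
    "miR-B": ["disease"],
    "gene1": ["miR-A", "miR-C"],
    "miR-C": ["gene1"],
}
scores = random_walk_with_restart(toy_graph, seeds=["disease"])
ranked = sorted((n for n in scores if n.startswith("miR-")), key=scores.get, reverse=True)
print(ranked)
```

In this toy network the miRNAs closest to the seed disease node receive the highest scores, which is the intuition behind using a random walk to prioritize candidates.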
An emerging cyberinfrastructure for biodefense pathogen and pathogen-host data.

PubMed

Zhang, C; Crasta, O; Cammer, S; Will, R; Kenyon, R; Sullivan, D; Yu, Q; Sun, W; Jha, R; Liu, D; Xue, T; Zhang, Y; Moore, M; McGarvey, P; Huang, H; Chen, Y; Zhang, J; Mazumder, R; Wu, C; Sobral, B

2008-01-01

The NIAID-funded Biodefense Proteomics Resource Center (RC) provides storage, dissemination, visualization and analysis capabilities for the experimental data deposited by seven Proteomics Research Centers (PRCs). The data and its publication is to support researchers working to discover candidates for the next generation of vaccines, therapeutics and diagnostics against NIAID's Category A, B and C priority pathogens. The data includes transcriptional profiles, protein profiles, protein structural data and host-pathogen protein interactions, in the context of the pathogen life cycle in vivo and in vitro. The database has stored and supported host or pathogen data derived from Bacillus, Brucella, Cryptosporidium, Salmonella, SARS, Toxoplasma, Vibrio and Yersinia, human tissue libraries, and mouse macrophages. These publicly available data cover diverse data types such as mass spectrometry, yeast two-hybrid (Y2H), gene expression profiles, X-ray and NMR determined protein structures and protein expression clones. The growing database covers over 23 000 unique genes/proteins from different experiments and organisms. All of the genes/proteins are annotated and integrated across experiments using UniProt Knowledgebase (UniProtKB) accession numbers. The web-interface for the database enables searching, querying and downloading at the level of experiment, group and individual gene(s)/protein(s) via UniProtKB accession numbers or protein function keywords. The system is accessible at http://www.proteomicsresource.org/.
Detection of alternative splice variants at the proteome level in Aspergillus flavus.

PubMed

Chang, Kung-Yen; Georgianna, D Ryan; Heber, Steffen; Payne, Gary A; Muddiman, David C

2010-03-05

Identification of proteins from proteolytic peptides or intact proteins plays an essential role in proteomics. Researchers use search engines to match the acquired peptide sequences to the target proteins. However, search engines depend on protein databases to provide candidates for consideration. Alternative splicing (AS), the mechanism by which the exons of pre-mRNAs can be spliced and rearranged to generate distinct mRNA and therefore protein variants, enables higher eukaryotic organisms, with only a limited number of genes, to have the requisite complexity and diversity at the proteome level. Multiple alternative isoforms from one gene often share common segments of sequences. However, many protein databases only include a limited number of isoforms to keep redundancy minimal. As a result, the database search might not identify a target protein even with high quality tandem MS data and accurate intact precursor ion mass. We computationally predicted an exhaustive list of putative isoforms of Aspergillus flavus proteins from 20 371 expressed sequence tags to investigate whether an alternative splicing protein database can assign a greater proportion of mass spectrometry data. The newly constructed AS database provided 9807 new alternatively spliced variants in addition to 12 832 previously annotated proteins. The searches of the existing tandem MS spectra data set using the AS database identified 29 new proteins encoded by 26 genes. Nine fungal genes appeared to have multiple protein isoforms. In addition to the discovery of splice variants, the AS database also showed potential to improve genome annotation. In summary, the introduction of an alternative splicing database helps identify more proteins and unveils more information about a proteome.
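Why a database lacking an isoform can miss a protein even with good spectra can be shown with a small in silico digest. The sketch below is a simplification under stated assumptions: the sequences and entry names are invented, peptides are matched as exact strings rather than by fragment masses as real search engines do, and trypsin is modelled as cleaving after K or R unless the next residue is P, with no missed cleavages.

```python
import re
from typing import Dict, List, Set


def tryptic_peptides(sequence: str, min_len: int = 5) -> List[str]:
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if len(p) >= min_len]


def match_peptides(observed: Set[str], database: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each observed peptide to the database entries whose digest contains it."""
    index: Dict[str, List[str]] = {}
    for name, seq in database.items():
        for pep in set(tryptic_peptides(seq)):
            index.setdefault(pep, []).append(name)
    return {pep: sorted(index.get(pep, [])) for pep in sorted(observed)}


# Hypothetical annotated protein and an exon-skipping isoform of the same gene.
annotated_db = {"geneX": "MSTAVLEKGDFLPRAAQWEVTKLLNDSGYR"}
as_db = dict(annotated_db, geneX_iso2="MSTAVLEKGDFNDSGYR")

# Peptides assumed to have been observed; the second spans the new exon junction.
observed = {"MSTAVLEK", "GDFNDSGYR"}

print(match_peptides(observed, annotated_db))
print(match_peptides(observed, as_db))
```

Against the annotated database the junction-spanning peptide finds no candidate, while the isoform-aware database assigns it, which mirrors the abstract's argument for searching an alternative splicing database.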
Cytoscape: a software environment for integrated models of biomolecular interaction networks.

PubMed

Shannon, Paul; Markiel, Andrew; Ozier, Owen; Baliga, Nitin S; Wang, Jonathan T; Ramage, Daniel; Amin, Nada; Schwikowski, Benno; Ideker, Trey

2003-11-01

Cytoscape is an open source software project for integrating biomolecular interaction networks with high-throughput expression data and other molecular states into a unified conceptual framework. Although applicable to any system of molecular components and interactions, Cytoscape is most powerful when used in conjunction with large databases of protein-protein, protein-DNA, and genetic interactions that are increasingly available for humans and model organisms. Cytoscape's software Core provides basic functionality to layout and query the network; to visually integrate the network with expression profiles, phenotypes, and other molecular states; and to link the network to databases of functional annotations. The Core is extensible through a straightforward plug-in architecture, allowing rapid development of additional computational analyses and features. Several case studies of Cytoscape plug-ins are surveyed, including a search for interaction pathways correlating with changes in gene expression, a study of protein complexes involved in cellular recovery to DNA damage, inference of a combined physical/functional interaction network for Halobacterium, and an interface to detailed stochastic/kinetic gene regulatory models.
Inconsistencies in the red blood cell membrane proteome analysis: generation of a database for research and diagnostic applications

PubMed Central

Hegedűs, Tamás; Chaubey, Pururawa Mayank; Várady, György; Szabó, Edit; Sarankó, Hajnalka; Hofstetter, Lia; Roschitzki, Bernd; Sarkadi, Balázs

2015-01-01

Based on recent results, the determination of the easily accessible red blood cell (RBC) membrane proteins may provide new diagnostic possibilities for assessing mutations, polymorphisms or regulatory alterations in diseases. However, the analysis of the current mass spectrometry-based proteomics datasets and other major databases indicates inconsistencies: the results show large scattering and only a limited overlap for the identified RBC membrane proteins. Here, we applied membrane-specific proteomics studies in human RBC, compared these results with the data in the literature, and generated a comprehensive and expandable database using all available data sources. The integrated web database now refers to proteomic, genetic and medical databases as well, and contains an unexpectedly large number of validated membrane proteins previously thought to be specific for other tissues and/or related to major human diseases. Since the determination of protein expression in RBC provides a method to indicate pathological alterations, our database should facilitate the development of RBC membrane biomarker platforms and provide a unique resource to aid related further research and diagnostics. Database URL: http://rbcc.hegelab.org PMID:26078478
An expressed sequence tag (EST) data mining strategy succeeding in the discovery of new G-protein coupled receptors.

PubMed

Wittenberger, T; Schaller, H C; Hellebrand, S

2001-03-30

We have developed a comprehensive expressed sequence tag database search method and used it for the identification of new members of the G-protein coupled receptor superfamily. Our approach proved to be especially useful for the detection of expressed sequence tag sequences that do not encode conserved parts of a protein, making it an ideal tool for the identification of members of divergent protein families or of protein parts without conserved domain structures in the expressed sequence tag database. At least 14 of the expressed sequence tags found with this strategy are promising candidates for new putative G-protein coupled receptors. Here, we describe the sequence and expression analysis of five new members of this receptor superfamily, namely GPR84, GPR86, GPR87, GPR90 and GPR91. We also studied the genomic structure and chromosomal localization of the respective genes applying in silico methods. A cluster of six closely related G-protein coupled receptors was found on the human chromosome 3q24-3q25. It consists of four orphan receptors (GPR86, GPR87, GPR91, and H963), the purinergic receptor P2Y1, and the uridine 5'-diphosphoglucose receptor KIAA0001. It seems likely that these receptors evolved from a common ancestor and therefore might have related ligands. In conclusion, we describe a data mining procedure that proved to be useful for the identification and first characterization of new genes and is well applicable for other gene families. Copyright 2001 Academic Press.
An emerging cyberinfrastructure for biodefense pathogen and pathogen–host data

PubMed Central

Zhang, C.; Crasta, O.; Cammer, S.; Will, R.; Kenyon, R.; Sullivan, D.; Yu, Q.; Sun, W.; Jha, R.; Liu, D.; Xue, T.; Zhang, Y.; Moore, M.; McGarvey, P.; Huang, H.; Chen, Y.; Zhang, J.; Mazumder, R.; Wu, C.; Sobral, B.

2008-01-01

The NIAID-funded Biodefense Proteomics Resource Center (RC) provides storage, dissemination, visualization and analysis capabilities for the experimental data deposited by seven Proteomics Research Centers (PRCs). The data and its publication is to support researchers working to discover candidates for the next generation of vaccines, therapeutics and diagnostics against NIAID's Category A, B and C priority pathogens. The data includes transcriptional profiles, protein profiles, protein structural data and host–pathogen protein interactions, in the context of the pathogen life cycle in vivo and in vitro. The database has stored and supported host or pathogen data derived from Bacillus, Brucella, Cryptosporidium, Salmonella, SARS, Toxoplasma, Vibrio and Yersinia, human tissue libraries, and mouse macrophages. These publicly available data cover diverse data types such as mass spectrometry, yeast two-hybrid (Y2H), gene expression profiles, X-ray and NMR determined protein structures and protein expression clones. The growing database covers over 23 000 unique genes/proteins from different experiments and organisms. All of the genes/proteins are annotated and integrated across experiments using UniProt Knowledgebase (UniProtKB) accession numbers. The web-interface for the database enables searching, querying and downloading at the level of experiment, group and individual gene(s)/protein(s) via UniProtKB accession numbers or protein function keywords. The system is accessible at http://www.proteomicsresource.org/. PMID:17984082
Proteomic analysis of pollination-induced corolla senescence in petunia.

PubMed

Bai, Shuangyi; Willard, Belinda; Chapin, Laura J; Kinter, Michael T; Francis, David M; Stead, Anthony D; Jones, Michelle L

2010-02-01

Senescence represents the last phase of petal development during which macromolecules and organelles are degraded and nutrients are recycled to developing tissues. To understand better the post-transcriptional changes regulating petal senescence, a proteomic approach was used to profile protein changes during the senescence of Petunia × hybrida 'Mitchell Diploid' corollas. Total soluble proteins were extracted from unpollinated petunia corollas at 0, 24, 48, and 72 h after flower opening and at 24, 48, and 72 h after pollination. Two-dimensional gel electrophoresis (2-DE) was used to identify proteins that were differentially expressed in non-senescing (unpollinated) and senescing (pollinated) corollas, and image analysis was used to determine which proteins were up- or down-regulated by the experimentally determined cut-off of 2.1-fold for P < 0.05. One hundred and thirty-three differentially expressed protein spots were selected for sequencing. Liquid chromatography-tandem mass spectrometry (LC-MS/MS) was used to determine the identity of these proteins. Searching translated EST databases and the NCBI non-redundant protein database, it was possible to assign a putative identification to greater than 90% of these proteins. Many of the senescence up-regulated proteins were putatively involved in defence and stress responses or macromolecule catabolism. Some proteins, not previously characterized during flower senescence, were identified, including an orthologue of the tomato abscisic acid stress ripening protein 4 (ASR4). Gene expression patterns did not always correlate with protein expression, confirming that both proteomic and genomic approaches will be required to obtain a detailed understanding of the regulation of petal senescence.
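The 2.1-fold, P < 0.05 selection rule described above can be written as a small classifier. This is only a sketch of the thresholding step: the spot records, field names and values are hypothetical, and the statistical test that produces each P value is assumed to have been run elsewhere.

```python
from typing import List, NamedTuple


class Spot(NamedTuple):
    spot_id: str
    pollinated: float    # normalized spot volume, senescing corolla
    unpollinated: float  # normalized spot volume, non-senescing corolla
    p_value: float       # from a test performed upstream (assumed)


def classify(spot: Spot, fold_cutoff: float = 2.1, alpha: float = 0.05) -> str:
    """Apply the fold-change and significance cut-offs to one gel spot."""
    if spot.p_value >= alpha:
        return "unchanged"
    ratio = spot.pollinated / spot.unpollinated
    if ratio >= fold_cutoff:
        return "up"
    if ratio <= 1.0 / fold_cutoff:
        return "down"
    return "unchanged"


# Hypothetical spots for illustration only.
spots: List[Spot] = [
    Spot("s1", 4.2, 1.0, 0.01),   # large, significant increase
    Spot("s2", 1.0, 3.0, 0.02),   # large, significant decrease
    Spot("s3", 5.0, 1.0, 0.20),   # large change but not significant
    Spot("s4", 1.5, 1.0, 0.001),  # significant but below the fold cut-off
]
for s in spots:
    print(s.spot_id, classify(s))
```

Requiring both criteria keeps spots with large but unreliable changes, and spots with reliable but small changes, out of the set sent for sequencing.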
The Histone Database: an integrated resource for histones and histone fold-containing proteins

PubMed Central

Mariño-Ramírez, Leonardo; Levine, Kevin M.; Morales, Mario; Zhang, Suiyuan; Moreland, R. Travis; Baxevanis, Andreas D.; Landsman, David

2011-01-01

Eukaryotic chromatin is composed of DNA and protein components—core histones—that act to compactly pack the DNA into nucleosomes, the fundamental building blocks of chromatin. These nucleosomes are connected to adjacent nucleosomes by linker histones. Nucleosomes are highly dynamic and, through various core histone post-translational modifications and incorporation of diverse histone variants, can serve as epigenetic marks to control processes such as gene expression and recombination. The Histone Sequence Database is a curated collection of sequences and structures of histones and non-histone proteins containing histone folds, assembled from major public databases. Here, we report a substantial increase in the number of sequences and taxonomic coverage for histone and histone fold-containing proteins available in the database. Additionally, the database now contains an expanded dataset that includes archaeal histone sequences. The database also provides comprehensive multiple sequence alignments for each of the four core histones (H2A, H2B, H3 and H4), the linker histones (H1/H5) and the archaeal histones. The database also includes current information on solved histone fold-containing structures. The Histone Sequence Database is an inclusive resource for the analysis of chromatin structure and function focused on histones and histone fold-containing proteins. Database URL: The Histone Sequence Database is freely available and can be accessed at http://research.nhgri.nih.gov/histones/. PMID:22025671
Database resources of the National Center for Biotechnology Information

PubMed Central

2015-01-01

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (Bookshelf, PubMed Central (PMC) and PubReader); medical genetics (ClinVar, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen); genes and genomics (BioProject, BioSample, dbSNP, dbVar, Epigenomics, Gene, Gene Expression Omnibus (GEO), Genome, HomoloGene, the Map Viewer, Nucleotide, PopSet, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser, Trace Archive and UniGene); and proteins and chemicals (Biosystems, COBALT, the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB), Protein Clusters, Protein and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for many of these databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov. PMID:25398906
Database resources of the National Center for Biotechnology Information

PubMed Central

2016-01-01

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (PubMed Central (PMC), Bookshelf and PubReader), health (ClinVar, dbGaP, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen), genomes (BioProject, Assembly, Genome, BioSample, dbSNP, dbVar, Epigenomics, the Map Viewer, Nucleotide, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser and the Trace Archive), genes (Gene, Gene Expression Omnibus (GEO), HomoloGene, PopSet and UniGene), proteins (Protein, the Conserved Domain Database (CDD), COBALT, Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB) and Protein Clusters) and chemicals (Biosystems and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for most of these databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. PMID:26615191
Differential gene expression analysis in glioblastoma cells and normal human brain cells based on GEO database.

PubMed

Wang, Anping; Zhang, Guibin

2017-11-01

The differentially expressed genes between glioblastoma (GBM) cells and normal human brain cells were investigated in order to perform pathway analysis and protein interaction network analysis for the differentially expressed genes. GSE12657 and GSE42656 gene chips, which contain gene expression profiles of GBM, were obtained from the Gene Expression Omnibus (GEO) database of the National Center for Biotechnology Information (NCBI). The 'limma' package in 'R' software was used to analyze the differentially expressed genes in the two gene chips, and gene integration was performed using the 'RobustRankAggreg' package. Finally, pheatmap software was used for heatmap analysis and Cytoscape, DAVID, STRING and KOBAS were used for protein-protein interaction, Gene Ontology (GO) and KEGG analyses. The results were as follows: i) 702 differentially expressed genes were identified in GSE12657, among which 548 were significantly upregulated and 154 were significantly downregulated (p<0.01, fold-change >1), and 1,854 differentially expressed genes were identified in GSE42656, among which 1,068 were significantly upregulated and 786 were significantly downregulated (p<0.01, fold-change >1). A total of 167 differentially expressed genes including 100 upregulated genes and 67 downregulated genes were identified after gene integration, and the genes showed significantly different expression levels in GBM compared with normal human brain cells (p<0.05). ii) Interactions between the protein products of 101 differentially expressed genes were identified using STRING and an expression network was established. A key gene, called CALM3, was identified by Cytoscape software. iii) GO enrichment analysis showed that differentially expressed genes were mainly enriched in 'neurotransmitter:sodium symporter activity' and 'neurotransmitter transporter activity', which can affect the activity of neurotransmitter transportation. KEGG pathway analysis showed that the differentially expressed genes were mainly enriched in 'protein processing in endoplasmic reticulum', which can affect protein processing in endoplasmic reticulum. The results showed that: i) 167 differentially expressed genes were identified from two gene chips after integration; and ii) a protein interaction network was established, and GO and KEGG pathway analyses were successfully performed to identify and annotate the key gene, which provide new insights for the studies on GBM at gene level.
Molecular differences between mature and immature dental pulp cells: Bioinformatics and preliminary results.

PubMed

Chen, Long; Jiang, Yifeng; Du, Zhen

2018-04-01

Although previous studies have demonstrated that dental pulp stem cells (DPSCs) from mature and immature teeth exhibit potential for multi-directional differentiation, the molecular and biological difference between the DPSCs from mature and immature permanent teeth has not been fully investigated. In the present study, 500 differentially expressed genes from dental pulp cells (DPCs) in mature and immature permanent teeth were obtained from the Gene Expression Omnibus online database. Based on bioinformatics analysis using the Database for Annotation, Visualization and Integrated Discovery, these genes were divided into a number of subgroups associated with immunity, inflammation and cell signaling. The results of the present study suggest that immune features, response to infection and cell signaling may be different in DPCs from mature and immature permanent teeth; furthermore, DPCs from immature permanent teeth may be more suitable for use in tissue engineering or stem cell therapy. The Online Mendelian Inheritance in Man database stated that Sonic Hedgehog (SHH), a differentially expressed gene in DPCs from mature and immature permanent teeth, serves a crucial role in the development of craniofacial tissues, including teeth, which further confirmed that SHH may cause DPCs from mature and immature permanent teeth to exhibit different biological characteristics. The Search Tool for the Retrieval of Interacting Genes/Proteins database revealed that SHH has functional protein associations with a number of other proteins, including Glioma-associated oncogene (GLI)1, GLI2, growth arrest-specific protein 1, bone morphogenetic protein (BMP)2 and BMP4, in mice and humans. It was also demonstrated that SHH may interact with other genes to regulate the biological characteristics of DPCs. 
The results of the present study may provide a useful reference basis for selecting suitable DPSCs and molecules for the treatment of these cells to optimize features for tissue engineering or stem cell therapy. Quantitative polymerase chain reaction should be performed to confirm the differential expression of these genes prior to the beginning of a functional study.
Exploring of the molecular mechanism of rhinitis via bioinformatics methods

PubMed Central

Song, Yufen; Yan, Zhaohui

2018-01-01

The aim of this study was to analyze gene expression profiles for exploring the function and regulatory network of differentially expressed genes (DEGs) in the pathogenesis of rhinitis by a bioinformatics method. The gene expression profile of GSE43523 was downloaded from the Gene Expression Omnibus database. The dataset contained 7 seasonal allergic rhinitis samples and 5 non-allergic normal samples. DEGs between rhinitis samples and normal samples were identified via the limma package of R. The WebGestalt database was used to identify enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways of the DEGs. The differentially co-expressed pairs of the DEGs were identified via the DCGL package in R, and the differential co-expression network was constructed based on these pairs. A protein-protein interaction (PPI) network of the DEGs was constructed based on the Search Tool for the Retrieval of Interacting Genes database. A total of 263 DEGs were identified in rhinitis samples compared with normal samples, including 125 downregulated ones and 138 upregulated ones. The DEGs were enriched in 7 KEGG pathways. 308 differential co-expression gene pairs were obtained. A differential co-expression network was constructed, containing 212 nodes. In total, 148 PPI pairs of the DEGs were identified, and a PPI network was constructed based on these pairs. Bioinformatics methods could help us identify significant genes and pathways related to the pathogenesis of rhinitis. Steroid biosynthesis pathway and metabolic pathways might play important roles in the development of allergic rhinitis (AR). Genes such as CDC42 effector protein 5, solute carrier family 39 member A11 and PR/SET domain 10 might also be associated with the pathogenesis of AR, which provides references for the molecular mechanisms of AR. PMID:29257233
Proteomic analysis of tardigrades: towards a better understanding of molecular mechanisms by anhydrobiotic organisms.

PubMed

Schokraie, Elham; Hotz-Wagenblatt, Agnes; Warnken, Uwe; Mali, Brahim; Frohme, Marcus; Förster, Frank; Dandekar, Thomas; Hengherr, Steffen; Schill, Ralph O; Schnölzer, Martina

2010-03-03

Tardigrades are small, multicellular invertebrates which are able to survive times of unfavourable environmental conditions using their well-known capability to undergo cryptobiosis at any stage of their life cycle. Milnesium tardigradum has become a powerful model system for the analysis of cryptobiosis. While some genetic information is already available for Milnesium tardigradum the proteome is still to be discovered. Here we present to the best of our knowledge the first comprehensive study of Milnesium tardigradum on the protein level. To establish a proteome reference map we developed optimized protocols for protein extraction from tardigrades in the active state and for separation of proteins by high resolution two-dimensional gel electrophoresis. Since only limited sequence information of M. tardigradum on the genome and gene expression level is available to date in public databases we initiated in parallel a tardigrade EST sequencing project to allow for protein identification by electrospray ionization tandem mass spectrometry. 271 out of 606 analyzed protein spots could be identified by searching against the publicly available NCBInr database as well as our newly established tardigrade protein database corresponding to 144 unique proteins. Another 150 spots could be identified in the tardigrade clustered EST database corresponding to 36 unique contigs and ESTs. Proteins with annotated function were further categorized in more detail by their molecular function, biological process and cellular component. For the proteins of unknown function more information could be obtained by performing a protein domain annotation analysis. Our results include proteins like protein member of different heat shock protein families and LEA group 3, which might play important roles in surviving extreme conditions. 
The proteome reference map of Milnesium tardigradum provides the basis for further studies in order to identify and characterize the biochemical mechanisms of tolerance to extreme desiccation. The optimized proteomics workflow will enable application of sensitive quantification techniques to detect differences in protein expression, which are characteristic of the active and anhydrobiotic states of tardigrades.
Proteomic Analysis of Tardigrades: Towards a Better Understanding of Molecular Mechanisms by Anhydrobiotic Organisms

PubMed Central

Schokraie, Elham; Hotz-Wagenblatt, Agnes; Warnken, Uwe; Mali, Brahim; Frohme, Marcus; Förster, Frank; Dandekar, Thomas; Hengherr, Steffen; Schill, Ralph O.; Schnölzer, Martina

2010-01-01

Background: Tardigrades are small, multicellular invertebrates which are able to survive times of unfavourable environmental conditions using their well-known capability to undergo cryptobiosis at any stage of their life cycle. Milnesium tardigradum has become a powerful model system for the analysis of cryptobiosis. While some genetic information is already available for Milnesium tardigradum the proteome is still to be discovered. Principal Findings: Here we present to the best of our knowledge the first comprehensive study of Milnesium tardigradum on the protein level. To establish a proteome reference map we developed optimized protocols for protein extraction from tardigrades in the active state and for separation of proteins by high resolution two-dimensional gel electrophoresis. Since only limited sequence information of M. tardigradum on the genome and gene expression level is available to date in public databases we initiated in parallel a tardigrade EST sequencing project to allow for protein identification by electrospray ionization tandem mass spectrometry. 271 out of 606 analyzed protein spots could be identified by searching against the publicly available NCBInr database as well as our newly established tardigrade protein database corresponding to 144 unique proteins. Another 150 spots could be identified in the tardigrade clustered EST database corresponding to 36 unique contigs and ESTs. Proteins with annotated function were further categorized in more detail by their molecular function, biological process and cellular component. For the proteins of unknown function more information could be obtained by performing a protein domain annotation analysis. Our results include proteins like protein member of different heat shock protein families and LEA group 3, which might play important roles in surviving extreme conditions. Conclusions: The proteome reference map of Milnesium tardigradum provides the basis for further studies in order to identify and characterize the biochemical mechanisms of tolerance to extreme desiccation. The optimized proteomics workflow will enable application of sensitive quantification techniques to detect differences in protein expression, which are characteristic of the active and anhydrobiotic states of tardigrades. PMID:20224743
Phenome-genome association studies of pancreatic cancer: new targets for therapy and diagnosis.

PubMed

Narayanan, Ramaswamy

2015-01-01

Pancreatic cancer has a very high mortality rate and requires novel molecular targets for diagnosis and therapy. Genetic association studies over databases offer an attractive starting point for gene discovery. The National Center for Biotechnology Information (NCBI) Phenome Genome Integrator (PheGenI) tool was enriched for pancreatic cancer-associated traits. The genes associated with the trait were characterized using diverse bioinformatics tools for Genome-Wide Association (GWA), transcriptome and proteome profile and protein classes for motif and domain. Two hundred twenty-six genes were identified that had a genetic association with pancreatic cancer in the human genome. This included 25 uncharacterized open reading frames (ORFs). Bioinformatics analysis of these ORFs identified putative druggable proteins and biomarkers including enzymes, transporters and G-protein-coupled receptor signaling proteins. Secreted proteins including a neuroendocrine factor and a chemokine were identified. Five of these ORFs encompassed noncoding RNAs. The ORF protein expression was detected in numerous body fluids, such as ascites, bile, pancreatic juice, milk, plasma, serum and saliva. Transcriptome and proteome analyses showed a correlation of mRNA and protein expression for nine ORFs. Analysis of the Catalogue of Somatic Mutations in Cancer (COSMIC) database revealed a strong correlation across copy number variations and mRNA over-expression for four ORFs. Mining of the International Cancer Gene Consortium (ICGC) database identified somatic mutations in a significant number of pancreatic cancer patients' tumors for most of these ORFs. The pancreatic cancer-associated ORFs were also found to be genetically associated with other neoplasms, including leukemia, malignant melanoma, neuroblastoma and prostate carcinomas, as well as other unrelated diseases and disorders, such as Alzheimer's disease, Crohn's disease, coronary diseases, attention deficit disorder and addiction.
Based on Genome-Wide Association Studies (GWAS), copy number variations, somatic mutational status and correlation of gene expression in pancreatic tumors at the mRNA and protein level, expression specificity in normal tissues and detection in body fluids, six ORFs emerged as putative leads for pancreatic cancer. These six targets provide a basis for accelerated drug discovery and diagnostic marker development for pancreatic cancer. Copyright© 2015, International Institute of Anticancer Research (Dr. John G. Delinasios), All rights reserved.
Tissue Molecular Anatomy Project (TMAP): an expression database for comparative cancer proteomics.

PubMed

Medjahed, Djamel; Luke, Brian T; Tontesh, Tawady S; Smythers, Gary W; Munroe, David J; Lemkin, Peter F

2003-08-01

By mining publicly accessible databases, we have developed a collection of tissue-specific predictive protein expression maps as a function of cancer histological state. Data analysis is applied to the differential expression of gene products in pooled libraries from the normal to the altered state(s). We wish to report the initial results of our survey across different tissues and explore the extent to which this comparative approach may help uncover panels of potential biomarkers of tumorigenesis which would warrant further examination in the laboratory.
Analysis of gene expression profile microarray data in complex regional pain syndrome.

PubMed

Tan, Wulin; Song, Yiyan; Mo, Chengqiang; Jiang, Shuangjian; Wang, Zhongxing

2017-09-01

The aim of the present study was to predict key genes and proteins associated with complex regional pain syndrome (CRPS) using bioinformatics analysis. The gene expression profiling microarray data, GSE47603, which included peripheral blood samples from 4 patients with CRPS and 5 healthy controls, was obtained from the Gene Expression Omnibus (GEO) database. The differentially expressed genes (DEGs) in CRPS patients compared with healthy controls were identified using the GEO2R online tool. Functional enrichment analysis was then performed using The Database for Annotation Visualization and Integrated Discovery online tool. Protein‑protein interaction (PPI) network analysis was subsequently performed using Search Tool for the Retrieval of Interaction Genes database and analyzed with Cytoscape software. A total of 257 DEGs were identified, including 243 upregulated genes and 14 downregulated ones. Genes in the human leukocyte antigen (HLA) family were most significantly differentially expressed. Enrichment analysis demonstrated that signaling pathways, including immune response, cell motion, adhesion and angiogenesis were associated with CRPS. PPI network analysis revealed that key genes, including early region 1A binding protein p300 (EP300), CREB‑binding protein (CREBBP), signal transducer and activator of transcription (STAT)3, STAT5A and integrin α M were associated with CRPS. The results suggest that the immune response may therefore serve an important role in CRPS development. In addition, genes in the HLA family, such as HLA‑DQB1 and HLA‑DRB1, may present potential biomarkers for the diagnosis of CRPS. Furthermore, EP300, its paralog CREBBP, and the STAT family genes, STAT3 and STAT5 may be important in the development of CRPS.
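The split of DEGs into upregulated and downregulated genes described in this record can be illustrated with a minimal sketch. It assumes a table of log2 fold changes and adjusted P values of the kind a differential-expression tool exports; the gene names, the values, and the cutoffs (|log2FC| >= 1, adjusted P < 0.05) are hypothetical and are not taken from the study.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    gene: str
    log2_fc: float  # log2 fold change, patients vs. controls
    adj_p: float    # multiple-testing-adjusted P value


def classify_degs(stats, min_abs_log2_fc=1.0, max_adj_p=0.05):
    """Split genes into up- and downregulated lists using assumed cutoffs."""
    up, down = [], []
    for s in stats:
        if s.adj_p >= max_adj_p or abs(s.log2_fc) < min_abs_log2_fc:
            continue  # not differentially expressed under these cutoffs
        (up if s.log2_fc > 0 else down).append(s.gene)
    return up, down


# Hypothetical values, for illustration only.
example = [
    GeneStat("GENE_A", 2.3, 0.001),
    GeneStat("GENE_B", -1.6, 0.020),
    GeneStat("GENE_C", 0.4, 0.010),  # fold change too small
    GeneStat("GENE_D", 3.0, 0.300),  # not significant
]
up_genes, down_genes = classify_degs(example)
```

With only nine samples, as in this record, the choice of cutoffs strongly affects the resulting gene list, so the thresholds above should be read as placeholders.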
Introducing the CPL/MUW proteome database: interpretation of human liver and liver cancer proteome profiles by referring to isolated primary cells.

PubMed

Wimmer, Helge; Gundacker, Nina C; Griss, Johannes; Haudek, Verena J; Stättner, Stefan; Mohr, Thomas; Zwickl, Hannes; Paulitschke, Verena; Baron, David M; Trittner, Wolfgang; Kubicek, Markus; Bayer, Editha; Slany, Astrid; Gerner, Christopher

2009-06-01

Interpretation of proteome data with a focus on biomarker discovery largely relies on comparative proteome analyses. Here, we introduce a database-assisted interpretation strategy based on proteome profiles of primary cells. Both 2-D-PAGE and shotgun proteomics are applied. We obtain high data concordance with these two different techniques. When applying mass analysis of tryptic spot digests from 2-D gels of cytoplasmic fractions, we typically identify several hundred proteins. Using the same protein fractions, we usually identify more than a thousand proteins by shotgun proteomics. The data consistency obtained when comparing these independent data sets exceeds 99% of the proteins identified in the 2-D gels. Many characteristic differences in protein expression of different cells can thus be independently confirmed. Our self-designed SQL database (CPL/MUW - database of the Clinical Proteomics Laboratories at the Medical University of Vienna accessible via www.meduniwien.ac.at/proteomics/database) facilitates (i) quality management of protein identification data, which are based on MS, (ii) the detection of cell type-specific proteins and (iii) of molecular signatures of specific functional cell states. Here, we demonstrate how the interpretation of proteome profiles obtained from human liver tissue and hepatocellular carcinoma tissue is assisted by the Clinical Proteomics Laboratories at the Medical University of Vienna-database. Therefore, we suggest that the use of reference experiments supported by a tailored database may substantially facilitate data interpretation of proteome profiling experiments.

Creation of a Human Secretome: A Novel Composite Library of Human Secreted Proteins: Validation Using Ovarian Cancer Gene Expression Data and a Virtual Secretome Array.

PubMed

Vathipadiekal, Vinod; Wang, Victoria; Wei, Wei; Waldron, Levi; Drapkin, Ronny; Gillette, Michael; Skates, Steven; Birrer, Michael

2015-11-01

To generate a comprehensive "Secretome" of proteins potentially found in the blood and derive a virtual Affymetrix array. To validate the utility of this database for the discovery of novel serum-based biomarkers using ovarian cancer transcriptomic data. The secretome was constructed by aggregating the data from databases of known secreted proteins, transmembrane or membrane proteins, signal peptides, G-protein coupled receptors, or proteins existing in the extracellular region, and the virtual array was generated by mapping them to Affymetrix probeset identifiers. Whole-genome microarray data from ovarian cancer, normal ovarian surface epithelium, and fallopian tube epithelium were used to identify transcripts upregulated in ovarian cancer. We established the secretome from eight public databases and a virtual array consisting of 16,521 Affymetrix U133 Plus 2.0 probesets. Using ovarian cancer transcriptomic data, we identified candidate blood-based biomarkers for ovarian cancer and performed bioinformatic validation by demonstrating rediscovery of known biomarkers including CA125 and HE4. Two novel top biomarkers (FGF18 and GPR172A) were validated in serum samples from an independent patient cohort. We present the secretome, comprising the most comprehensive resource available for protein products that are potentially found in the blood. The associated virtual array can be used to translate gene-expression data into cancer biomarker discovery. A list of blood-based biomarkers for ovarian cancer detection is reported and includes CA125 and HE4. FGF18 and GPR172A were identified and validated by ELISA as being differentially expressed in the serum of ovarian cancer patients compared with controls. ©2015 American Association for Cancer Research.
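The construction described in this record (aggregate several protein lists, then map the result to array probesets) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' pipeline: the source names, gene symbols, and probeset identifiers are all hypothetical.

```python
def build_secretome(sources):
    """Return the union of gene symbols across all source lists."""
    secretome = set()
    for genes in sources.values():
        secretome.update(genes)
    return secretome


def virtual_array(secretome, gene_to_probesets):
    """Collect probeset IDs for every secretome gene that has a mapping."""
    probesets = set()
    for gene in secretome:
        probesets.update(gene_to_probesets.get(gene, ()))
    return probesets


# Hypothetical inputs, for illustration only.
sources = {
    "secreted_db": {"GENE_A", "GENE_B"},
    "membrane_db": {"GENE_B", "GENE_C"},
    "signal_peptide_db": {"GENE_D"},
}
gene_to_probesets = {
    "GENE_A": ["ps_001"],
    "GENE_B": ["ps_002", "ps_003"],
    "GENE_C": ["ps_004"],
    # GENE_D has no probeset in this toy mapping and is silently skipped.
}
secretome = build_secretome(sources)
array = virtual_array(secretome, gene_to_probesets)
```

Taking a union rather than an intersection favours coverage over specificity, which matches the record's stated aim of listing proteins "potentially" found in blood.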
SCOPe: Manual Curation and Artifact Removal in the Structural Classification of Proteins - extended Database.

PubMed

Chandonia, John-Marc; Fox, Naomi K; Brenner, Steven E

2017-02-03

SCOPe (Structural Classification of Proteins-extended, http://scop.berkeley.edu) is a database of relationships between protein structures that extends the Structural Classification of Proteins (SCOP) database. SCOP is an expert-curated ordering of domains from the majority of proteins of known structure in a hierarchy according to structural and evolutionary relationships. SCOPe classifies the majority of protein structures released since SCOP development concluded in 2009, using a combination of manual curation and highly precise automated tools, aiming to have the same accuracy as fully hand-curated SCOP releases. SCOPe also incorporates and updates the ASTRAL compendium, which provides several databases and tools to aid in the analysis of the sequences and structures of proteins classified in SCOPe. SCOPe continues high-quality manual classification of new superfamilies, a key feature of SCOP. Artifacts such as expression tags are now separated into their own class, in order to distinguish them from the homology-based annotations in the remainder of the SCOPe hierarchy. SCOPe 2.06 contains 77,439 Protein Data Bank entries, double the 38,221 structures classified in SCOP. Copyright © 2016 The Author(s). Published by Elsevier Ltd.. All rights reserved.
Integrating In Silico Resources to Map a Signaling Network

PubMed Central

Liu, Hanqing; Beck, Tim N.; Golemis, Erica A.; Serebriiskii, Ilya G.

2013-01-01

The abundance of publicly available life science databases offers a wealth of information that can support interpretation of experimentally derived data and greatly enhance hypothesis generation. Protein interaction and functional networks are not simply new renditions of existing data: they provide the opportunity to gain insights into the specific physical and functional role a protein plays as part of the biological system. In this chapter, we describe different in silico tools that can quickly and conveniently retrieve data from existing data repositories and discuss how the available tools are best utilized for different purposes. While emphasizing protein-protein interaction databases (e.g., BioGrid and IntAct), we also introduce metasearch platforms such as STRING and GeneMANIA, pathway databases (e.g., BioCarta and Pathway Commons), text mining approaches (e.g., PubMed and Chilibot), and resources for drug-protein interactions, genetic information for model organisms and gene expression information based on microarray data mining. Furthermore, we provide a simple step-by-step protocol to building customized protein-protein interaction networks in Cytoscape, a powerful network assembly and visualization program, integrating data retrieved from these various databases. As we illustrate, generation of composite interaction networks enables investigators to extract significantly more information about a given biological system than utilization of a single database or sole reliance on primary literature. PMID:24233784
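The idea of a composite interaction network can be sketched in a few lines. The example below assumes that each database has already been exported as a list of undirected protein pairs; the source names and protein identifiers are hypothetical, and the record itself performs this integration in Cytoscape rather than in code.

```python
from collections import defaultdict


def merge_interactions(edge_lists):
    """Merge undirected edges and record which sources support each one."""
    support = defaultdict(set)
    for source, edges in edge_lists.items():
        for a, b in edges:
            if a == b:
                continue  # ignore self-interactions in this sketch
            # frozenset makes (P1, P2) and (P2, P1) the same edge
            support[frozenset((a, b))].add(source)
    return support


# Hypothetical exports from two interaction databases.
edge_lists = {
    "source_1": [("P1", "P2"), ("P2", "P3")],
    "source_2": [("P2", "P1"), ("P3", "P4")],
}
network = merge_interactions(edge_lists)

# Edges reported by more than one source.
shared = [tuple(sorted(pair)) for pair, srcs in network.items() if len(srcs) > 1]
```

Keeping the supporting sources per edge, rather than a bare edge set, lets the merged network be filtered later by how many independent databases report an interaction.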
Significance of aquaporins’ expression in the prognosis of gastric cancer

PubMed Central

Thapa, Saroj; Chetry, Mandika; Huang, Kaiyu; Peng, Yangpei; Wang, Jinsheng; Wang, Jiaoni; Zhou, Yingying; Shen, Yigen; Xue, Yangjing; Ji, Kangting

2018-01-01

Gastric carcinoma is one of the most lethal malignancies at present and a leading cause of cancer-related deaths worldwide. Aquaporins (AQPs) are a family of small, integral membrane proteins, which have been shown to play a crucial role in cell migration and proliferation of different cancer cells including gastric cancers. However, the aberrant expression of specific AQPs and its predictive and prognostic significance in gastric cancer remain elusive. In the present study, we comprehensively explored the immunohistochemistry-based map of protein expression profiles in normal tissues, cancer and cell lines from the publicly available Human Protein Atlas (HPA) database. Moreover, to improve our understanding of general gastric biology and help find novel predictive and prognostic gastric cancer biomarkers, we also queried ‘The Kaplan–Meier plotter’ (KM plotter) online database to relate the mRNA expression of specific AQPs to overall survival (OS) across different clinicopathological features. We found that the ubiquitous expression of AQP proteins may be an effective tool for generating gastric cancer biomarkers. Furthermore, high-level AQP3, AQP9, and AQP11 mRNA expression was correlated with better OS in all gastric cancer patients, whereas AQP0, AQP1, AQP4, AQP5, AQP6, AQP8, and AQP10 mRNA expression was associated with poor OS. With regard to the clinicopathological features, including Lauren classification, clinical stage, human epidermal growth factor receptor 2 (HER2) status, and different treatment strategies, we could illustrate a significant role of individual AQP mRNA expression in the prognosis of gastric cancer patients. Thus, our results indicate that AQP protein and mRNA expression in gastric cancer patients may help predict prognosis and inform therapeutic strategy. PMID:29678898
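The overall-survival comparisons in this record rest on Kaplan–Meier curves. A minimal sketch of the product-limit estimator is given below for orientation; the follow-up times and event flags are hypothetical, and the code is not the KM plotter's implementation.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with an observed death.

    times  -- follow-up time per patient
    events -- 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve


# Hypothetical follow-up data, for illustration only.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

In practice one such curve would be computed for the high-expression group and one for the low-expression group, and the two compared with a log-rank test, which this sketch does not cover.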
Regulatory interactions between long noncoding RNA LINC00968 and miR-9-3p in non-small cell lung cancer: A bioinformatic analysis based on miRNA microarray, GEO and TCGA.

PubMed

Li, Dong-Yao; Chen, Wen-Jie; Shang, Jun; Chen, Gang; Li, Shi-Kang

2018-06-01

Long non-coding RNAs (lncRNAs) have been demonstrated to mediate carcinogenesis in various types of cancer. However, the regulatory role of lncRNA LINC00968 in lung adenocarcinoma remains unclear. The microRNA (miRNA) expression in LINC00968-overexpressing human lung adenocarcinoma A549 cells was detected using miRNA microarray analysis. miR-9-3p was selected for further analysis, and its expression was verified in the Gene Expression Omnibus (GEO) database. In addition, the regulatory axis of LINC00968 was validated using The Cancer Genome Atlas (TCGA) database. Results of the GEO database indicated miR-9-3p expression in lung adenocarcinoma was significantly higher compared with normal tissues. Functional enrichment analyses of the target genes of miR-9-3p indicated protein binding and the AMP-activated protein kinase pathway were the most enriched Gene Ontology and KEGG terms, respectively. Combining target genes with the correlated genes of LINC00968 and miR-9-3p, 120 objective genes were obtained, which were used to construct a protein-protein interaction (PPI) network. Cyclin A2 (CCNA2) was identified to have a vital role in the PPI network. Significant correlations were detected between LINC00968, miR-9-3p and CCNA2 in lung adenocarcinoma. The LINC00968/miR-9-3p/CCNA2 regulatory axis provides a new foundation for further evaluating the regulatory mechanisms of LINC00968 in lung adenocarcinoma.
Analysis of the oncogene BRAF mutation and the correlation of the expression of wild-type BRAF and CREB1 in endometriosis

PubMed Central

Lv, Xiao; Ma, Yue; Long, Zaiqiu

2018-01-01

B-Raf proto-oncogene, serine/threonine kinase (BRAF) has previously been identified as a candidate target gene in endometriosis. Wild-type and mutated BRAF serve important roles in different diseases. The aim of the present study was to explore BRAF mutation, the mRNA and protein expression of wild-type BRAF (wtBRAF) in endometriosis, and the association between the expression levels of wtBRAF and the predicted transcription factor cAMP responsive element binding protein 1 (CREB1). In the present study, BRAF mutation was detected using Sanger sequencing among 30 ectopic and matched eutopic endometrium samples of patients with endometriosis as well as 25 normal endometrium samples, and no BRAF mutation was detected in exons 11 or 15. A region of ~2,000 bp upstream of the BRAF gene was then screened using NCBI and UCSC databases, and CREB1 was identified as a potential transcription factor of BRAF by analysis with the JASPAR and the TRANSFAC databases. Quantitative polymerase chain reaction was used to analyze the mRNA expression levels of wtBRAF and CREB1, and the corresponding protein expression levels were evaluated using immunohistochemistry and western blot analysis. The results revealed that the mRNA and protein expression levels of wtBRAF and CREB1 were significantly upregulated in the eutopic endometrial tissues of patients with endometriosis compared with normal endometrial tissues (P<0.05) and no significant difference in wtBRAF and CREB1 levels was detected between the ectopic and eutopic endometrium (P>0.05). In addition, correlation analysis revealed that the protein expression of CREB1 was positively correlated with the transcript level and protein expression of wtBRAF. It is reasonable to speculate that CREB1 may activate the transcription of wtBRAF through directly binding to its promoter, increasing BRAF expression and regulating the cell proliferation, migration and invasion of endometriosis. PMID:29286077
Expression of Anaplasma marginale ankyrin repeat-containing proteins during infection of the mammalian host and tick vector

USDA-ARS's Scientific Manuscript database

Using searches of the NCBI conserved domain database and SMART genomic architecture analysis, we identified three ankyrin repeat-containing genes in Anaplasma marginale: AM705, AM926 and AM638. Recombinant protein was used to immunize mice and generate fusion hybridomas secreting protein-specific mo...
A dehydration-inducible gene in the truffle Tuber borchii identifies a novel group of dehydrins

PubMed Central

Abba', Simona; Ghignone, Stefano; Bonfante, Paola

2006-01-01

Background: The expressed sequence tag M6G10 was originally isolated from a screening for differentially expressed transcripts during the reproductive stage of the white truffle Tuber borchii. mRNA levels for M6G10 increased dramatically during fruiting body maturation compared to the vegetative mycelial stage. Results: Bioinformatics tools, phylogenetic analysis and expression studies were used to support the hypothesis that this sequence, named TbDHN1, is the first dehydrin (DHN)-like coding gene isolated in fungi. Homologs of this gene, all defined as "coding for hypothetical proteins" in public databases, were exclusively found in ascomycetous fungi and in plants. Although complete (or almost complete) fungal genomes and EST collections of some Basidiomycota and Glomeromycota are already available, DHN-like proteins appear to be represented only in Ascomycota. A new and previously uncharacterized conserved signature pattern was identified and proposed to the UniProt database as the main distinguishing feature of this new group of DHNs. Expression studies provide experimental evidence of a transcript induction of TbDHN1 during cellular dehydration. Conclusion: Expression pattern and sequence similarities to known plant DHNs indicate that TbDHN1 is the first characterized DHN-like protein in fungi. The high similarity of TbDHN1 with homolog coding sequences implies the existence of a novel fungal/plant group of LEA Class II proteins characterized by a previously undescribed signature pattern. PMID:16512918
AIM: a comprehensive Arabidopsis interactome module database and related interologs in plants.

PubMed

Wang, Yi; Thilmony, Roger; Zhao, Yunjun; Chen, Guoping; Gu, Yong Q

2014-01-01

Systems biology analysis of protein modules is important for understanding the functional relationships between proteins in the interactome. Here, we present a comprehensive database named AIM for Arabidopsis (Arabidopsis thaliana) interactome modules. The database contains almost 250,000 modules that were generated using multiple analysis methods and integration of microarray expression data. All the modules in AIM are well annotated using multiple gene function knowledge databases. AIM provides a user-friendly interface for different types of searches and offers a powerful graphical viewer for displaying module networks linked to the enrichment annotation terms. Both interactive Venn diagram and power graph viewer are integrated into the database for easy comparison of modules. In addition, predicted interologs from other plant species (homologous proteins from different species that share a conserved interaction module) are available for each Arabidopsis module. AIM is a powerful systems biology platform for obtaining valuable insights into the function of proteins in Arabidopsis and other plants using the modules of the Arabidopsis interactome. Database URL: http://probes.pw.usda.gov/AIM Published by Oxford University Press 2014. This work is written by US Government employees and is in the public domain in the US.
Molecular cloning and expression of the CRISP family of proteins in the boar.

PubMed

Vadnais, Melissa L; Foster, Douglas N; Roberts, Kenneth P

2008-12-01

The family of mammalian cysteine-rich secretory proteins (CRISP) has been well characterized in the rat, mouse, and human. Here we report the molecular cloning and expression analysis of CRISP1, CRISP2, and CRISP3 in the boar. A partial sequence published in the National Center for Biotechnology Information (NCBI) database was used to derive the full-length sequences for CRISP1 and CRISP2 using rapid amplification of cDNA ends. RT-PCR confirmed the expression of these mRNAs in the boar reproductive tract, and real time RT-PCR showed CRISP1 to be highly expressed throughout the epididymis, with CRISP2 highly expressed in the testis. A search of the porcine genomic sequence in the NCBI database identified a BAC (CH242-199E6) encoding the CRISP1 gene. This BAC is derived from porcine Chromosome 7 and is syntenic with the regions of the mouse, rat, and human genomes encoding the CRISP gene family. This BAC was found to encode a third CRISP protein with a predicted amino acid sequence of high similarity to human CRISP3. Using RT-PCR we show that CRISP3 expression in the boar reproductive tract is confined to the prostate. Recombinant porcine (rp) CRISP2 protein was produced and purified. When incubated with capacitated boar sperm, rpCRISP2 induced an acrosome reaction, consistent with its demonstrated ability to alter the activity of calcium channels.
NCBI GEO: mining millions of expression profiles--database and tools.

PubMed

Barrett, Tanya; Suzek, Tugba O; Troup, Dennis B; Wilhite, Stephen E; Ngau, Wing-Chi; Ledoux, Pierre; Rudnev, Dmitry; Lash, Alex E; Fujibuchi, Wataru; Edgar, Ron

2005-01-01

The Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI) is the largest fully public repository for high-throughput molecular abundance data, primarily gene expression data. The database has a flexible and open design that allows the submission, storage and retrieval of many data types. These data include microarray-based experiments measuring the abundance of mRNA, genomic DNA and protein molecules, as well as non-array-based technologies such as serial analysis of gene expression (SAGE) and mass spectrometry proteomic technology. GEO currently holds over 30,000 submissions representing approximately half a billion individual molecular abundance measurements, for over 100 organisms. Here, we describe recent database developments that facilitate effective mining and visualization of these data. Features are provided to examine data from both experiment- and gene-centric perspectives using user-friendly Web-based interfaces accessible to those without computational or microarray-related analytical expertise. The GEO database is publicly accessible through the World Wide Web at http://www.ncbi.nlm.nih.gov/geo.
Genome-wide identification of lineage-specific genes in Arabidopsis, Oryza and Populus

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yang, Xiaohan; Jawdy, Sara; Tschaplinski, Timothy J

2009-01-01

Protein sequences were compared among Arabidopsis, Oryza and Populus to identify differential gene (DG) sets that are in one but not the other two genomes. The DG sets were screened against a plant transcript database, the NR protein database and six newly-sequenced genomes (Carica, Glycine, Medicago, Sorghum, Vitis and Zea) to identify a set of species-specific genes (SS). Gene expression, protein motif and intron number were examined. 192, 641 and 109 SS genes were identified in Arabidopsis, Oryza and Populus, respectively. Some SS genes were preferentially expressed in flowers, roots, xylem and cambium or up-regulated by stress. Six conserved motifs in Arabidopsis and Oryza SS proteins were found in other distant lineages. The SS gene sets were enriched with intronless genes. The results reflect functional and/or anatomical differences between monocots and eudicots or between herbaceous and woody plants. The Populus-specific genes are candidates for carbon sequestration and biofuel research.
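The first step in this record, finding genes present in one genome but not in the other two, is at heart a set difference. The sketch below assumes that proteins have already been grouped into families by sequence similarity, which is the hard part and is not shown; the family labels are hypothetical.

```python
def species_specific(families):
    """For each species, return the families found in no other species."""
    result = {}
    for species, own in families.items():
        # Pool the families of every other species, then subtract them.
        others = set().union(*(f for s, f in families.items() if s != species))
        result[species] = own - others
    return result


# Hypothetical family assignments, for illustration only.
families = {
    "Arabidopsis": {"fam1", "fam2", "fam5"},
    "Oryza": {"fam1", "fam3"},
    "Populus": {"fam1", "fam2", "fam4"},
}
specific = species_specific(families)
```

The record then screens these candidate sets against further transcript and genome databases, which would amount to subtracting additional sets in the same way.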
Proteome reference map and regulation network of neonatal rat cardiomyocyte

PubMed Central

Li, Zi-jian; Liu, Ning; Han, Qi-de; Zhang, You-yi

2011-01-01

Aim: To study and establish a proteome reference map and regulation network of neonatal rat cardiomyocyte. Methods: Cultured cardiomyocytes of neonatal rats were used. All proteins expressed in the cardiomyocytes were separated and identified by two-dimensional polyacrylamide gel electrophoresis (2-DE) and matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF MS). Biological networks and pathways of the neonatal rat cardiomyocytes were analyzed using the Ingenuity Pathway Analysis (IPA) program (www.ingenuity.com). A 2-DE database was made accessible on-line by Make2ddb package on a web server. Results: More than 1000 proteins were separated on 2D gels, and 148 proteins were identified. The identified proteins were used for the construction of an extensible markup language-based database. Biological networks and pathways were constructed to analyze the functions associated with cardiomyocyte proteins in the database. The 2-DE database of rat cardiomyocyte proteins can be accessed at http://2d.bjmu.edu.cn. Conclusion: A proteome reference map and regulation network of the neonatal rat cardiomyocytes have been established, which may serve as an international platform for storage, analysis and visualization of cardiomyocyte proteomic data. PMID:21841810
Databases for Microbiologists

DOE PAGES

Zhulin, Igor B.

2015-05-26

Databases play an increasingly important role in biology. They archive, store, maintain, and share information on genes, genomes, expression data, protein sequences and structures, metabolites and reactions, interactions, and pathways. All these data are critically important to microbiologists. Furthermore, microbiology has its own databases that deal with model microorganisms, microbial diversity, physiology, and pathogenesis. Thousands of biological databases are currently available, and it becomes increasingly difficult to keep up with their development. The purpose of this minireview is to provide a brief survey of current databases that are of interest to microbiologists.
Databases for Microbiologists

PubMed Central

2015-01-01

Databases play an increasingly important role in biology. They archive, store, maintain, and share information on genes, genomes, expression data, protein sequences and structures, metabolites and reactions, interactions, and pathways. All these data are critically important to microbiologists. Furthermore, microbiology has its own databases that deal with model microorganisms, microbial diversity, physiology, and pathogenesis. Thousands of biological databases are currently available, and it becomes increasingly difficult to keep up with their development. The purpose of this minireview is to provide a brief survey of current databases that are of interest to microbiologists. PMID:26013493
GermOnline 4.0 is a genomics gateway for germline development, meiosis and the mitotic cell cycle.

PubMed

Lardenois, Aurélie; Gattiker, Alexandre; Collin, Olivier; Chalmel, Frédéric; Primig, Michael

2010-01-01

GermOnline 4.0 is a cross-species database portal focusing on high-throughput expression data relevant for germline development, the meiotic cell cycle and mitosis in healthy versus malignant cells. It is thus a source of information for life scientists as well as clinicians who are interested in gene expression and regulatory networks. The GermOnline gateway provides unlimited access to information produced with high-density oligonucleotide microarrays (3'-UTR GeneChips), genome-wide protein-DNA binding assays and protein-protein interaction studies in the context of Ensembl genome annotation. Samples used to produce high-throughput expression data and to carry out genome-wide in vivo DNA binding assays are annotated via the MIAME-compliant Multiomics Information Management and Annotation System (MIMAS 3.0). Furthermore, the Saccharomyces Genomics Viewer (SGV) was developed and integrated into the gateway. SGV is a visualization tool that outputs genome annotation and DNA-strand specific expression data produced with high-density oligonucleotide tiling microarrays (Sc_tlg GeneChips) which cover the complete budding yeast genome on both DNA strands. It facilitates the interpretation of expression levels and transcript structures determined for various cell types cultured under different growth and differentiation conditions. Database URL: www.germonline.org/
Database resources of the National Center for Biotechnology Information.

PubMed

2016-01-04

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank(®) nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (PubMed Central (PMC), Bookshelf and PubReader), health (ClinVar, dbGaP, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen), genomes (BioProject, Assembly, Genome, BioSample, dbSNP, dbVar, Epigenomics, the Map Viewer, Nucleotide, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser and the Trace Archive), genes (Gene, Gene Expression Omnibus (GEO), HomoloGene, PopSet and UniGene), proteins (Protein, the Conserved Domain Database (CDD), COBALT, Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB) and Protein Clusters) and chemicals (Biosystems and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for most of these databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. Published by Oxford University Press on behalf of Nucleic Acids Research 2015. This work is written by (a) US Government employee(s) and is in the public domain in the US.
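The Entrez search and retrieval operations described above can also be reached programmatically. The abstract does not name a programmatic route, so the following is only a minimal sketch that assumes NCBI's public E-utilities web service and its ESearch endpoint; the helper name `build_esearch_url` and the query term are hypothetical. The sketch only builds the request URL and does not contact the server.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service (assumed entry point, not named in the abstract).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db, term, retmax=20):
    """Return an ESearch URL for one Entrez database and one query term."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Hypothetical query: search PubMed for records about protein expression databases.
url = build_esearch_url("pubmed", "protein expression database")
print(url)
```

Changing the `db` argument (for example to `protein` or `gene`) would target a different Entrez database with the same query mechanism.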
Database resources of the National Center for Biotechnology Information.

PubMed

2015-01-01

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank(®) nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (Bookshelf, PubMed Central (PMC) and PubReader); medical genetics (ClinVar, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen); genes and genomics (BioProject, BioSample, dbSNP, dbVar, Epigenomics, Gene, Gene Expression Omnibus (GEO), Genome, HomoloGene, the Map Viewer, Nucleotide, PopSet, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser, Trace Archive and UniGene); and proteins and chemicals (Biosystems, COBALT, the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB), Protein Clusters, Protein and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for many of these databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov. Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by (a) US Government employee(s) and is in the public domain in the US.
PAGE-1, an X chromosome-linked GAGE-like gene that is expressed in normal and neoplastic prostate, testis, and uterus

PubMed Central

Brinkmann, Ulrich; Vasmatzis, George; Lee, Byungkook; Yerushalmi, Noga; Essand, Magnus; Pastan, Ira

1998-01-01

We have used a combination of computerized database mining and experimental expression analyses to identify a gene that is preferentially expressed in normal male and female reproductive tissues, prostate, testis, fallopian tube, uterus, and placenta, as well as in prostate cancer, testicular cancer, and uterine cancer. This gene is located on the human X chromosome, and it is homologous to a family of genes encoding GAGE-like proteins. GAGE proteins are expressed in a variety of tumors and in testis. We designate the novel gene PAGE-1 because the expression pattern in the Cancer Genome Anatomy Project libraries indicates that it is predominantly expressed in normal and neoplastic prostate. Further database analysis indicates the presence of other genes with high homology to PAGE-1, which were found in cDNA libraries derived from testis, pooled libraries (with testis), and in a germ cell tumor library. The expression of PAGE-1 in normal and malignant prostate, testicular, and uterine tissues makes it a possible target for the diagnosis and possibly for the vaccine-based therapy of neoplasms of prostate, testis, and uterus. PMID:9724777

PAGE-1, an X chromosome-linked GAGE-like gene that is expressed in normal and neoplastic prostate, testis, and uterus.

PubMed

Brinkmann, U; Vasmatzis, G; Lee, B; Yerushalmi, N; Essand, M; Pastan, I

1998-09-01

We have used a combination of computerized database mining and experimental expression analyses to identify a gene that is preferentially expressed in normal male and female reproductive tissues, prostate, testis, fallopian tube, uterus, and placenta, as well as in prostate cancer, testicular cancer, and uterine cancer. This gene is located on the human X chromosome, and it is homologous to a family of genes encoding GAGE-like proteins. GAGE proteins are expressed in a variety of tumors and in testis. We designate the novel gene PAGE-1 because the expression pattern in the Cancer Genome Anatomy Project libraries indicates that it is predominantly expressed in normal and neoplastic prostate. Further database analysis indicates the presence of other genes with high homology to PAGE-1, which were found in cDNA libraries derived from testis, pooled libraries (with testis), and in a germ cell tumor library. The expression of PAGE-1 in normal and malignant prostate, testicular, and uterine tissues makes it a possible target for the diagnosis and possibly for the vaccine-based therapy of neoplasms of prostate, testis, and uterus.
Single step generation of protein arrays from DNA by cell-free expression and in situ immobilisation (PISA method).

PubMed

He, M; Taussig, M J

2001-08-01

We describe a format for production of protein arrays termed 'protein in situ array' (PISA). A PISA is rapidly generated in one step directly from PCR-generated DNA fragments by cell-free protein expression and in situ immobilisation at a surface. The template for expression is DNA encoding individual proteins or domains, which is produced by PCR using primers designed from information in DNA databases. Coupled transcription and translation is carried out on a surface to which the tagged protein adheres as soon as it is synthesised. Because proteins generated by cell-free synthesis are usually soluble and functional, this method can overcome problems of insolubility or degradation associated with bacterial expression of recombinant proteins. Moreover, the use of PCR-generated DNA enables rapid production of proteins or domains based on genome information alone and will be particularly useful where cloned material is not available. Here we show that human single-chain antibody fragments (three domain, V(H)/K form) and an enzyme (luciferase) can be functionally arrayed by the PISA method.
Fragger: a protein fragment picker for structural queries.

PubMed

Berenger, Francois; Simoncini, David; Voet, Arnout; Shrestha, Rojan; Zhang, Kam Y J

2017-01-01

Protein modeling and design activities often require querying the Protein Data Bank (PDB) with a structural fragment, possibly containing gaps. For some applications, it is preferable to work on a specific subset of the PDB or with unpublished structures. These requirements, along with specific user needs, motivated the creation of new software to manage and query 3D protein fragments. Fragger is a protein fragment picker that allows protein fragment databases to be created and queried. All fragment lengths are supported and any set of PDB files can be used to create a database. Fragger can efficiently search a fragment database with a query fragment and a distance threshold. Matching fragments are ranked by distance to the query. The query fragment can have structural gaps and the allowed amino acid sequences matching a query can be constrained via a regular expression of one-letter amino acid codes. Fragger also incorporates a tool to compute the backbone RMSD of one versus many fragments in high throughput. Fragger should be useful for protein design, loop grafting and related structural bioinformatics tasks.
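The query described above (a distance threshold, ranking by distance, and a regular expression over one-letter codes) can be illustrated with a small sketch. This is not Fragger's interface or algorithm: the function names, the toy fragments, and the use of a plain coordinate RMSD without superposition as the distance are all assumptions made for illustration.

```python
import math
import re


def rmsd(a, b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z)
    points. No superposition is performed (a simplifying assumption)."""
    if len(a) != len(b):
        raise ValueError("fragments must have the same number of atoms")
    squared = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(squared / len(a))


def query_fragments(fragments, query_coords, threshold, seq_pattern=None):
    """Return (distance, name) pairs within the threshold, closest first.

    fragments: iterable of (name, sequence, coords) tuples (hypothetical layout).
    seq_pattern: optional regular expression the whole sequence must match.
    """
    pattern = re.compile(seq_pattern) if seq_pattern else None
    hits = []
    for name, seq, coords in fragments:
        if pattern and not pattern.fullmatch(seq):
            continue  # sequence constraint not satisfied
        distance = rmsd(query_coords, coords)
        if distance <= threshold:
            hits.append((distance, name))
    return sorted(hits)


# Toy three-residue fragments (made-up coordinates, one atom per residue).
query = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
database = [
    ("A", "GAV", [(0.0, 0.1, 0.0), (3.8, 0.1, 0.0), (7.6, 0.1, 0.0)]),
    ("B", "GPV", [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]),
    ("C", "GLV", [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]),
    ("D", "GSV", [(0.0, 0.0, 0.3), (3.8, 0.0, 0.3), (7.6, 0.0, 0.3)]),
]

# Keep fragments within 0.5 of the query whose middle residue is not proline.
hits = query_fragments(database, query, threshold=0.5, seq_pattern="G[^P]V")
print([name for _, name in hits])
```

In this toy run, fragment B is rejected by the sequence pattern even though its coordinates match exactly, and fragment C is rejected by the distance threshold.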
YPED: An Integrated Bioinformatics Suite and Database for Mass Spectrometry-based Proteomics Research

PubMed Central

Colangelo, Christopher M.; Shifman, Mark; Cheung, Kei-Hoi; Stone, Kathryn L.; Carriero, Nicholas J.; Gulcicek, Erol E.; Lam, TuKiet T.; Wu, Terence; Bjornson, Robert D.; Bruce, Can; Nairn, Angus C.; Rinehart, Jesse; Miller, Perry L.; Williams, Kenneth R.

2015-01-01

We report a significantly enhanced bioinformatics suite and database for proteomics research called Yale Protein Expression Database (YPED) that is used by investigators at more than 300 institutions worldwide. YPED meets the data management, archival, and analysis needs of high-throughput mass spectrometry-based proteomics research, ranging from a single laboratory or a group of laboratories within and beyond an institution to the entire proteomics community. The current version is a significant improvement over the first version in that it contains new modules for liquid chromatography–tandem mass spectrometry (LC–MS/MS) database search results, label and label-free quantitative proteomic analysis, and several scoring outputs for phosphopeptide site localization. In addition, we have added both peptide and protein comparative analysis tools to enable pairwise analysis of distinct peptides/proteins in each sample and of overlapping peptides/proteins between all samples in multiple datasets. We have also implemented a targeted proteomics module for automated multiple reaction monitoring (MRM)/selective reaction monitoring (SRM) assay development. We have linked YPED’s database search results and both label-based and label-free fold-change analysis to the Skyline Panorama repository for online spectra visualization. In addition, we have built enhanced functionality to curate peptide identifications into an MS/MS peptide spectral library for all of our protein database search identification results. PMID:25712262
YPED: an integrated bioinformatics suite and database for mass spectrometry-based proteomics research.

PubMed

Colangelo, Christopher M; Shifman, Mark; Cheung, Kei-Hoi; Stone, Kathryn L; Carriero, Nicholas J; Gulcicek, Erol E; Lam, TuKiet T; Wu, Terence; Bjornson, Robert D; Bruce, Can; Nairn, Angus C; Rinehart, Jesse; Miller, Perry L; Williams, Kenneth R

2015-02-01

We report a significantly enhanced bioinformatics suite and database for proteomics research called Yale Protein Expression Database (YPED) that is used by investigators at more than 300 institutions worldwide. YPED meets the data management, archival, and analysis needs of high-throughput mass spectrometry-based proteomics research, ranging from a single laboratory or a group of laboratories within and beyond an institution to the entire proteomics community. The current version is a significant improvement over the first version in that it contains new modules for liquid chromatography-tandem mass spectrometry (LC-MS/MS) database search results, label and label-free quantitative proteomic analysis, and several scoring outputs for phosphopeptide site localization. In addition, we have added both peptide and protein comparative analysis tools to enable pairwise analysis of distinct peptides/proteins in each sample and of overlapping peptides/proteins between all samples in multiple datasets. We have also implemented a targeted proteomics module for automated multiple reaction monitoring (MRM)/selective reaction monitoring (SRM) assay development. We have linked YPED's database search results and both label-based and label-free fold-change analysis to the Skyline Panorama repository for online spectra visualization. In addition, we have built enhanced functionality to curate peptide identifications into an MS/MS peptide spectral library for all of our protein database search identification results. Copyright © 2015 The Authors. Production and hosting by Elsevier Ltd. All rights reserved.
Exploring the human seminal plasma proteome: an unexplored gold mine of biomarker for male infertility and male reproduction disorder.

PubMed

Gilany, Kambiz; Minai-Tehrani, Arash; Savadi-Shiraz, Elham; Rezadoost, Hassan; Lakpour, Niknam

2015-01-01

The human seminal fluid is a complex body fluid. It is not known how many proteins are expressed in the seminal plasma; however, by analogy with blood, it is possible that up to 10,000 proteins are expressed in the seminal plasma. The human seminal fluid is a rich source of potential biomarkers for male infertility and reproductive disorders. In this review, the growing list of proteins identified in the human seminal fluid was collected. To date, 4188 redundant proteins of the seminal fluid have been identified using different proteomics technologies, including 2-DE, SDS-PAGE-LC-MS/MS, and MudPIT. However, this was reduced to a database of 2168 non-redundant proteins using the UniProtKB/Swiss-Prot reviewed database. The core concepts of the proteome were analyzed, including the pI, MW, amino acid, chromosome, and PTM distributions in the human seminal plasma proteome. Additionally, the biological processes, molecular functions, and KEGG pathways were investigated using DAVID software. Finally, the biomarkers identified so far in different male reproductive system disorders using proteomics platforms were investigated. In this study, an attempt was made to update the human seminal plasma proteome database. Our findings showed that human seminal plasma studies to date seem to have converged on a set of proteins that are repeatedly identified in many studies and that represent only a small fraction of the entire human seminal plasma proteome.
FunSimMat: a comprehensive functional similarity database

PubMed Central

Schlicker, Andreas; Albrecht, Mario

2008-01-01

Functional similarity based on Gene Ontology (GO) annotation is used in diverse applications like gene clustering, gene expression data analysis, protein interaction prediction and evaluation. However, there exists no comprehensive resource of functional similarity values although such a database would facilitate the use of functional similarity measures in different applications. Here, we describe FunSimMat (Functional Similarity Matrix, http://funsimmat.bioinf.mpi-inf.mpg.de/), a large new database that provides several different semantic similarity measures for GO terms. It offers various precomputed functional similarity values for proteins contained in UniProtKB and for protein families in Pfam and SMART. The web interface allows users to efficiently perform both semantic similarity searches with GO terms and functional similarity searches with proteins or protein families. All results can be downloaded in tab-delimited files for use with other tools. An additional XML–RPC interface gives automatic online access to FunSimMat for programs and remote services. PMID:17932054
Distribution of cellular HSV-1 receptor expression in human brain.

PubMed

Lathe, Richard; Haas, Juergen G

2017-06-01

Herpes simplex virus type 1 (HSV-1) is a neurotropic virus linked to a range of acute and chronic neurological disorders affecting distinct regions of the brain. Unusually, HSV-1 entry into cells requires the interaction of viral proteins glycoprotein D (gD) and glycoprotein B (gB) with distinct cellular receptor proteins. Several different gD and gB receptors have been identified, including TNFRSF14/HVEM and PVRL1/nectin 1 as gD receptors and PILRA, MAG, and MYH9 as gB receptors. We investigated the expression of these receptor molecules in different areas of the adult and developing human brain using online transcriptome databases. Whereas all HSV-1 receptors showed distinct expression patterns in different brain areas, the Allen Brain Atlas (ABA) reported increased expression of both gD and gB receptors in the hippocampus. Specifically, for PVRL1, TNFRSF14, and MYH9, the differential z scores for hippocampal expression, a measure of relative levels of increased expression, rose to 2.9, 2.9, and 2.5, respectively, comparable to the z score for the archetypical hippocampus-enriched mineralocorticoid receptor (NR3C2, z = 3.1). These data were confirmed at the Human Brain Transcriptome (HBT) database, but HBT data indicate that MAG expression is also enriched in hippocampus. The HBT database allowed the developmental pattern of expression to be investigated; we report that all HSV-1 receptors markedly increase in expression levels between gestation and the postnatal/adult periods. These results suggest that differential receptor expression levels of several HSV-1 gD and gB receptors in the adult hippocampus are likely to underlie the susceptibility of this brain region to HSV-1 infection.
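The abstract does not say how the differential z scores are derived, so the following is only a generic sketch of a z score: each region's expression value minus the mean across regions, divided by the standard deviation across regions. The function name, the region names, and the expression values are hypothetical and are not taken from the Allen Brain Atlas.

```python
from statistics import mean, pstdev


def z_scores(expression_by_region):
    """Map each region to (value - mean) / population standard deviation."""
    values = list(expression_by_region.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("expression is identical in every region")
    return {region: (value - mu) / sigma
            for region, value in expression_by_region.items()}


# Made-up expression levels for one gene in four brain regions.
toy_expression = {
    "hippocampus": 9.0,
    "cortex": 5.0,
    "cerebellum": 4.0,
    "thalamus": 6.0,
}

z = z_scores(toy_expression)
most_enriched = max(z, key=z.get)
print(most_enriched, round(z[most_enriched], 2))
```

A region with a large positive z score is one where the gene is expressed well above its average level across the regions compared.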
A proteomic chronology of gene expression through the cell cycle in human myeloid leukemia cells.

PubMed

Ly, Tony; Ahmad, Yasmeen; Shlien, Adam; Soroka, Dominique; Mills, Allie; Emanuele, Michael J; Stratton, Michael R; Lamond, Angus I

2014-01-01

Technological advances have enabled the analysis of cellular protein and RNA levels with unprecedented depth and sensitivity, allowing for an unbiased re-evaluation of gene regulation during fundamental biological processes. Here, we have chronicled the dynamics of protein and mRNA expression levels across a minimally perturbed cell cycle in human myeloid leukemia cells using centrifugal elutriation combined with mass spectrometry-based proteomics and RNA-Seq, avoiding artificial synchronization procedures. We identify myeloid-specific gene expression and variations in protein abundance, isoform expression and phosphorylation at different cell cycle stages. We dissect the relationship between protein and mRNA levels for both bulk gene expression and for over ∼6000 genes individually across the cell cycle, revealing complex, gene-specific patterns. This data set, one of the deepest surveys to date of gene expression in human cells, is presented in an online, searchable database, the Encyclopedia of Proteome Dynamics (http://www.peptracker.com/epd/). DOI: http://dx.doi.org/10.7554/eLife.01630.001.
A proteomic chronology of gene expression through the cell cycle in human myeloid leukemia cells

PubMed Central

Ly, Tony; Ahmad, Yasmeen; Shlien, Adam; Soroka, Dominique; Mills, Allie; Emanuele, Michael J; Stratton, Michael R; Lamond, Angus I

2014-01-01

Technological advances have enabled the analysis of cellular protein and RNA levels with unprecedented depth and sensitivity, allowing for an unbiased re-evaluation of gene regulation during fundamental biological processes. Here, we have chronicled the dynamics of protein and mRNA expression levels across a minimally perturbed cell cycle in human myeloid leukemia cells using centrifugal elutriation combined with mass spectrometry-based proteomics and RNA-Seq, avoiding artificial synchronization procedures. We identify myeloid-specific gene expression and variations in protein abundance, isoform expression and phosphorylation at different cell cycle stages. We dissect the relationship between protein and mRNA levels for both bulk gene expression and for over ∼6000 genes individually across the cell cycle, revealing complex, gene-specific patterns. This data set, one of the deepest surveys to date of gene expression in human cells, is presented in an online, searchable database, the Encyclopedia of Proteome Dynamics (http://www.peptracker.com/epd/). DOI: http://dx.doi.org/10.7554/eLife.01630.001 PMID:24596151
Dynamic changes in proteins during apple (Malus x domestica) fruit ripening and storage

PubMed Central

Shi, Yun; Jiang, Li; Zhang, Li; Kang, Ruoyi; Yu, Zhifang

2014-01-01

A proteomic study, using two-dimensional polyacrylamide gel electrophoresis and matrix-assisted laser desorption/ionization time-of-flight/time-of-flight, was conducted in apple fruit (cv. ‘Golden Delicious’) starting at 10 days prior to harvest through 50 days in storage. Total protein was extracted using a phenol/sodium dodecyl sulfate protocol. More than 400 protein spots were detected in each gel and 55 differentially expressed proteins (p<0.05) were subjected to matrix-assisted laser desorption/ionization time-of-flight/time-of-flight analysis. Fifty-three of these proteins were finally identified using an apple expressed sequence tag database downloaded from Genome Database for Rosaceae and placed into six categories. The categories and the percentage of proteins placed in each category were stress response and defense (49.0%), energy and metabolism (34.0%), fruit ripening and senescence (5.6%), signal transduction (3.8%), cell structure (3.8%) and protein synthesis (3.8%). Proteins involved in multiple metabolic pathways, including glycolysis, pentose–phosphate pathway, anti-oxidative systems, photosynthesis and cell wall synthesis, were downregulated, especially during the climacteric burst in respiration and during the senescent stages of fruit development. Proteins classified as allergens or involved in cell wall degradation were upregulated during the ripening process. Some protein spots exhibited a mixed pattern (increasing to maximal abundance followed by a decrease), such as 1-aminocyclopropane-1-carboxylate oxidase, L-ascorbate peroxidase and abscisic acid response proteins. The identification of differentially expressed proteins associated with physiological processes identified in the current study provides a baseline of information for understanding the metabolic processes and regulatory mechanisms that occur in climacteric apple fruit during ripening and senescence. PMID:26504530
Revealing the potential pathogenesis of glioma by utilizing a glioma associated protein-protein interaction network.

PubMed

Pan, Weiran; Li, Gang; Yang, Xiaoxiao; Miao, Jinming

2015-04-01

This study aims to explore the potential mechanism of glioma through bioinformatic approaches. The gene expression profile (GSE4290) of glioma tumor and non-tumor samples was downloaded from the Gene Expression Omnibus database. A total of 180 samples were available, including 23 non-tumor and 157 tumor samples. Then the raw data were preprocessed using robust multiarray analysis, and 8,890 differentially expressed genes (DEGs) were identified by using t-test (false discovery rate < 0.0005). Furthermore, 16 known glioma related genes were extracted from the Genetic Association Database. After mapping 8,890 DEGs and 16 known glioma related genes to the Human Protein Reference Database, a glioma associated protein-protein interaction network (GAPN) was constructed. In addition, 51 sub-networks in GAPN were screened out through Molecular Complex Detection (score ≥ 1), and sub-network 1 was found to have the closest interaction (score = 3). What's more, for the top 10 sub-networks, Gene Ontology (GO) enrichment analysis (p value < 0.05) was performed, and DEGs involved in sub-networks 1 and 2, such as BRMS1L and CCNA1, were predicted to regulate cell growth, cell cycle, and DNA replication via interacting with known glioma related genes. Finally, the overlaps of DEGs and human essential, housekeeping, tissue-specific genes were calculated (p value = 1.0, 1.0, and 0.00014, respectively) and visualized by the Venn Diagram package in R. About 61% of human tissue-specific genes were DEGs as well. This research sheds new light on the pathogenesis of glioma based on DEGs and GAPN, and our findings might provide potential targets for clinical glioma treatment.
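Selecting genes by a false discovery rate cutoff, as in the t-test step above, requires adjusting the raw p values for multiple testing. The abstract does not state which procedure was used; the sketch below assumes the widely used Benjamini-Hochberg adjustment, and the function name, p values, and cutoff are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    # Indices of the p values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p values for four genes.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)

# Genes passing a hypothetical FDR cutoff of 0.03.
selected = [i for i, p in enumerate(adj_p) if p < 0.03]
print(adj_p, selected)
```

With thousands of genes tested at once, a cutoff on the adjusted values controls the expected share of false positives among the selected genes, which a cutoff on raw p values does not.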
Cloning, expression and purification of d-tagatose 3-epimerase gene from Escherichia coli JM109.

PubMed

He, Xiaoliang; Zhou, Xiaohui; Yang, Zi; Xu, Le; Yu, Yuxiu; Jia, Lingling; Li, Guoqing

2015-10-01

An unknown d-tagatose 3-epimerase (DTE) containing an IoIE domain was identified and cloned from Escherichia coli. This gene was subcloned into the prokaryotic expression vector pET-15b, and its expression was induced by IPTG in the E. coli BL21 expression system. Through His-select gel column purification and fast-protein liquid chromatography, highly purified and stable DTE protein was produced. The molecular weight of the DTE protein was estimated to be 29.8 kDa. The latest 83 DTE sequences from public databases were selected and analyzed by molecular clustering and multi-sequence alignment. DTEs were roughly divided into five categories. Copyright © 2015 Elsevier Inc. All rights reserved.
Proteomics screening of adenosine triphosphate-interacting proteins in the liver of diazinon-treated rats.

PubMed

Pourtaji, A; Robati, R Yazdian; Lari, P; Hosseinzadeh, H; Ramezani, M; Abnous, K

2016-10-01

Diazinon (DZN) is one of the most important organophosphorus compounds used to control pests in agriculture in many countries. Several studies have shown that exposure to DZN may alter protein expression in the liver. In order to further investigate the mechanism of DZN toxicity, differentially expressed ATP-interacting proteins, following subacute exposure to the toxin, were separated and identified in rat liver. Male rats were equally divided into four groups: control (corn oil) and DZN (15 mg/kg) by gavage once a day for 4 weeks. After homogenization of liver tissue, lysates were incubated with ATP-sepharose beads. After several washes, ATP-interacting proteins were eluted and separated on 2-D polyacrylamide gels. Differentially expressed proteins were cut and identified using matrix-assisted laser desorption/ionization/time-of-flight and the Mascot database. Identified proteins were classified according to their biological process using the protein analysis through evolutionary relationships (PANTHER) Web site. In this work, we showed that several key proteins involved in biological processes such as antioxidant system, oxidative stress, apoptosis, and metabolism were differentially expressed after subacute exposure to DZN. © The Author(s) 2015.
Effect of the difference in vehicles on gene expression in the rat liver--analysis of the control data in the Toxicogenomics Project Database.

PubMed

Takashima, Kayoko; Mizukawa, Yumiko; Morishita, Katsumi; Okuyama, Manabu; Kasahara, Toshihiko; Toritsuka, Naoki; Miyagishima, Toshikazu; Nagao, Taku; Urushidani, Tetsuro

2006-05-08

The Toxicogenomics Project is a 5-year collaborative project by the Japanese government and pharmaceutical companies, launched in 2002. Its aim is to construct a large-scale toxicology database of 150 compounds orally administered to rats. The test consists of a single administration test (3, 6, 9 and 24 h) and a repeated administration test (3, 7, 14 and 28 days), and the conventional toxicology data together with the gene expression data in liver as analyzed by using Affymetrix GeneChip are being accumulated. In the project, either methylcellulose or corn oil is employed as vehicle. We examined whether the vehicle itself affects the analysis of gene expression and found that corn oil alone affected the food consumption and biochemical parameters mainly related to lipid metabolism, and this accompanied typical changes in the gene expression. Most of the genes modulated by corn oil were related to cholesterol or fatty acid metabolism (e.g., CYP7A1, CYP8B1, 3-hydroxy-3-methylglutaryl-Coenzyme A reductase, squalene epoxidase, angiopoietin-like protein 4, fatty acid synthase, fatty acid binding proteins), suggesting that the response was physiologic to the oil intake. Many of the lipid-related genes showed circadian rhythm within a day, but the expression pattern of general clock genes (e.g., period 2, aryl hydrocarbon nuclear receptor translocator-like, D site albumin promoter binding protein) was unaffected by corn oil, suggesting that the effects are specific for lipid metabolism. These results should be useful when using the database, especially when drugs with different vehicle controls are compared.
Estrogen alters the profile of the transcriptome in river snail Bellamya aeruginosa.

PubMed

Lei, Kun; Liu, Ruizhi; An, Li-Hui; Luo, Ying-Feng; LeBlanc, Gerald A

2015-03-01

We evaluated the transcriptome dynamics of the freshwater river snail Bellamya aeruginosa exposed to 17β-estradiol (E2) using the Roche/454 GS-FLX platform. In total, 41,869 unigenes, with an average length of 586 bp, representing 36,181 contigs and 5,688 singlets were obtained. Among them, 18.08%, 36.85%, and 25.47% matched sequences in the GenBank non-redundant nucleic acid database, non-redundant protein database, and Swiss protein database, respectively. Annotation of the unigenes with gene ontology, and then mapping them to biological pathways, revealed large groups of genes related to growth, development, reproduction, signal transduction, and defense mechanisms. Significant differences were found in gene expression in both liver and testicular tissues between control and E2-exposed organisms. These changes in gene expression will help in understanding the molecular mechanisms of the response to physiological stress in the river snail exposed to estrogen, and will facilitate research into biological processes and underlying physiological adaptations to xenoestrogen exposure in gastropods.
The 24th annual Nucleic Acids Research database issue: a look back and upcoming changes

PubMed Central

Rigden, Daniel J

2017-01-01

This year's Database Issue of Nucleic Acids Research contains 152 papers that include descriptions of 54 new databases and update papers on 98 databases, of which 16 have not been previously featured in NAR. As always, these databases cover a broad range of molecular biology subjects, including genome structure, gene expression and its regulation, proteins, protein domains, and protein–protein interactions. Following the recent trend, an increasing number of new and established databases deal with the issues of human health, from cancer-causing mutations to drugs and drug targets. In accordance with this trend, three recently compiled databases that have been selected by NAR reviewers and editors as ‘breakthrough’ contributions, denovo-db, the Monarch Initiative, and Open Targets, cover human de novo gene variants, disease-related phenotypes in model organisms, and a bioinformatics platform for therapeutic target identification and validation, respectively. We expect these databases to attract the attention of numerous researchers working in various areas of genetics and genomics. Looking back at the past 12 years, we present here the ‘golden set’ of databases that have consistently served as authoritative, comprehensive, and convenient data resources widely used by the entire community and offer some lessons on what makes a successful database. The Database Issue is freely available online at the https://academic.oup.com/nar web site. An updated version of the NAR Molecular Biology Database Collection is available at http://www.oxfordjournals.org/nar/database/a/. PMID:28053160
Computational genomic analysis of PARK7 interactome reveals high BBS1 gene expression as a prognostic factor favoring survival in malignant pleural mesothelioma.

PubMed

Vavougios, Georgios D; Solenov, Evgeniy I; Hatzoglou, Chrissi; Baturina, Galina S; Katkova, Liubov E; Molyvdas, Paschalis Adam; Gourgoulianis, Konstantinos I; Zarogiannis, Sotirios G

2015-10-01

The aim of our study was to assess the differential gene expression of Parkinson protein 7 (PARK7) interactome in malignant pleural mesothelioma (MPM) using data mining techniques to identify novel candidate genes that may play a role in the pathogenicity of MPM. We constructed the PARK7 interactome using the ConsensusPathDB database. We then interrogated the Oncomine Cancer Microarray database using the Gordon Mesothelioma Study, for differential gene expression of the PARK7 interactome. In ConsensusPathDB, 38 protein interactors of PARK7 were identified. In the Gordon Mesothelioma Study, 34 of them were assessed out of which SUMO1, UBC3, KIAA0101, HDAC2, DAXX, RBBP4, BBS1, NONO, RBBP7, HTRA2, and STUB1 were significantly overexpressed whereas TRAF6 and MTA2 were significantly underexpressed in MPM patients (network 2). Furthermore, Kaplan-Meier analysis revealed that MPM patients with high BBS1 expression had a median overall survival of 16.5 vs. 8.7 mo of those that had low expression. For validation purposes, we performed a meta-analysis in Oncomine database in five sarcoma datasets. Eight network 2 genes (KIAA0101, HDAC2, SUMO1, RBBP4, NONO, RBBP7, HTRA2, and MTA2) were significantly differentially expressed in an array of 18 different sarcoma types. Finally, Gene Ontology annotation enrichment analysis revealed significant roles of the PARK7 interactome in NuRD, CHD, and SWI/SNF protein complexes. In conclusion, we identified 13 novel genes differentially expressed in MPM, never reported before. Among them, BBS1 emerged as a novel predictor of overall survival in MPM. Finally, we identified that PARK7 interactome is involved in novel pathways pertinent in MPM disease. Copyright © 2015 the American Physiological Society.
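The median overall survival figures above come from a Kaplan-Meier analysis. As a rough illustration of how such a median is read off a survival curve, here is a minimal sketch of the Kaplan-Meier estimator; the function names and the follow-up data are hypothetical and unrelated to the study's patients.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time with at least one death.

    times: follow-up time for each patient.
    events: 1 if the patient died at that time, 0 if the observation was censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which estimated survival falls to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Made-up follow-up times in months; 0 marks a censored observation.
toy_times = [2, 4, 4, 6, 9, 12]
toy_events = [1, 1, 0, 1, 1, 0]

curve = kaplan_meier(toy_times, toy_events)
print(curve, median_survival(curve))
```

Comparing two groups, such as high and low expression of one gene, amounts to building one curve per group and comparing the resulting medians.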
Changes in inflorescence protein during advanced stages of floret development in Buchloe dactyloides (Poaceae).

PubMed

Zhou, Y-J; Xue, J-G; Wang, X-G; Zhang, X-Q

2012-11-12

Buffalograss, Buchloe dactyloides, is a dioecious species native to the Great Plains of North America. The florets at the early stages of development possess both gynoecium and androecium organ primordia but later become unisexual. Very little is known about the proteomic changes that occur when the florets change from hermaphroditism to unisexuality. We compared the protein composition of florets at the hermaphroditic stage with that at the unisexual stage. The development stage of the floret was determined by stereomicroscopic observation. Two-dimensional gel electrophoresis was used to separate the proteins extracted from female and male inflorescences. Stage-specific protein maps, with an average of about 400 spots per map, were analyzed with the protein analysis software. Eighteen spots were found to be differentially expressed between the hermaphrodite and unisexual stages. Of these, 12 were present at both stages but with a different expression value. Four specific spots appeared at the hermaphrodite stage and disappeared at the unisexual stage. Two specific protein spots were associated with female and male floret differentiation. One appears to be associated with contabescence in the female floret and the final protein appears to lead to the abortion of gynoecium in the male floret. The MALDI TOF/TOF technique was used for peptide mass fingerprinting of the differentially expressed proteins and the MASCOT software was used to search the protein database. However, only two protein spots were identified from the database. These were aldolase1 and Os05g0574400 (similar to malate dehydrogenase). This type of proteomic study can help to identify novel protein products and determine the mechanisms involved in the floral sex differentiation process in buffalo grass.
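Peptide mass fingerprinting compares measured peptide masses with theoretical masses computed from database sequences. The sketch below shows the theoretical side under simple assumptions: trypsin cleaves after K or R unless the next residue is P, no missed cleavages or modifications are considered, and the 0.2 Da tolerance is arbitrary. It is not how MASCOT scores matches, and the function names are hypothetical.

```python
import re

# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_digest(protein):
    """Split after K or R, except when the following residue is P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed, protein, tol=0.2):
    """Count observed masses lying within `tol` Da of any theoretical peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(protein)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed)
```

A database search would repeat `count_matches` over every candidate sequence and rank candidates by the number (or a statistical score) of matched peaks.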
A novel method to identify pathways associated with renal cell carcinoma based on a gene co-expression network

PubMed Central

RUAN, XIYUN; LI, HONGYUN; LIU, BO; CHEN, JIE; ZHANG, SHIBAO; SUN, ZEQIANG; LIU, SHUANGQING; SUN, FAHAI; LIU, QINGYONG

2015-01-01

The aim of the present study was to develop a novel method for identifying pathways associated with renal cell carcinoma (RCC) based on a gene co-expression network. A framework was established where a co-expression network was derived from the database as well as various co-expression approaches. First, the backbone of the network based on differentially expressed (DE) genes between RCC patients and normal controls was constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database. The differentially co-expressed links were detected by Pearson’s correlation, the empirical Bayesian (EB) approach and Weighted Gene Co-expression Network Analysis (WGCNA). The co-expressed gene pairs were merged by a rank-based algorithm. We obtained 842; 371; 2,883 and 1,595 co-expressed gene pairs from the co-expression networks of the STRING database, Pearson’s correlation, the EB method and WGCNA, respectively. Two hundred and eighty-one differentially co-expressed (DC) gene pairs were obtained from the merged network using this novel method. Pathway enrichment analysis based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and the network enrichment analysis (NEA) method were performed to verify feasibility of the merged method. Results of the KEGG and NEA pathway analyses showed that the network was associated with RCC. The suggested method was computationally efficient to identify pathways associated with RCC and has been identified as a useful complement to traditional co-expression analysis. PMID:26058425
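As a rough illustration of scoring and merging co-expressed gene pairs, the sketch below computes Pearson's correlation for every gene pair and then combines several per-method score tables by average rank. The abstract does not describe the rank-based algorithm, so the averaging rule, the handling of missing pairs, the threshold and all function names are assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpressed_pairs(expr, threshold=0.8):
    """Return {(gene1, gene2): |r|} for pairs whose |r| meets the threshold."""
    pairs = {}
    for g1, g2 in combinations(sorted(expr), 2):
        r = abs(pearson(expr[g1], expr[g2]))
        if r >= threshold:
            pairs[(g1, g2)] = r
    return pairs


def merge_by_rank(score_tables):
    """Merge {pair: score} tables by mean rank (hypothetical merging rule).

    A pair missing from a table receives that table's worst rank plus one.
    Pairs are returned from best (lowest mean rank) to worst.
    """
    all_pairs = set().union(*score_tables)
    mean_rank = {}
    for pair in all_pairs:
        ranks = []
        for table in score_tables:
            ordered = sorted(table, key=table.get, reverse=True)
            ranks.append(ordered.index(pair) + 1 if pair in table else len(ordered) + 1)
        mean_rank[pair] = sum(ranks) / len(ranks)
    return sorted(all_pairs, key=lambda p: (mean_rank[p], p))
```

A pair that ranks highly in several tables rises to the top of the merged list, which is the general intent of a rank-based merge even if the published rule differs.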

A maize database resource that captures tissue-specific and subcellular-localized gene expression, via fluorescent tags and confocal imaging (Maize Cell Genomics Database).

PubMed

Krishnakumar, Vivek; Choi, Yongwook; Beck, Erin; Wu, Qingyu; Luo, Anding; Sylvester, Anne; Jackson, David; Chan, Agnes P

2015-01-01

Maize is a global crop and a powerful system among grain crops for genetic and genomic studies. However, the development of novel biological tools and resources to aid in the functional identification of gene sequences is greatly needed. Towards this goal, we have developed a collection of maize marker lines for studying native gene expression in specific cell types and subcellular compartments using fluorescent proteins (FPs). To catalog FP expression, we have developed a public repository, the Maize Cell Genomics (MCG) Database, (http://maize.jcvi.org/cellgenomics), to organize a large data set of confocal images generated from the maize marker lines. To date, the collection represents major subcellular structures and also developmentally important progenitor cell populations. The resource is available to the research community, for example to study protein localization or interactions under various experimental conditions or mutant backgrounds. A subset of the marker lines can also be used to induce misexpression of target genes through a transactivation system. For future directions, the image repository can be expanded to accept new image submissions from the research community, and to perform customized large-scale computational image analysis. This community resource will provide a suite of new tools for gaining biological insights by following the dynamics of protein expression at the subcellular, cellular and tissue levels. © The Author 2014. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.
Differences in expression of retinal pigment epithelium mRNA between normal canines

PubMed Central

2004-01-01

A reference database of differences in mRNA expression in normal healthy canine retinal pigment epithelium (RPE) has been established. This database identifies non-informative differences in mRNA expression that can be used in screening canine RPE for mutations associated with clinical effects on vision. Complementary DNA (cDNA) pools were prepared from mRNA harvested from RPE, amplified by PCR, and used in a subtractive hybridization protocol (representational differential analysis) to identify differences in RPE mRNA expression between canines. The effect of relatedness of the test canines on the frequency of occurrence of differences was evaluated by using 2 unrelated canines for comparison with 2 female sibling canines of blue heeler/bull terrier lineage. Differentially expressed cDNA species were cloned, sequenced, and identified by comparison to public database entries. The most frequently observed differentially expressed sequence from the unrelated canine comparison was cDNA with 21 base pairs (bp) identical to the human epithelial membrane protein 1 gene (present in 8 of 20 clones). Different clones from the same-sex sibling RPE contained repetitions of several short sequence motifs including the human epithelial membrane protein 1 (4 of 25 clones). Other prevalent differences between sibling RPE included sequences similar to a chicken genetic marker sequence motif (5 of 25), and 6 clones with homology to porcine major histocompatibility loci. In addition to identifying several repetitively occurring, noninformative, differentially expressed RPE mRNA species, the findings confirm that fewer differences occurred between siblings, highlighting the importance of using closely related subjects in representational difference analysis studies. PMID:15352545
SASD: the Synthetic Alternative Splicing Database for identifying novel isoform from proteomics

PubMed Central

2013-01-01

Background Alternative splicing is an important and widespread mechanism for generating protein diversity and regulating protein expression. High-throughput identification and analysis of alternative splicing at the protein level has more advantages than at the mRNA level. The combination of an alternative splicing database and tandem mass spectrometry provides a powerful technique for identification, analysis and characterization of potential novel alternative splicing protein isoforms from proteomics. Therefore, based on the peptidomic database of human protein isoforms for proteomics experiments, our objective is to design a new alternative splicing database to 1) provide more coverage of genes, transcripts and alternative splicing, 2) exclusively focus on the alternative splicing, and 3) perform context-specific alternative splicing analysis. Results We used a three-step pipeline to create a synthetic alternative splicing database (SASD) to identify novel alternative splicing isoforms and interpret them in the context of pathway, disease, drug and organ specificity or custom gene set with maximum coverage and exclusive focus on alternative splicing. First, we extracted information on gene structures of all genes in the Ensembl Genes 71 database and incorporated the Integrated Pathway Analysis Database. Then, we compiled artificial splicing transcripts. Lastly, we translated the artificial transcripts into alternative splicing peptides. The SASD is a comprehensive database containing 56,630 genes (Ensembl gene IDs), 95,260 transcripts (Ensembl transcript IDs), and 11,919,779 Alternative Splicing peptides, and also covering about 1,956 pathways, 6,704 diseases, 5,615 drugs, and 52 organs. The database has a web-based user interface that allows users to search, display and download a single gene/transcript/protein, custom gene set, pathway, disease, drug, organ related alternative splicing. Moreover, the quality of the database was validated with comparison to other known databases and two case studies: 1) in liver cancer and 2) in breast cancer. Conclusions The SASD provides the scientific community with an efficient means to identify, analyze, and characterize novel Exon Skipping and Intron Retention protein isoforms from mass spectrometry and interpret them in the context of pathway, disease, drug and organ specificity or custom gene set with maximum coverage and exclusive focus on alternative splicing. PMID:24267658
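The three-step pipeline (gene structure, artificial splicing transcripts, translated peptides) can be sketched on a toy gene as follows. This is only an illustration under stated assumptions: exons are given as plain strings rather than read from Ensembl, only single internal exon skipping is generated, translation uses reading frame 0 of the standard genetic code, and the function names are hypothetical rather than part of SASD.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def exon_skipping_transcripts(exons):
    """Yield (skipped_index, sequence) for each single internal-exon skip."""
    for i in range(1, len(exons) - 1):
        yield i, "".join(exons[:i] + exons[i + 1:])


def translate(seq):
    """Translate frame 0 up to the first stop codon or the last full codon."""
    peptide = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)
```

With the invented exons `["ATGGCT", "GAAAAA", "TGGTAA"]`, the full transcript translates to `MAEKW`, while skipping the middle exon gives `MAW`; a peptide spanning the new exon junction is what a mass spectrometry search against such a database would look for.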
Proteomic analysis of mature and immature ejaculated spermatozoa from fertile men

PubMed Central

Cui, Zhihong; Sharma, Rakesh; Agarwal, Ashok

2016-01-01

Dysfunctional spermatozoa maturation is the main reason for the decrease in sperm motility and morphology in infertile men. Ejaculated spermatozoa from healthy fertile men were separated into four fractions using three-layer density gradient. Proteins were extracted and bands were digested on a LTQ-Orbitrap Elite hybrid mass spectrometer system. Functional annotations of proteins were obtained using bioinformatics tools and pathway databases. Western blotting was performed to verify the expression levels of the proteins of interest. 1469 proteins were identified in four fractions of spermatozoa. The number of detected proteins decreased according to the maturation level of spermatozoa. During spermatozoa maturation, proteins involved in gamete generation, cell motility, energy metabolism and oxidative phosphorylation processes showed increasing expression levels and those involved in protein biosynthesis, protein transport, protein ubiquitination, and response to oxidative stress processes showed decreasing expression levels. We validated four proteins (HSP 70 1A, clusterin, tektin 2 and tektin 3) by Western blotting. The study shows protein markers that may provide insight into the ejaculated spermatozoa proteins in different stages of sperm maturation that may be altered or modified in infertile men. PMID:26510506
Identification of GRB2 and GAB1 Coexpression as an Unfavorable Prognostic Factor for Hepatocellular Carcinoma by a Combination of Expression Profile and Network Analysis

PubMed Central

Yang, Mei; Wang, Danhua; Yu, Lingxiang; Guo, Chaonan; Guo, Xiaodong; Lin, Na

2013-01-01

Aim To screen novel markers for hepatocellular carcinoma (HCC) by a combination of expression profile, interaction network analysis and clinical validation. Methods HCC significant molecules which are differentially expressed or had genetic variations in HCC tissues were obtained from five existing HCC related databases (OncoDB.HCC, HCC.net, dbHCCvar, EHCO and Liverome). Then, the protein-protein interaction (PPI) network of these molecules was constructed. Three topological features of the network ('Degree', 'Betweenness', and 'Closeness') and the k-core algorithm were used to screen candidate HCC markers which play crucial roles in tumorigenesis of HCC. Furthermore, the clinical significance of two candidate HCC markers growth factor receptor-bound 2 (GRB2) and GRB2-associated-binding protein 1 (GAB1) was validated. Results In total, 6179 HCC significant genes and 977 HCC significant proteins were collected from existing HCC related databases. After network analysis, 331 candidate HCC markers were identified. Especially, GAB1 has the highest k-coreness suggesting its central localization in HCC related network, and the interaction between GRB2 and GAB1 has the largest edge-betweenness implying it may be biologically important to the function of HCC related network. As the results of clinical validation, the expression levels of both GRB2 and GAB1 proteins were significantly higher in HCC tissues than those in their adjacent nonneoplastic tissues. More importantly, the combined GRB2 and GAB1 protein expression was significantly associated with aggressive tumor progression and poor prognosis in patients with HCC. Conclusion This study provided an integrative analysis by combining expression profile and interaction network analysis to identify a list of biologically significant HCC related markers and pathways. 
Further experimental validation indicated that the aberrant expression of GRB2 and GAB1 proteins may be strongly related to tumor progression and prognosis in patients with HCC. The overexpression of GRB2 in combination with upregulation of GAB1 may be an unfavorable prognostic factor for HCC. PMID:24391994
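To make the topological screening concrete, the following sketch computes node degree and k-coreness on an undirected interaction network by repeatedly removing the node with the smallest remaining degree. It is a plain, quadratic-time illustration with hypothetical function names and toy node labels, not the pipeline used in the study; betweenness and closeness are omitted.

```python
from collections import defaultdict


def build_graph(edges):
    """Undirected adjacency sets from (u, v) pairs; self-loops are ignored."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def degree(adj):
    """Number of interaction partners per node."""
    return {node: len(nbrs) for node, nbrs in adj.items()}


def core_numbers(adj):
    """k-coreness of every node, by peeling the minimum-degree node each round."""
    remaining = {node: set(nbrs) for node, nbrs in adj.items()}
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=lambda n: (len(remaining[n]), str(n)))
        # The core number never decreases as the peeling proceeds.
        k = max(k, len(remaining[node]))
        core[node] = k
        for nbr in remaining.pop(node):
            remaining[nbr].discard(node)
    return core
```

In a toy network made of a triangle plus one pendant node, the triangle members get coreness 2 and the pendant node gets 1, so a node with high coreness sits in a densely connected part of the network.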
Generation and Analysis of Expressed Sequence Tags from Olea europaea L.

PubMed Central

Ozdemir Ozgenturk, Nehir; Oruç, Fatma; Sezerman, Ugur; Kuçukural, Alper; Vural Korkut, Senay; Toksoz, Feriha; Un, Cemal

2010-01-01

Olive (Olea europaea L.) is an important source of edible oil that originated in the Near East region. In this study, two cDNA libraries were constructed from young olive leaves and immature olive fruits to generate ESTs, with the aim of discovering novel genes and investigating the function of unknown olive genes. A total of 3840 randomly selected colonies from both libraries were sequenced for EST collection. The 2228 readable sequences from olive leaf and 1506 from olive fruit were assembled into 205 and 69 contigs, respectively, whereas 2478 were singletons. Putative functions of all 2752 differentially expressed unique sequences were assigned by gene homology based on BLAST and annotated using BLAST2GO. While 1339 ESTs showed no homology to the database, 2024 ESTs had homology (under 80%) with hypothetical proteins, putative proteins, expressed proteins, and unknown proteins in NCBI GenBank. For 635 ESTs, unique gene sequences were identified by over 80% homology to genes of known function in other species that had not previously been described in the Olea family. Only 3.1% of the total ESTs showed similarity to the olive sequences already in NCBI. The generated EST data and consensus sequences were submitted to NCBI as a valuable resource for functional genomic studies of olive. PMID:21197085
Proteomics-based approach identified differentially expressed proteins with potential roles in endometrial carcinoma.

PubMed

Li, Zhengyu; Min, Wenjiao; Huang, Canhua; Bai, Shujun; Tang, Minghai; Zhao, Xia

2010-01-01

We used proteomic approaches to identify altered expressed proteins in endometrial carcinoma, with the aim of discovering potential biomarkers or therapeutic targets for endometrial carcinoma. The global proteins extracted from endometrial carcinoma and normal endometrial tissues were separated by 2-dimensional electrophoresis and analyzed with PDQuest (Bio-Rad, Hercules, Calif) software. The differentially expressed spots were identified by mass spectrometry and searched against NCBInr protein database. Those proteins with potential roles were confirmed by Western blotting and immunohistochemical assays. Ninety-nine proteins were identified by mass spectrometry, and a cluster diagram analysis indicated that these proteins were involved in metabolism, cell transformation, protein folding, translation and modification, proliferation and apoptosis, signal transduction, cytoskeleton, and so on. In confirmatory immunoblotting and immunohistochemical analyses, overexpressions of epidermal fatty acid-binding protein, calcyphosine, and cyclophilin A were also observed in endometrial carcinoma tissues, which were consistent with the proteomic results. Our results suggested that these identified proteins, including epidermal fatty acid-binding protein, calcyphosine, and cyclophilin A, might be of potential values in the studies of endometrial carcinogenesis or investigations of diagnostic biomarkers or treatment targets for endometrial carcinoma.
Protein markers for identification of Yersinia pestis and their variation related to culture

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wunschel, David S.; Engelmann, Heather E.; Victry, Kristin D.

2013-12-11

The detection of high consequence pathogens, such as Yersinia pestis, is well established in biodefense laboratories for bioterror situations. Laboratory protocols are well established using specified culture media and a growth temperature of 37 °C for expression of specific antigens. Direct detection of Y. pestis protein markers, without prior culture, depends on their expression. Unfortunately protein expression can be impacted by the culture medium which cannot be predicted ahead of time. Furthermore, higher biomass yields are obtained at the optimal growth temperature (i.e. 28 °C–30 °C) and therefore are more likely to be used for bulk production. Analysis of Y. pestis grown on several types of media at 30 °C showed that several protein markers were found to be differentially detected in different media. Analysis of the identified proteins against a comprehensive database provided an additional level of organism identification. Peptides corresponding to variable regions of some proteins could separate large groups of strains and aid in organism identification. This work illustrates the need to understand variability of protein expression for detection targets. The potential for relating expression changes of known proteins to specific media factors, even in nutrient rich and chemically complex culture medium, may provide the opportunity to draw forensic information from protein profiles.
An ontology-based search engine for protein-protein interactions

PubMed Central

2010-01-01

Background Keyword matching or ID matching is the most common searching method in a large database of protein-protein interactions. They are purely syntactic methods, and retrieve the records in the database that contain a keyword or ID specified in a query. Such syntactic search methods often retrieve too few search results or no results despite many potential matches present in the database. Results We have developed a new method for representing protein-protein interactions and the Gene Ontology (GO) using modified Gödel numbers. This representation is hidden from users but enables a search engine using the representation to efficiently search protein-protein interactions in a biologically meaningful way. Given a query protein with optional search conditions expressed in one or more GO terms, the search engine finds all the interaction partners of the query protein by unique prime factorization of the modified Gödel numbers representing the query protein and the search conditions. Conclusion Representing the biological relations of proteins and their GO annotations by modified Gödel numbers makes a search engine efficiently find all protein-protein interactions by prime factorization of the numbers. Keyword matching or ID matching search methods often miss the interactions involving a protein that has no explicit annotations matching the search condition, but our search engine retrieves such interactions as well if they satisfy the search condition with a more specific term in the ontology. PMID:20122195
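The idea of answering ontology queries by prime factorization can be sketched as follows. The abstract does not spell out the "modified Gödel number" encoding, so this example substitutes a simpler scheme as an assumption: each GO term gets a distinct prime, and a protein is encoded as the product of the primes of its annotated terms and all their ancestors. A query term then matches whenever its prime divides the protein's code. The term labels and function names are invented for illustration.

```python
from math import prod


def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for a toy ontology)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


def assign_primes(terms):
    """Give every ontology term its own prime."""
    gen = primes()
    return {t: next(gen) for t in terms}


def ancestors(term, parents):
    """All terms reachable from `term` via is-a links, including itself."""
    seen, stack = set(), [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(parents.get(t, ()))
    return seen


def encode(annotations, parents, prime_of):
    """Product of the primes of the annotated terms and all their ancestors."""
    closure = set()
    for t in annotations:
        closure |= ancestors(t, parents)
    return prod(prime_of[t] for t in closure)


def matches(code, query_terms, prime_of):
    """True if every query term's prime divides the protein's code."""
    return all(code % prime_of[t] == 0 for t in query_terms)
```

Because ancestors are folded into the code, a protein annotated only with a specific term still matches a query for its more general parent, which is the behaviour that keyword or ID matching misses.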
An ontology-based search engine for protein-protein interactions.

PubMed

Park, Byungkyu; Han, Kyungsook

2010-01-18

Keyword matching or ID matching is the most common searching method in a large database of protein-protein interactions. They are purely syntactic methods, and retrieve the records in the database that contain a keyword or ID specified in a query. Such syntactic search methods often retrieve too few search results or no results despite many potential matches present in the database. We have developed a new method for representing protein-protein interactions and the Gene Ontology (GO) using modified Gödel numbers. This representation is hidden from users but enables a search engine using the representation to efficiently search protein-protein interactions in a biologically meaningful way. Given a query protein with optional search conditions expressed in one or more GO terms, the search engine finds all the interaction partners of the query protein by unique prime factorization of the modified Gödel numbers representing the query protein and the search conditions. Representing the biological relations of proteins and their GO annotations by modified Gödel numbers makes a search engine efficiently find all protein-protein interactions by prime factorization of the numbers. Keyword matching or ID matching search methods often miss the interactions involving a protein that has no explicit annotations matching the search condition, but our search engine retrieves such interactions as well if they satisfy the search condition with a more specific term in the ontology.
Expression and regulation of long noncoding RNAs during the osteogenic differentiation of periodontal ligament stem cells in the inflammatory microenvironment.

PubMed

Zhang, Qingbin; Chen, Li; Cui, Shiman; Li, Yan; Zhao, Qi; Cao, Wei; Lai, Shixiang; Yin, Sanjun; Zuo, Zhixiang; Ren, Jian

2017-10-25

Although long noncoding RNAs (lncRNAs) have been emerging as critical regulators in various tissues and biological processes, little is known about their expression and regulation during the osteogenic differentiation of periodontal ligament stem cells (PDLSCs) in the inflammatory microenvironment. In this study, we have identified 63 lncRNAs that are not annotated in previous databases. These novel lncRNAs were not randomly located in the genome but preferentially located near protein-coding genes related to particular functions and diseases, such as stem cell maintenance and differentiation, development disorders and inflammatory diseases. Moreover, we have identified 650 differentially expressed lncRNAs among different subsets of PDLSCs. Pathway enrichment analysis for neighboring protein-coding genes of these differentially expressed lncRNAs revealed stem cell differentiation related functions. Many of these differentially expressed lncRNAs function as competing endogenous RNAs that regulate protein-coding transcripts through competing shared miRNAs.
ESTuber db: an online database for Tuber borchii EST sequences.

PubMed

Lazzari, Barbara; Caprera, Andrea; Cosentino, Cristian; Stella, Alessandra; Milanesi, Luciano; Viotti, Angelo

2007-03-08

The ESTuber database (http://www.itb.cnr.it/estuber) includes 3,271 Tuber borchii expressed sequence tags (EST). The dataset consists of 2,389 sequences from an in-house prepared cDNA library from truffle vegetative hyphae, and 882 sequences downloaded from GenBank and representing four libraries from white truffle mycelia and ascocarps at different developmental stages. An automated pipeline was prepared to process EST sequences using public software integrated by in-house developed Perl scripts. Data were collected in a MySQL database, which can be queried via a php-based web interface. Sequences included in the ESTuber db were clustered and annotated against three databases: the GenBank nr database, the UniProtKB database and a third in-house prepared database of fungi genomic sequences. An algorithm was implemented to infer statistical classification among Gene Ontology categories from the ontology occurrences deduced from the annotation procedure against the UniProtKB database. Ontologies were also deduced from the annotation of more than 130,000 EST sequences from five filamentous fungi, for intra-species comparison purposes. Further analyses were performed on the ESTuber db dataset, including tandem repeats search and comparison of the putative protein dataset inferred from the EST sequences to the PROSITE database for protein patterns identification. All the analyses were performed both on the complete sequence dataset and on the contig consensus sequences generated by the EST assembly procedure. The resulting web site is a resource of data and links related to truffle expressed genes. The Sequence Report and Contig Report pages are the web interface core structures which, together with the Text search utility and the Blast utility, allow easy access to the data stored in the database.
Protein Expression Profile using Two-Dimensional Gel Analysis in Squamous Cervical Cancer Patients

PubMed Central

Bae, Su-Mi; Min, Hyun-Jin; Ding, Guo Hua; Kwak, Sun-Young; Cho, Young-Lae; Nam, Kye-Hyun; Park, Choong Hak; Kim, Yong-Wan; Kim, Chong-Kook; Han, Byoung-Don; Lee, Young-Joo; Kim, Do Kang

2006-01-01

Purpose Screening in cervical cancer is now progressing to discover candidate genes and proteins that may serve as biological markers and that play a role in tumor progression. We examined the protein expression patterns of the squamous cell carcinoma (SCC) tissues from Korean women using two-dimensional polyacrylamide gel electrophoresis (2-DE) and matrix assisted laser desorption/ionization-time of flight (MALDI-TOF) mass spectrometer. Materials and Methods Normal cervix and SCC tissues were solubilized and 2-DE was performed using pH 3–10 linear IPG strips of 17 cm length. The protein expression was evaluated using PDQuest 2-D software™. The differentially expressed protein spots were identified with a MALDI-TOF mass spectrometer, and the peptide mass spectra identifications were performed using the Mascot program and by searching the Swiss-Prot or NCBInr databases. Results A total of 35 proteins were detected in SCC. 17 proteins were up-regulated and 18 proteins were down-regulated. Among the proteins that were identified, 12 proteins (pigment epithelium derived factor, annexin A2 and A5, keratin 19 and 20, heat shock protein 27, smooth muscle protein 22 alpha, α-enolase, squamous cell carcinoma antigen 1 and 2, glutathione S-transferase and apolipoprotein a1) were proteins previously known to be involved in tumors, and 21 proteins were newly identified in this study. Conclusion 2-DE offers the total protein expression profiles of SCC tissues; further characterization of these differentially expressed proteins will give a chance to identify the badly needed tumor-specific diagnostic markers for SCC. PMID:19771267
A Graph-Centric Approach for Metagenome-Guided Peptide and Protein Identification in Metaproteomics

PubMed Central

Tang, Haixu; Li, Sujun; Ye, Yuzhen

2016-01-01

Metaproteomic studies adopt the common bottom-up proteomics approach to investigate the protein composition and the dynamics of protein expression in microbial communities. When matched metagenomic and/or metatranscriptomic data of the microbial communities are available, metaproteomic data analyses often employ a metagenome-guided approach, in which complete or fragmental protein-coding genes are first directly predicted from metagenomic (and/or metatranscriptomic) sequences or from their assemblies, and the resulting protein sequences are then used as the reference database for peptide/protein identification from MS/MS spectra. This approach is often limited because protein coding genes predicted from metagenomes are incomplete and fragmental. In this paper, we present a graph-centric approach to improving metagenome-guided peptide and protein identification in metaproteomics. Our method exploits the de Bruijn graph structure reported by metagenome assembly algorithms to generate a comprehensive database of protein sequences encoded in the community. We tested our method using several public metaproteomic datasets with matched metagenomic and metatranscriptomic sequencing data acquired from complex microbial communities in a biological wastewater treatment plant. The results showed that many more peptides and proteins can be identified when assembly graphs were utilized, improving the characterization of the proteins expressed in the microbial communities. The additional proteins we identified contribute to the characterization of important pathways such as those involved in degradation of chemical hazards. Our tools are released as open-source software on github at https://github.com/COL-IU/Graph2Pro. PMID:27918579
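To illustrate why an assembly graph can recover sequences that fragmented contigs miss, the sketch below builds a small de Bruijn graph from reads and spells out the sequences along its paths. It is a toy illustration, not the Graph2Pro algorithm: the function names, the simple-path restriction and the length cap are assumptions, and the step of predicting and translating protein-coding regions along the paths is omitted.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def spell_paths(graph, start, max_len):
    """Sequences spelled by simple paths from `start`, capped at max_len bases."""
    results = []

    def walk(node, seq, visited):
        nxt = [n for n in sorted(graph.get(node, ())) if n not in visited]
        if not nxt or len(seq) >= max_len:
            results.append(seq)
            return
        for n in nxt:
            # Each step along an edge appends one new base to the sequence.
            walk(n, seq + n[-1], visited | {n})

    walk(start, start, {start})
    return results
```

With the invented reads `ATGGC`, `GGCTA` and `GGCAA` and `k = 3`, the graph branches after `GC`, and walking from `AT` yields both `ATGGCAA` and `ATGGCTA`; a linear assembly would typically break at that branch and keep neither full-length variant.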
On-the-fly selection of cell-specific enhancers, genes, miRNAs and proteins across the human body using SlideBase

PubMed Central

Ienasescu, Hans; Li, Kang; Andersson, Robin; Vitezic, Morana; Rennie, Sarah; Chen, Yun; Vitting-Seerup, Kristoffer; Lagoni, Emil; Boyd, Mette; Bornholdt, Jette; de Hoon, Michiel J. L.; Kawaji, Hideya; Lassmann, Timo; Hayashizaki, Yoshihide; Forrest, Alistair R. R.; Carninci, Piero; Sandelin, Albin

2016-01-01

Genomics consortia have produced large datasets profiling the expression of genes, micro-RNAs, enhancers and more across human tissues or cells. There is a need for intuitive tools to select the subsets of such data that are most relevant for specific studies. To this end, we present SlideBase, a web tool which offers a new way of selecting genes, promoters, enhancers and microRNAs that are preferentially expressed/used in a specified set of cells/tissues, based on the use of interactive sliders. With the help of sliders, SlideBase enables users to define custom expression thresholds for individual cell types/tissues, producing sets of genes, enhancers etc. which satisfy these constraints. Changes in slider settings result in simultaneous changes in the selected sets, updated in real time. SlideBase is linked to major databases from genomics consortia, including FANTOM, GTEx, The Human Protein Atlas and BioGPS. Database URL: http://slidebase.binf.ku.dk PMID:28025337
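The slider-based selection amounts to filtering features by a per-tissue expression range. A minimal sketch of that filter is shown below; the data layout, the treatment of missing values as zero and the function name are assumptions and do not reflect SlideBase's actual implementation or API.

```python
def select_features(expression, sliders):
    """Return features whose expression lies inside every slider range.

    expression : {feature: {tissue: value}}
    sliders    : {tissue: (low, high)}; tissues without a slider are unconstrained
    """
    selected = []
    for feature, by_tissue in expression.items():
        # A tissue with no measurement is treated as zero expression (assumption).
        if all(low <= by_tissue.get(tissue, 0.0) <= high
               for tissue, (low, high) in sliders.items()):
            selected.append(feature)
    return sorted(selected)
```

Re-running the function whenever a slider moves reproduces the "updated in real time" behaviour: widening one tissue's range can only add features to the selected set.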
Proteogenomics: Integrating Next-Generation Sequencing and Mass Spectrometry to Characterize Human Proteomic Variation

NASA Astrophysics Data System (ADS)

Sheynkman, Gloria M.; Shortreed, Michael R.; Cesnik, Anthony J.; Smith, Lloyd M.

2016-06-01

Mass spectrometry-based proteomics has emerged as the leading method for detection, quantification, and characterization of proteins. Nearly all proteomic workflows rely on proteomic databases to identify peptides and proteins, but these databases typically contain a generic set of proteins that lack variations unique to a given sample, precluding their detection. Fortunately, proteogenomics enables the detection of such proteomic variations and can be defined, broadly, as the use of nucleotide sequences to generate candidate protein sequences for mass spectrometry database searching. Proteogenomics is experiencing heightened significance due to two developments: (a) advances in DNA sequencing technologies that have made complete sequencing of human genomes and transcriptomes routine, and (b) the unveiling of the tremendous complexity of the human proteome as expressed at the levels of genes, cells, tissues, individuals, and populations. We review here the field of human proteogenomics, with an emphasis on its history, current implementations, the types of proteomic variations it reveals, and several important applications.
Proteogenomics: Integrating Next-Generation Sequencing and Mass Spectrometry to Characterize Human Proteomic Variation

PubMed Central

Sheynkman, Gloria M.; Shortreed, Michael R.; Cesnik, Anthony J.; Smith, Lloyd M.

2016-01-01

Mass spectrometry–based proteomics has emerged as the leading method for detection, quantification, and characterization of proteins. Nearly all proteomic workflows rely on proteomic databases to identify peptides and proteins, but these databases typically contain a generic set of proteins that lack variations unique to a given sample, precluding their detection. Fortunately, proteogenomics enables the detection of such proteomic variations and can be defined, broadly, as the use of nucleotide sequences to generate candidate protein sequences for mass spectrometry database searching. Proteogenomics is experiencing heightened significance due to two developments: (a) advances in DNA sequencing technologies that have made complete sequencing of human genomes and transcriptomes routine, and (b) the unveiling of the tremendous complexity of the human proteome as expressed at the levels of genes, cells, tissues, individuals, and populations. We review here the field of human proteogenomics, with an emphasis on its history, current implementations, the types of proteomic variations it reveals, and several important applications. PMID:27049631
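The core idea of proteogenomics as defined above, using nucleotide sequences to generate candidate protein sequences for database searching, can be sketched in Python. This is a simplified illustration under stated assumptions: it uses the standard genetic code, translates forward frames only, and the two short sequences are hypothetical examples rather than real transcripts.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate a nucleotide sequence in frame 0; '*' marks a stop codon."""
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    # Codons containing ambiguous bases are reported as 'X'.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def forward_frames(seq: str) -> list:
    """Translate the three forward reading frames of a sequence."""
    return [translate(seq[offset:]) for offset in range(3)]


# Hypothetical reference and sample-specific variant (single A->G change).
reference = "ATGGCCAAGTGA"
variant = "ATGGCCGAGTGA"
reference_protein = translate(reference)  # lysine at position 3
variant_protein = translate(variant)      # glutamate at position 3
```

A generic database containing only the reference protein could not match a peptide carrying the variant residue, which is the gap that sample-specific candidate sequences are meant to close.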
Distinct profiles of expressed sequence tags during intestinal regeneration in the sea cucumber Holothuria glaberrima

PubMed Central

Rojas-Cartagena, Carmencita; Ortíz-Pineda, Pablo; Ramírez-Gómez, Francisco; Suárez-Castillo, Edna C.; Matos-Cruz, Vanessa; Rodríguez, Carlos; Ortíz-Zuazaga, Humberto; García-Arrarás, José E.

2010-01-01

Repair and regeneration are key processes for tissue maintenance, and their disruption may lead to disease states. Little is known about the molecular mechanisms that underlie the repair and regeneration of the digestive tract. The sea cucumber Holothuria glaberrima represents an excellent model to dissect and characterize the molecular events during intestinal regeneration. To study the gene expression profile, cDNA libraries were constructed from normal, 3-day, and 7-day regenerating intestines of H. glaberrima. Clones were randomly sequenced and queried against the nonredundant protein database at the National Center for Biotechnology Information. RT-PCR analyses of several genes were performed to determine their expression profile during intestinal regeneration. A total of 5,173 sequences from three cDNA libraries were obtained. About 46.2, 35.6, and 26.2% of the sequences for the normal, 3-day, and 7-day cDNA libraries, respectively, shared significant similarity with known sequences in the protein database of GenBank, but the libraries showed only 10% similarity among them. Analysis of the libraries in terms of functional processes, protein domains, and most common sequences suggests that a differential expression profile is taking place during the regeneration process. Further examination of the expressed sequence tag dataset revealed that 12 putative genes are differentially expressed at a significant level (R > 6). Experimental validation by RT-PCR analysis reveals that at least three genes (unknown C-4677-1, melanotransferrin, and centaurin) are differentially expressed during regeneration. These findings strongly suggest that the gene expression profile varies among regeneration stages and provide evidence for the existence of differential gene expression. PMID:17579180
The Gene Expression Omnibus Database.

PubMed

Clough, Emily; Barrett, Tanya

2016-01-01

The Gene Expression Omnibus (GEO) database is an international public repository that archives and freely distributes high-throughput gene expression and other functional genomics data sets. Created in 2000 as a worldwide resource for gene expression studies, GEO has evolved with rapidly changing technologies and now accepts high-throughput data for many other data applications, including those that examine genome methylation, chromatin structure, and genome-protein interactions. GEO supports community-derived reporting standards that specify provision of several critical study elements including raw data, processed data, and descriptive metadata. The database not only provides access to data for tens of thousands of studies, but also offers various Web-based tools and strategies that enable users to locate data relevant to their specific interests, as well as to visualize and analyze the data. This chapter includes detailed descriptions of methods to query and download GEO data and use the analysis and visualization tools. The GEO homepage is at http://www.ncbi.nlm.nih.gov/geo/.
The Gene Expression Omnibus database

PubMed Central

Clough, Emily; Barrett, Tanya

2016-01-01

The Gene Expression Omnibus (GEO) database is an international public repository that archives and freely distributes high-throughput gene expression and other functional genomics data sets. Created in 2000 as a worldwide resource for gene expression studies, GEO has evolved with rapidly changing technologies and now accepts high-throughput data for many other data applications, including those that examine genome methylation, chromatin structure, and genome–protein interactions. GEO supports community-derived reporting standards that specify provision of several critical study elements including raw data, processed data, and descriptive metadata. The database not only provides access to data for tens of thousands of studies, but also offers various Web-based tools and strategies that enable users to locate data relevant to their specific interests, as well as to visualize and analyze the data. This chapter includes detailed descriptions of methods to query and download GEO data and use the analysis and visualization tools. The GEO homepage is at http://www.ncbi.nlm.nih.gov/geo/. PMID:27008011
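As one possible programmatic route to querying GEO, the sketch below builds a search URL for the NCBI E-utilities `esearch` endpoint against the GEO DataSets database (`db=gds`). The abstract does not name this route, so treat it as an assumption; the helper `geo_search_url` and the search term are hypothetical, and the code only constructs the URL without sending a request.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint; "gds" is the GEO DataSets database name.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def geo_search_url(term: str, retmax: int = 20) -> str:
    """Build an esearch URL that looks up GEO DataSets records matching `term`."""
    params = {
        "db": "gds",         # search GEO DataSets
        "term": term,        # free-text query
        "retmax": retmax,    # maximum number of record IDs to return
        "retmode": "json",   # ask for a JSON response
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


# Hypothetical free-text query, used only to show the resulting URL shape.
url = geo_search_url("osteoarthritis synovium")
```

Fetching the URL (for example with `urllib.request.urlopen`) would return the matching record identifiers; the GEO website's own query tools remain the route the chapter describes.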

A Database of Reaction Monitoring Mass Spectrometry Assays for Elucidating Therapeutic Response in Cancer

PubMed Central

Remily-Wood, Elizabeth R.; Liu, Richard Z.; Xiang, Yun; Chen, Yi; Thomas, C. Eric; Rajyaguru, Neal; Kaufman, Laura M.; Ochoa, Joana E.; Hazlehurst, Lori; Pinilla-Ibarz, Javier; Lancet, Jeffrey; Zhang, Guolin; Haura, Eric; Shibata, David; Yeatman, Timothy; Smalley, Keiran S.M.; Dalton, William S.; Huang, Emina; Scott, Ed; Bloom, Gregory C.; Eschrich, Steven A.; Koomen, John M.

2012-01-01

Purpose The Quantitative Assay Database (QuAD), http://proteome.moffitt.org/QUAD/, facilitates widespread implementation of quantitative mass spectrometry in cancer biology and clinical research through sharing of methods and reagents for monitoring protein expression and modification. Experimental Design Liquid chromatography coupled to multiple reaction monitoring mass spectrometry (LC-MRM) assays are developed using SDS-PAGE fractionated lysates from cancer cell lines. Pathway maps created using GeneGO Metacore provide the biological relationships between proteins and illustrate concepts for multiplexed analysis; each protein can be selected to examine assay development at the protein and peptide level. Results The coupling of SDS-PAGE and LC-MRM screening has been used to detect 876 peptides from 218 cancer-related proteins in model systems including colon, lung, melanoma, leukemias, and myeloma, which has led to the development of 95 quantitative assays including stable-isotope labeled peptide standards. Methods are published online and peptide standards are made available to the research community. Protein expression measurements for heat shock proteins, including a comparison with ELISA and monitoring response to the HSP90 inhibitor, 17-DMAG, are used to illustrate the components of the QuAD and its potential utility. Conclusions and Clinical Relevance This resource enables quantitative assessment of protein components of signaling pathways and biological processes and holds promise for systematic investigation of treatment responses in cancer. PMID:21656910
Generation and analysis of expressed sequence tags from the bone marrow of Chinese Sika deer.

PubMed

Yao, Baojin; Zhao, Yu; Zhang, Mei; Li, Juan

2012-03-01

Sika deer is one of the best-known and highly valued animals of China. Despite its economic, cultural, and biological importance, there has not been a large-scale sequencing project for Sika deer to date. With the ultimate goal of sequencing the complete genome of this organism, we first established a bone marrow cDNA library for Sika deer and generated a total of 2,025 reads. After processing the sequences, 2,017 high-quality expressed sequence tags (ESTs) were obtained. These ESTs were assembled into 1,157 unigenes, including 238 contigs and 919 singletons. Comparative analyses indicated that 888 (76.75%) of the unigenes had significant matches to sequences in the non-redundant protein database. In addition to highly expressed genes, such as stearoyl-CoA desaturase, cytochrome c oxidase, adipocyte-type fatty acid-binding protein, adiponectin and thymosin beta-4, we also obtained vascular endothelial growth factor-A and heparin-binding growth-associated molecule, both of which are of great importance for angiogenesis research. There were 244 (21.09%) unigenes with no significant match to any sequence in current protein or nucleotide databases, and these sequences may represent genes with unknown function in Sika deer. Open reading frame analysis of the sequences was performed using the getorf program. In addition, the sequences were functionally classified using the gene ontology hierarchy, clusters of orthologous groups of proteins and Kyoto encyclopedia of genes and genomes databases. Analysis of ESTs described in this paper provides an important resource for the transcriptome exploration of Sika deer, and will also facilitate further studies on functional genomics, gene discovery and genome annotation of Sika deer.
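The counts reported in this record are internally consistent, which a short Python check makes explicit. The numbers come from the abstract; the helper `pct` is a hypothetical name introduced for the illustration.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


contigs, singletons = 238, 919
unigenes = contigs + singletons  # 1,157 unigenes, as reported

nr_matches = 888   # unigenes with significant matches in the nr protein database
no_match = 244     # unigenes with no match in protein or nucleotide databases

nr_fraction = pct(nr_matches, unigenes)        # reported as 76.75%
no_match_fraction = pct(no_match, unigenes)    # reported as 21.09%

# The two categories do not sum to the total because some unigenes match
# nucleotide databases only; the abstract does not state that count directly.
remaining = unigenes - nr_matches - no_match
```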
MicroRNA expression, target genes, and signaling pathways in infants with a ventricular septal defect.

PubMed

Chai, Hui; Yan, Zhaoyuan; Huang, Ke; Jiang, Yuanqing; Zhang, Lin

2018-02-01

This study aimed to systematically investigate the relationship between miRNA expression and the occurrence of ventricular septal defect (VSD), and characterize the miRNA target genes and pathways that can lead to VSD. The miRNAs that were differentially expressed in blood samples from VSD and normal infants were screened and validated by implementing miRNA microarrays and qRT-PCR. The target genes regulated by differentially expressed miRNAs were predicted using three target gene databases. The functions and signaling pathways of the target genes were enriched using the GO database and KEGG database, respectively. The transcription and protein expression of specific target genes in critical pathways were compared in the VSD and normal control groups using qRT-PCR and western blotting, respectively. Compared with the normal control group, the VSD group had 22 differentially expressed miRNAs; 19 were downregulated and three were upregulated. The 10,677 predicted target genes participated in many biological functions related to cardiac development and morphogenesis. Four target genes (mGLUR, Gq, PLC, and PKC) were involved in the PKC pathway and four (ECM, FAK, PI3K, and PDK1) were involved in the PI3K-Akt pathway. The transcription and protein expression of these eight target genes were significantly upregulated in the VSD group. The 22 miRNAs that were dysregulated in the VSD group were mainly downregulated, which may result in the dysregulation of several key genes and biological functions related to cardiac development. These effects could also be exerted via the upregulation of eight specific target genes, the subsequent over-activation of the PKC and PI3K-Akt pathways, and the eventual abnormal cardiac development and VSD.
[Establishment of a comprehensive database for laryngeal cancer related genes and the miRNAs].

PubMed

Li, Mengjiao; E, Qimin; Liu, Jialin; Huang, Tingting; Liang, Chuanyu

2015-09-01

The aim was to build a comprehensive laryngeal cancer-related gene database by collecting and analyzing laryngeal cancer-related genes and miRNAs. Unlike current biological information databases, which have complex and clumsy structures, it focuses on genes and miRNAs and could make research and teaching more convenient and efficient. Based on the B/S (browser/server) architecture, using Apache as the Web server, MySQL for the database design and PHP as the coding language for the web design, a comprehensive database for laryngeal cancer-related genes was established, providing gene tables, protein tables, miRNA tables and clinical information tables for patients with laryngeal cancer. The established database contained 207 laryngeal cancer-related genes, 243 proteins, 26 miRNAs, and their particular information such as mutations, methylations, diversified expressions, and the empirical references of laryngeal cancer-relevant molecules. The database could be accessed and operated via the Internet, through which browsing and retrieval of the information were performed. The database was maintained and updated regularly. The database for laryngeal cancer-related genes is resource-integrated and user-friendly, providing a genetic information query tool for the study of laryngeal cancer.
Investigation of candidate genes for osteoarthritis based on gene expression profiles.

PubMed

Dong, Shuanghai; Xia, Tian; Wang, Lei; Zhao, Qinghua; Tian, Jiwei

2016-12-01

This study aimed to explore the mechanism of osteoarthritis (OA) and provide valid biological information for further investigation. The gene expression profile GSE46750 was downloaded from the Gene Expression Omnibus database. The Linear Models for Microarray Data (limma) package (Bioconductor project, http://www.bioconductor.org/packages/release/bioc/html/limma.html) was used to identify differentially expressed genes (DEGs) in inflamed OA samples. Gene Ontology function enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis of DEGs were performed based on Database for Annotation, Visualization and Integrated Discovery data, and a protein-protein interaction (PPI) network was constructed based on the Search Tool for the Retrieval of Interacting Genes/Proteins database. A regulatory network was screened based on the Encyclopedia of DNA Elements. Molecular Complex Detection was used for sub-network screening. The two sub-networks with the highest node degree were integrated with the transcriptional regulatory network, and KEGG functional enrichment analysis was performed for the 2 modules. In total, 401 up- and 196 down-regulated DEGs were obtained. Up-regulated DEGs were involved in inflammatory response, while down-regulated DEGs were involved in cell cycle. A PPI network with 2,392 protein interactions was constructed. Moreover, 10 genes including Interleukin 6 (IL6) and Aurora B kinase (AURKB) stood out in the PPI network. There were 214 up- and 8 down-regulated transcription factor (TF)-target pairs in the TF regulatory network. Module 1 had TFs including SPI1, PRDM1, and FOS, while module 2 contained FOSL1. The nodes in module 1 were enriched in chemokine signaling pathway, while the nodes in module 2 were mainly enriched in cell cycle.
The screened DEGs including IL6, AGT, and AURKB might be potential biomarkers for gene therapy for OA by being regulated by TFs such as FOS and SPI1, and participating in the cell cycle and cytokine-cytokine receptor interaction pathway. Copyright © 2016 Turkish Association of Orthopaedics and Traumatology. Production and hosting by Elsevier B.V. All rights reserved.
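Node degree, the criterion this record uses to rank nodes and sub-networks in the PPI network, is simply the number of distinct interaction partners of a protein. The Python sketch below computes it from an edge list; the function name and the toy edges with placeholder protein names are hypothetical and do not reproduce the study's network.

```python
from collections import Counter
from typing import Iterable, Tuple


def node_degrees(edges: Iterable[Tuple[str, str]]) -> Counter:
    """Count distinct interaction partners per node in an undirected network."""
    # Deduplicate edges regardless of direction and drop self-interactions.
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degrees = Counter()
    for edge in unique_edges:
        for node in edge:
            degrees[node] += 1
    return degrees


# Toy undirected edge list with placeholder names (illustration only).
toy_edges = [
    ("P1", "P2"),
    ("P1", "P3"),
    ("P1", "P4"),
    ("P2", "P3"),
    ("P3", "P2"),  # same interaction listed in the opposite direction
]

degrees = node_degrees(toy_edges)
# The highest-degree node is the candidate hub of this toy network.
hub, hub_degree = degrees.most_common(1)[0]
```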
The systematic annotation of the three main GPCR families in Reactome.

PubMed

Jassal, Bijay; Jupe, Steven; Caudy, Michael; Birney, Ewan; Stein, Lincoln; Hermjakob, Henning; D'Eustachio, Peter

2010-07-29

Reactome is an open-source, freely available database of human biological pathways and processes. A major goal of our work is to provide an integrated view of cellular signalling processes that spans from ligand-receptor interactions to molecular readouts at the level of metabolic and transcriptional events. To this end, we have built the first catalogue of all human G protein-coupled receptors (GPCRs) known to bind endogenous or natural ligands. The UniProt database has records for 797 proteins classified as GPCRs and sorted into families A/1, B/2 and C/3 on the basis of amino acid sequence. To these records we have added details from the IUPHAR database and our own manual curation of relevant literature to create reactions in which 563 GPCRs bind ligands and also interact with specific G-proteins to initiate signalling cascades. We believe the remaining 234 GPCRs are true orphans. The Reactome GPCR pathway can be viewed as a detailed interactive diagram and can be exported in many forms. It provides a template for the orthology-based inference of GPCR reactions for diverse model organism species, and can be overlaid with protein-protein interaction and gene expression datasets to facilitate overrepresentation studies and other forms of pathway analysis. Database URL: http://www.reactome.org.
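Overrepresentation analysis of the kind mentioned at the end of this record is commonly done with a one-sided hypergeometric test. The sketch below shows that test in Python as an assumption about the method; the abstract does not say which statistic Reactome applies, and the function name and parameter names are hypothetical.

```python
from math import comb


def overrepresentation_p(population: int, pathway_size: int,
                         sample_size: int, hits: int) -> float:
    """One-sided hypergeometric p-value P(X >= hits).

    population   -- number of genes in the background set
    pathway_size -- number of background genes annotated to the pathway
    sample_size  -- number of genes in the submitted list
    hits         -- number of submitted genes that fall in the pathway
    """
    total = comb(population, sample_size)
    upper = min(pathway_size, sample_size)
    favourable = sum(
        comb(pathway_size, i) * comb(population - pathway_size, sample_size - i)
        for i in range(hits, upper + 1)
    )
    return favourable / total


# Small worked case: 10 background genes, 4 in the pathway, 3 submitted,
# at least 2 of them in the pathway: (6*6 + 4*1) / 120 = 1/3.
p_small = overrepresentation_p(10, 4, 3, 2)

# The orphan count in the abstract follows from the two totals it reports.
orphan_gpcrs = 797 - 563  # 234
```

In practice the p-values for many pathways would also need a multiple-testing correction, which is omitted here.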
Investigation of SnSPR1, a novel and abundant surface protein of Sarcocystis neurona merozoites.

PubMed

Zhang, Deqing; Howe, Daniel K

2008-04-15

An expressed sequence tag (EST) sequencing project has produced over 15,000 partial cDNA sequences from the equine pathogen Sarcocystis neurona. While many of the sequences are clear homologues of previously characterized genes, a significant number of the S. neurona ESTs do not exhibit similarity to anything in the extensive sequence databases that have been generated. In an effort to characterize parasite proteins that are novel to S. neurona, a seemingly unique gene was selected for further investigation based on its abundant representation in the collection of ESTs and the predicted presence of a signal peptide and glycolipid anchor addition on the encoded protein. The gene was expressed in E. coli, and monospecific polyclonal antiserum against the recombinant protein was produced by immunization of a rabbit. Characterization of the native protein in S. neurona merozoites and schizonts revealed that it is a low molecular weight surface protein that is expressed throughout intracellular development of the parasite. The protein was designated Surface Protein 1 (SPR1) to reflect its display on the outer surface of merozoites and to distinguish it from the ubiquitous SAG/SRS surface antigens of the heteroxenous Coccidia. Interestingly, infection assays in the presence of the polyclonal antiserum suggested that SnSPR1 plays some role in attachment and/or invasion of host cells by S. neurona merozoites. The work described herein represents a general template for selecting and characterizing the various unidentified gene sequences that are plentiful in the EST databases for S. neurona and other apicomplexans. Furthermore, this study illustrates the value of investigating these novel sequences since it can offer new candidates for diagnostic or vaccine development while also providing greater insight into the biology of these parasites.
Analysis of 10,000 ESTs from lymphocytes of the cynomolgus monkey to improve our understanding of its immune system

PubMed Central

Chen, Wei-Hua; Wang, Xue-Xia; Lin, Wei; He, Xiao-Wei; Wu, Zhen-Qiang; Lin, Ying; Hu, Song-Nian; Wang, Xiao-Ning

2006-01-01

Background The cynomolgus monkey (Macaca fascicularis) is one of the most widely used surrogate animal models for an increasing number of human diseases and vaccines, especially immune-system-related ones. Towards a better understanding of the gene expression background upon its immunogenetics, we constructed a cDNA library from Epstein-Barr virus (EBV)-transformed B lymphocytes of a cynomolgus monkey and sequenced 10,000 randomly picked clones. Results After processing, 8,312 high-quality expressed sequence tags (ESTs) were generated and assembled into 3,728 unigenes. Annotations of these uniquely expressed transcripts demonstrated that out of the 2,524 open reading frame (ORF) positive unigenes (mitochondrial and ribosomal sequences were not included), 98.8% shared significant similarities (E-value less than 1e-10) with the NCBI nucleotide (nt) database, while only 67.7% (E-value less than 1e-5) did so with the NCBI non-redundant protein (nr) database. Further analysis revealed that 90.0% of the unigenes that shared no similarities to the nr database could be assigned to human chromosomes, in which 75 did not match significantly to any cynomolgus monkey and human ESTs. The mapping regions to known human genes on the human genome were described in detail. The protein family and domain analysis revealed that the first, second and fourth of the most abundantly expressed protein families were all assigned to immunoglobulin and major histocompatibility complex (MHC)-related proteins. The expression profiles of these genes were compared with that of homologous genes in human blood, lymph nodes and a RAMOS cell line, which demonstrated expression changes after transformation with EBV. The degree of sequence similarity of the MHC class I and II genes to the human reference sequences was evaluated. The results indicated that class I molecules showed weak amino acid identities (<90%), while class II showed slightly higher ones. 
Conclusion These results indicated that the genes expressed in the cynomolgus monkey could be used to identify novel protein-coding genes and revise those incomplete or incorrect annotations in the human genome by comparative methods, since the old world monkeys and humans share high similarities at the molecular level, especially within coding regions. The identification of multiple genes involved in the immune response, their sequence variations to the human homologues, and their responses to EBV infection could provide useful information to improve our understanding of the cynomolgus monkey immune system. PMID:16618371
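The amino acid identity figures quoted for the MHC class I and II comparisons can be illustrated with a minimal percent-identity calculation over two pre-aligned sequences. Definitions of identity vary between tools; this Python sketch assumes gapped columns are excluded from the denominator, and the short sequences in the usage lines are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns without a gap character in either sequence.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Hypothetical five-residue alignment with one mismatch: 4 of 5 columns agree.
identity = percent_identity("MKTAY", "MKSAY")

# With a gap in one sequence, the gapped column is ignored.
identity_gapped = percent_identity("MK-AY", "MKSAY")
```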
Histoplasma capsulatum proteome response to decreased iron availability

PubMed Central

Winters, Michael S; Spellman, Daniel S; Chan, Qilin; Gomez, Francisco J; Hernandez, Margarita; Catron, Brittany; Smulian, Alan G; Neubert, Thomas A; Deepe, George S

2008-01-01

Background A fundamental pathogenic feature of the fungus Histoplasma capsulatum is its ability to evade innate and adaptive immune defenses. Once ingested by macrophages the organism is faced with several hostile environmental conditions including iron limitation. H. capsulatum can establish a persistent state within the macrophage. A gap in knowledge exists because the identities and number of proteins regulated by the organism under host conditions has yet to be defined. Lack of such knowledge is an important problem because until these proteins are identified it is unlikely that they can be targeted as new and innovative treatment for histoplasmosis. Results To investigate the proteomic response by H. capsulatum to decreasing iron availability we have created H. capsulatum protein/genomic databases compatible with current mass spectrometric (MS) search engines. Databases were assembled from the H. capsulatum G217B strain genome using gene prediction programs and expressed sequence tag (EST) libraries. Searching these databases with MS data generated from two dimensional (2D) in-gel digestions of proteins resulted in over 50% more proteins identified compared to searching the publicly available fungal databases alone. Using 2D gel electrophoresis combined with statistical analysis we discovered 42 H. capsulatum proteins whose abundance was significantly modulated when iron concentrations were lowered. Altered proteins were identified by mass spectrometry and database searching to be involved in glycolysis, the tricarboxylic acid cycle, lysine metabolism, protein synthesis, and one protein sequence whose function was unknown. Conclusion We have created a bioinformatics platform for H. capsulatum and demonstrated the utility of a proteomic approach by identifying a shift in metabolism the organism utilizes to cope with the hostile conditions provided by the host. 
We have shown that enzyme transcripts regulated by other fungal pathogens in response to lowering iron availability are also regulated in H. capsulatum at the protein level. We also identified H. capsulatum proteins sensitive to iron level reductions which have yet to be connected to iron availability in other pathogens. These data also indicate the complexity of the response by H. capsulatum to nutritional deprivation. Finally, we demonstrate the importance of a strain specific gene/protein database for H. capsulatum proteomic analysis. PMID:19108728
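Searching MS data against a strain-specific protein database relies on digesting each database protein in silico and comparing the resulting peptides with the observed ones. The Python sketch below shows that digestion step, assuming trypsin (cleavage after K or R, but not before P) with no missed cleavages; the abstract mentions in-gel digestion without naming the enzyme, and the example sequence is hypothetical.

```python
from typing import List


def tryptic_digest(protein: str) -> List[str]:
    """Split a protein after K or R unless the next residue is P."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    # Keep any C-terminal remainder that does not end in K or R.
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


# Hypothetical sequence: the R followed by P is not cleaved.
peptides = tryptic_digest("MAKRPLKGR")
```

A protein missing from the database yields no theoretical peptides to match, which is consistent with the record's observation that adding strain-specific sequences increased the number of identifications.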
Isthmin 1 Is a Secreted Protein Expressed in Skin, Mucosal Tissues, and NK, NKT, and Th17 Cells

PubMed Central

Valle-Rios, Ricardo; Maravillas-Montero, José L.; Burkhardt, Amanda M.; Martinez, Cynthia; Buhren, Bettina Alexandra; Homey, Bernhard; Gerber, Peter Arne; Robinson, Octavio; Hevezi, Peter; Zlotnik, Albert

2014-01-01

Using a comprehensive microarray database of human gene expression, we identified that in mammals, a secreted protein known as isthmin 1 (ISM1) is expressed in skin, mucosal tissues, and selected lymphocyte populations. ISM1 was originally identified in Xenopus brain during development, and it encodes a predicted ∼50-kDa protein containing a signal peptide, a thrombospondin domain, and an adhesion-associated domain. We confirmed the pattern of expression of ISM1 in both human and mouse tissues. ISM1 is expressed by DX5+ lung lymphocytes that include NK and NKT-like cells, and is also expressed by some CD4+ T cells upon activation but its expression increases significantly when CD4+ T cells were polarized to the Th17 lineage in vitro. The presence of IFN-γ during CD4+ T cell polarization inhibits ISM1 expression. Given that ISM1 has been reported to have anti-angiogenic properties, these observations suggest that ISM1 is a mediator of lymphocyte effector functions and may participate in both innate and acquired immune responses. PMID:24956034
Isthmin 1 is a secreted protein expressed in skin, mucosal tissues, and NK, NKT, and th17 cells.

PubMed

Valle-Rios, Ricardo; Maravillas-Montero, José L; Burkhardt, Amanda M; Martinez, Cynthia; Buhren, Bettina Alexandra; Homey, Bernhard; Gerber, Peter Arne; Robinson, Octavio; Hevezi, Peter; Zlotnik, Albert

2014-10-01

Using a comprehensive microarray database of human gene expression, we identified that in mammals, a secreted protein known as isthmin 1 (ISM1) is expressed in skin, mucosal tissues, and selected lymphocyte populations. ISM1 was originally identified in Xenopus brain during development, and it encodes a predicted ∼50-kDa protein containing a signal peptide, a thrombospondin domain, and an adhesion-associated domain. We confirmed the pattern of expression of ISM1 in both human and mouse tissues. ISM1 is expressed by DX5(+) lung lymphocytes that include NK and NKT-like cells, and is also expressed by some CD4(+) T cells upon activation but its expression increases significantly when CD4(+) T cells were polarized to the Th17 lineage in vitro. The presence of IFN-γ during CD4(+) T cell polarization inhibits ISM1 expression. Given that ISM1 has been reported to have anti-angiogenic properties, these observations suggest that ISM1 is a mediator of lymphocyte effector functions and may participate in both innate and acquired immune responses.
Differential protein expression in ovaries of uninfected and Babesia-infected southern cattle ticks, Rhipicephalus (Boophilus) microplus.

PubMed

Rachinsky, Anna; Guerrero, Felix D; Scoles, Glen A

2007-12-01

We used gel electrophoresis and mass spectrometry to investigate differences in protein expression in ovarian tissues from Babesia bovis-infected and uninfected southern cattle tick, Rhipicephalus (Boophilus) microplus. Soluble and membrane proteins were extracted from ovaries of adult female ticks, and analyzed by isoelectric focusing (IEF) and one-dimensional or two-dimensional (2-D) gel electrophoresis. Protein patterns were analyzed for differences in expression between infected and uninfected ticks. 2-D separation of proteins revealed a number of proteins that appeared to be up- or down-regulated in response to infection with Babesia, in particular membrane/membrane-associated proteins and proteins in a low molecular mass range between 6 and 36 kDa. A selection of differentially expressed proteins was subjected to analysis by capillary-HPLC-electrospray tandem mass spectrometry (HPLC-ESI-MS/MS). Among the ovarian proteins that were up-regulated in infected ticks were calreticulin, two myosin subunits, an endoplasmic reticulum protein, a peptidyl-prolyl cis-trans isomerase (PPIase), a cytochrome c oxidase subunit, a glutamine synthetase, and a family of Kunitz-type serine protease inhibitors. Among the down-regulated ovarian proteins were another PPIase, a hemoglobin subunit, and a lysozyme. This study is part of an ongoing effort to establish a proteome database that can be utilized to investigate specific proteins involved in successful pathogen transmission.
High MRPS23 expression contributes to hepatocellular carcinoma proliferation and indicates poor survival outcomes.

PubMed

Pu, Meng; Wang, Jianlin; Huang, Qike; Zhao, Ge; Xia, Congcong; Shang, Runze; Zhang, Zhuochao; Bian, Zhenyuan; Yang, Xishegn; Tao, Kaishan

2017-07-01

Hepatocellular carcinoma is one of the most prevalent neoplasms and the leading cause of cancer-related mortality worldwide. Mitochondrial ribosomal protein S23 is encoded by a nuclear gene and participates in mitochondrial protein translation. Mitochondrial ribosomal protein S23 overexpression has been found in many types of cancer. In this study, we explored mitochondrial ribosomal protein S23 expression in primary hepatocellular carcinoma tissues compared with matched adjacent non-tumoral liver tissues using mitochondrial ribosomal protein S23 messenger RNA and protein levels collected from public databases and clinical samples. Immunohistochemistry was performed to analyze the relationship between mitochondrial ribosomal protein S23 and various clinicopathological features. The results indicated that mitochondrial ribosomal protein S23 was significantly overexpressed in hepatocellular carcinoma. High mitochondrial ribosomal protein S23 expression was correlated with the tumor size and tumor-metastasis-node stage. Moreover, patients with high mitochondrial ribosomal protein S23 expression levels presented poorer survival rates. Mitochondrial ribosomal protein S23 was an independent prognostic factor for survival, especially at the early stage of hepatocellular carcinoma. In addition, the downregulation of mitochondrial ribosomal protein S23 decreased the proliferation of hepatocellular carcinoma in vitro and in vivo. In conclusion, we verified for the first time that mitochondrial ribosomal protein S23 expression was upregulated in hepatocellular carcinoma. High mitochondrial ribosomal protein S23 levels can predict poor clinical outcomes in hepatocellular carcinoma, and this protein plays a key role in tumor proliferation. Therefore, mitochondrial ribosomal protein S23 may be a potential therapeutic target for hepatocellular carcinoma.
MARCKS Regulates Growth, Radiation Sensitivity and is a Novel Prognostic Factor for Glioma

PubMed Central

Jarboe, John S.; Anderson, Joshua C.; Duarte, Christine W.; Mehta, Tapan; Nowsheen, Somaira; Hicks, Patricia H.; Whitley, Alexander C.; Rohrbach, Timothy D.; McCubrey, Raymond O.; Chiu, Sherard; Burleson, Tamara M.; Bonner, James A.; Gillespie, G. Yancey; Yang, Eddy S.; Willey, Christopher D.

2013-01-01

Purpose This study assessed whether Myristoylated Alanine Rich C-Kinase Substrate (MARCKS) can regulate glioblastoma (GBM) growth, radiation sensitivity and clinical outcome. Experimental Design MARCKS protein levels were analyzed in five GBM explant cell lines and eight patient-derived xenograft tumors by immunoblot, and these levels were correlated to proliferation rates and intracranial growth rates, respectively. Manipulation of MARCKS protein levels was assessed by lentiviral-mediated shRNA knockdown in the U251 cell line and MARCKS over-expression in the U87 cell line. The effect of manipulation of MARCKS on proliferation, radiation sensitivity and senescence was assessed. MARCKS gene expression was correlated with survival outcomes in the Repository of Molecular Brain Neoplasia Data (REMBRANDT) Database and The Cancer Genome Atlas (TCGA). Results MARCKS protein expression was inversely correlated with GBM proliferation and intracranial xenograft growth rates. Genetic silencing of MARCKS promoted GBM proliferation and radiation resistance, while MARCKS overexpression greatly reduced GBM growth potential and induced senescence. We found MARCKS gene expression to be directly correlated with survival in both the REMBRANDT and TCGA databases. Specifically, patients with high MARCKS expressing tumors of the Proneural molecular subtype had significantly increased survival rates. This effect was most pronounced in tumors with unmethylated O6-methylguanine DNA methyltransferase (MGMT) promoters, a traditionally poor prognostic factor. Conclusions MARCKS levels impact GBM growth and radiation sensitivity. High MARCKS expressing GBM tumors are associated with improved survival, particularly with unmethylated MGMT promoters. These findings suggest the use of MARCKS as a novel target and biomarker for prognosis in the Proneural subtype of GBM. PMID:22619307
How much do we know about the coupling of G-proteins to serotonin receptors?

PubMed Central

2014-01-01

Serotonin receptors are G-protein-coupled receptors (GPCRs) involved in a variety of psychiatric disorders. G-proteins, heterotrimeric complexes that couple to multiple receptors, are activated when their receptor is bound by the appropriate ligand. Activation triggers a cascade of further signalling events that ultimately result in cell function changes. Each of the several known G-protein types can activate multiple pathways. Interestingly, since several G-proteins can couple to the same serotonin receptor type, receptor activation can result in induction of different pathways. To reach a better understanding of the role, interactions and expression of G-proteins a literature search was performed in order to list all the known heterotrimeric combinations and serotonin receptor complexes. Public databases were analysed to collect transcript and protein expression data relating to G-proteins in neural tissues. Only a very small number of heterotrimeric combinations and G-protein-receptor complexes out of the possible thousands suggested by expression data analysis have been examined experimentally. In addition this has mostly been obtained using insect, hamster, rat and, to a lesser extent, human cell lines. Besides highlighting which interactions have not been explored, our findings suggest additional possible interactions that should be examined based on our expression data analysis. PMID:25011628
How much do we know about the coupling of G-proteins to serotonin receptors?

PubMed

Giulietti, Matteo; Vivenzio, Viviana; Piva, Francesco; Principato, Giovanni; Bellantuono, Cesario; Nardi, Bernardo

2014-07-10

Serotonin receptors are G-protein-coupled receptors (GPCRs) involved in a variety of psychiatric disorders. G-proteins, heterotrimeric complexes that couple to multiple receptors, are activated when their receptor is bound by the appropriate ligand. Activation triggers a cascade of further signalling events that ultimately result in cell function changes. Each of the several known G-protein types can activate multiple pathways. Interestingly, since several G-proteins can couple to the same serotonin receptor type, receptor activation can result in induction of different pathways. To reach a better understanding of the role, interactions and expression of G-proteins a literature search was performed in order to list all the known heterotrimeric combinations and serotonin receptor complexes. Public databases were analysed to collect transcript and protein expression data relating to G-proteins in neural tissues. Only a very small number of heterotrimeric combinations and G-protein-receptor complexes out of the possible thousands suggested by expression data analysis have been examined experimentally. In addition this has mostly been obtained using insect, hamster, rat and, to a lesser extent, human cell lines. Besides highlighting which interactions have not been explored, our findings suggest additional possible interactions that should be examined based on our expression data analysis.
Transcriptome-wide analysis of WRKY transcription factors in wheat and their leaf rust responsive expression profiling.

PubMed

Satapathy, Lopamudra; Singh, Dharmendra; Ranjan, Prashant; Kumar, Dhananjay; Kumar, Manish; Prabhu, Kumble Vinod; Mukhopadhyay, Kunal

2014-12-01

WRKY, a plant-specific transcription factor family, has important roles in pathogen defense, abiotic cues and phytohormone signaling, yet little is known about their roles and molecular mechanism of function in response to rust diseases in wheat. We identified 100 TaWRKY sequences using wheat Expressed Sequence Tag database of which 22 WRKY sequences were novel. Identified proteins were characterized based on their zinc finger motifs and phylogenetic analysis clustered them into six clades consisting of class IIc and class III WRKY proteins. Functional annotation revealed major functions in metabolic and cellular processes in control plants; whereas response to stimuli, signaling and defense in pathogen inoculated plants, their major molecular function being binding to DNA. Tag-based expression analysis of the identified genes revealed differential expression between mock and Puccinia triticina inoculated wheat near isogenic lines. Gene expression was also performed with six rust-related microarray experiments at Gene Expression Omnibus database. TaWRKY10, 15, 17 and 56 were common in both tag-based and microarray-based differential expression analysis and could be representing rust specific WRKY genes. The obtained results will bestow insight into the functional characterization of WRKY transcription factors responsive to leaf rust pathogenesis that can be used as candidate genes in molecular breeding programs to improve biotic stress tolerance in wheat.
Comparative Bacterial Proteomics: Analysis of the Core Genome Concept

PubMed Central

Callister, Stephen J.; McCue, Lee Ann; Turse, Joshua E.; Monroe, Matthew E.; Auberry, Kenneth J.; Smith, Richard D.; Adkins, Joshua N.; Lipton, Mary S.

2008-01-01

While comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry, experimental validation of the existence of this core genome requires extensive measurement and is typically not undertaken. Enabled by an extensive proteome database developed over six years, we have experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. Although genomic studies can establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits. PMID:18253490
ORFer--retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files.

PubMed

Büssow, Konrad; Hoffmann, Steve; Sievert, Volker

2002-12-19

Functional genomics involves the parallel experimentation with large sets of proteins. This requires management of large sets of open reading frames as a prerequisite of the cloning and recombinant expression of these proteins. A Java program was developed for retrieval of protein and nucleic acid sequences and annotations from NCBI GenBank, using the XML sequence format. Annotations retrieved by ORFer include sequence name, organism and also the completeness of the sequence. The program has a graphical user interface, although it can be used in a non-interactive mode. For protein sequences, the program also extracts the open reading frame sequence, if available, and checks its correct translation. ORFer accepts user input in the form of single or lists of GenBank GI identifiers or accession numbers. It can be used to extract complete sets of open reading frames and protein sequences from any kind of GenBank sequence entry, including complete genomes or chromosomes. Sequences are either stored with their features in a relational database or can be exported as text files in Fasta or tabulator delimited format. The ORFer program is freely available at http://www.proteinstrukturfabrik.de/orfer. The ORFer program allows for fast retrieval of DNA sequences, protein sequences and their open reading frames and sequence annotations from GenBank. Furthermore, storage of sequences and features in a relational database is supported. Such a database can supplement a laboratory information system (LIMS) with appropriate sequence information.
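The translation check described in this abstract (extracting an open reading frame and confirming that it translates to the annotated protein) can be illustrated with a minimal Python sketch. This is not ORFer's Java implementation: the function names `translate` and `orf_matches_protein` are hypothetical, and the sketch assumes the standard genetic code only.

```python
from itertools import product

# Standard genetic code with codons enumerated in TCAG order.
# Assumption: alternative translation tables are not handled here.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate a DNA open reading frame codon by codon ('*' marks a stop)."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    # Codons containing ambiguous bases are reported as 'X'.
    return "".join(
        CODON_TABLE.get(orf[i:i + 3], "X") for i in range(0, len(orf), 3)
    )


def orf_matches_protein(orf: str, protein: str) -> bool:
    """Return True if the ORF translates to the given protein sequence.

    A single trailing stop codon in the ORF is tolerated.
    """
    translated = translate(orf)
    if translated.endswith("*"):
        translated = translated[:-1]
    return translated == protein.upper()
```

A call such as `orf_matches_protein("ATGAAATGGTAG", "MKW")` compares the translated ORF with the protein record; a mismatch would flag the entry for manual inspection.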
Verification of Ribosomal Proteins of Aspergillus fumigatus for Use as Biomarkers in MALDI-TOF MS Identification.

PubMed

Nakamura, Sayaka; Sato, Hiroaki; Tanaka, Reiko; Yaguchi, Takashi

2016-01-01

We have previously proposed a rapid identification method for bacterial strains based on the profiles of their ribosomal subunit proteins (RSPs), observed using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). This method can perform phylogenetic characterization based on the mass of housekeeping RSP biomarkers, ideally calculated from amino acid sequence information registered in public protein databases. With the aim of extending its field of application to medical mycology, this study investigates the actual state of information of RSPs of eukaryotic fungi registered in public protein databases through the characterization of ribosomal protein fractions extracted from genome-sequenced Aspergillus fumigatus strains Af293 and A1163 as a model. In this process, we have found that the public protein databases harbor problems. The RSP names are inconsistent, so we have provisionally unified them using the yeast naming system. The most serious problem is that many incorrect sequences are registered in the public protein databases. Surprisingly, more than half of the sequences are incorrect, due chiefly to mis-annotation of exon/intron structures. These errors could be corrected by a combination of in silico inspection by sequence homology analysis and MALDI-TOF MS measurements. We were also able to confirm conserved post-translational modifications in eleven RSPs. After these verifications, the masses of 31 expressed RSPs under 20,000 Da could be accurately confirmed. These RSPs have the potential to be useful biomarkers for identifying clinical isolates of A. fumigatus.
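The idea of calculating a biomarker mass from a registered amino acid sequence and comparing it with a measured value can be sketched in Python as follows. This is an illustrative sketch, not the authors' procedure: the function names and the default tolerance are hypothetical, the residue masses are standard average values and should be treated as approximate, and post-translational modifications are deliberately ignored.

```python
# Average residue masses in Da (standard textbook values, approximate).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added for the free N- and C-termini


def average_mass(sequence: str) -> float:
    """Average mass (Da) of an unmodified protein sequence."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


def mass_matches(sequence: str, observed_mass: float,
                 tolerance_da: float = 2.0) -> bool:
    """Return True if the calculated and observed masses agree.

    tolerance_da is a hypothetical default, not a value from the study.
    """
    return abs(average_mass(sequence) - observed_mass) <= tolerance_da


def below_cutoff(sequence: str, cutoff_da: float = 20000.0) -> bool:
    """Select proteins under the 20,000 Da range mentioned in the abstract."""
    return average_mass(sequence) < cutoff_da
```

A registered sequence whose calculated mass fails `mass_matches` against the measured peak is a candidate for the kind of annotation error the abstract describes, although a modification could produce the same discrepancy.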

Plant Genome Resources at the National Center for Biotechnology Information

PubMed Central

Wheeler, David L.; Smith-White, Brian; Chetvernin, Vyacheslav; Resenchuk, Sergei; Dombrowski, Susan M.; Pechous, Steven W.; Tatusova, Tatiana; Ostell, James

2005-01-01

The National Center for Biotechnology Information (NCBI) integrates data from more than 20 biological databases through a flexible search and retrieval system called Entrez. A core Entrez database, Entrez Nucleotide, includes GenBank and is tightly linked to the NCBI Taxonomy database, the Entrez Protein database, and the scientific literature in PubMed. A suite of more specialized databases for genomes, genes, gene families, gene expression, gene variation, and protein domains dovetails with the core databases to make Entrez a powerful system for genomic research. Linked to the full range of Entrez databases is the NCBI Map Viewer, which displays aligned genetic, physical, and sequence maps for eukaryotic genomes including those of many plants. A specialized plant query page allows maps from all plant genomes covered by the Map Viewer to be searched in tandem to produce a display of aligned maps from several species. PlantBLAST searches against the sequences shown in the Map Viewer allow BLAST alignments to be viewed within a genomic context. In addition, precomputed sequence similarities, such as those for proteins offered by BLAST Link, enable fluid navigation from unannotated to annotated sequences, quickening the pace of discovery. NCBI Web pages for plants, such as Plant Genome Central, complete the system by providing centralized access to NCBI's genomic resources as well as links to organism-specific Web pages beyond NCBI. PMID:16010002
RNA-Seq and molecular docking reveal multi-level pesticide resistance in the bed bug

PubMed Central

2012-01-01

Background Bed bugs (Cimex lectularius) are hematophagous nocturnal parasites of humans that have attained high impact status due to their worldwide resurgence. The sudden and rampant resurgence of C. lectularius has been attributed to numerous factors including frequent international travel, narrower pest management practices, and insecticide resistance. Results We performed a next-generation RNA sequencing (RNA-Seq) experiment to find differentially expressed genes between pesticide-resistant (PR) and pesticide-susceptible (PS) strains of C. lectularius. A reference transcriptome database of 51,492 expressed sequence tags (ESTs) was created by combining the databases derived from de novo assembled mRNA-Seq tags (30,404 ESTs) and our previous 454 pyrosequenced database (21,088 ESTs). The two-way GLMseq analysis revealed ~15,000 highly significant differentially expressed ESTs between the PR and PS strains. Among the top 5,000 differentially expressed ESTs, 109 putative defense genes (cuticular proteins, cytochrome P450s, antioxidant genes, ABC transporters, glutathione S-transferases, carboxylesterases and acetyl cholinesterase) involved in penetration resistance and metabolic resistance were identified. Tissue and development-specific expression of P450 CYP3 clan members showed high mRNA levels in the cuticle, Malpighian tubules, and midgut; and in early instar nymphs, respectively. Lastly, molecular modeling and docking of a candidate cytochrome P450 (CYP397A1V2) revealed the flexibility of the deduced protein to metabolize a broad range of insecticide substrates including DDT, deltamethrin, permethrin, and imidacloprid. Conclusions We developed significant molecular resources for C. lectularius putatively involved in metabolic resistance as well as those participating in other modes of insecticide resistance. RNA-Seq profiles of PR strains combined with tissue-specific profiles and molecular docking revealed multi-level insecticide resistance in C. lectularius. Future research that is targeted towards RNA interference (RNAi) on the identified metabolic targets such as cytochrome P450s and cuticular proteins could lay the foundation for a better understanding of the genetic basis of insecticide resistance in C. lectularius. PMID:22226239
EST-PAC a web package for EST annotation and protein sequence prediction

PubMed Central

Strahm, Yvan; Powell, David; Lefèvre, Christophe

2006-01-01

With the decreasing cost of DNA sequencing technology and the vast diversity of biological resources, researchers increasingly face the basic challenge of annotating a larger number of expressed sequences tags (EST) from a variety of species. This typically consists of a series of repetitive tasks, which should be automated and easy to use. The results of these annotation tasks need to be stored and organized in a consistent way. All these operations should be self-installing, platform independent, easy to customize and amenable to using distributed bioinformatics resources available on the Internet. In order to address these issues, we present EST-PAC a web oriented multi-platform software package for expressed sequences tag (EST) annotation. EST-PAC provides a solution for the administration of EST and protein sequence annotations accessible through a web interface. Three aspects of EST annotation are automated: 1) searching local or remote biological databases for sequence similarities using Blast services, 2) predicting protein coding sequence from EST data and, 3) annotating predicted protein sequences with functional domain predictions. In practice, EST-PAC integrates the BLASTALL suite, EST-Scan2 and HMMER in a relational database system accessible through a simple web interface. EST-PAC also takes advantage of the relational database to allow consistent storage, powerful queries of results and, management of the annotation process. The system allows users to customize annotation strategies and provides an open-source data-management environment for research and education in bioinformatics. PMID:17147782
Large-scale mass spectrometric detection of variant peptides resulting from non-synonymous nucleotide differences

PubMed Central

Sheynkman, Gloria M.; Shortreed, Michael R.; Frey, Brian L.; Scalf, Mark; Smith, Lloyd M.

2013-01-01

Each individual carries thousands of non-synonymous single nucleotide variants (nsSNVs) in their genome, each corresponding to a single amino acid polymorphism (SAP) in the encoded proteins. It is important to be able to directly detect and quantify these variations at the protein level in order to study post-transcriptional regulation, differential allelic expression, and other important biological processes. However, such variant peptides are not generally detected in standard proteomic analyses, due to their absence from the generic databases that are employed for mass spectrometry searching. Here, we extend previous work that demonstrated the use of customized SAP databases constructed from sample-matched RNA-Seq data. We collected deep coverage RNA-Seq data from the Jurkat cell line, compiled the set of nsSNVs that are expressed, used this information to construct a customized SAP database, and searched it against deep coverage shotgun MS data obtained from the same sample. This approach enabled detection of 421 SAP peptides mapping to 395 nsSNVs. We compared these peptides to peptides identified from a large generic search database containing all known nsSNVs (dbSNP) and found that more than 70% of the SAP peptides from this dbSNP-derived search were not supported by the RNA-Seq data, and thus are likely false positives. Next, we increased the SAP coverage from the RNA-Seq derived database by utilizing multiple protease digestions, thereby increasing variant detection to 695 SAP peptides mapping to 504 nsSNV sites. These detected SAP peptides corresponded to moderate to high abundance transcripts (30+ transcripts per million, TPM). The SAP peptides included 192 allelic pairs; the relative expression levels of the two alleles were evaluated for 51 of those pairs, and found to be comparable in all cases. PMID:24175627
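The step of turning an expressed nsSNV into a searchable variant peptide can be sketched in Python. This is a simplified illustration rather than the authors' pipeline: the function names are hypothetical, the digestion models trypsin only (cleavage after K or R, not before P) with no missed cleavages, and the example protein is invented.

```python
def tryptic_peptides(protein: str) -> list[str]:
    """In silico trypsin digest: cleave after K or R unless followed by P."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide
    return peptides


def apply_sap(protein: str, position: int, ref: str, alt: str) -> str:
    """Apply a single amino acid polymorphism at a 1-based position."""
    if protein[position - 1] != ref:
        raise ValueError("reference residue does not match the protein")
    return protein[:position - 1] + alt + protein[position:]


def sap_peptides(protein: str, position: int, ref: str, alt: str) -> list[str]:
    """Peptides present in the variant digest but absent from the reference."""
    reference = set(tryptic_peptides(protein))
    variant = tryptic_peptides(apply_sap(protein, position, ref, alt))
    return [p for p in variant if p not in reference]


# Invented example: a Y-to-C substitution at position 5.
print(sap_peptides("MKTAYIAKQR", 5, "Y", "C"))  # prints ['TACIAK']
```

Restricting the input to variants supported by sample-matched RNA-Seq reads, as the abstract describes, keeps the search database small, which is the stated reason for fewer likely false positives than with a database of all known nsSNVs.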
Microarray analysis to identify the similarities and differences of pathogenesis between aortic occlusive disease and abdominal aortic aneurysm.

PubMed

Wang, Guofu; Bi, Lechang; Wang, Gaofeng; Huang, Feilai; Lu, Mingjing; Zhu, Kai

2018-06-01

Objectives The expression profile of GSE57691 was analyzed to identify the similarities and differences between aortic occlusive disease and abdominal aortic aneurysm. Methods The expression profile of GSE57691 was downloaded from the Gene Expression Omnibus database, including 20 small abdominal aortic aneurysm samples, 29 large abdominal aortic aneurysm samples, 9 aortic occlusive disease samples, and 10 control samples. Using the limma package in R, the differentially expressed genes were screened. Enrichment analysis was then performed for the differentially expressed genes using the Database for Annotation, Visualization, and Integrated Discovery online tool. Using the STRING online tool and Cytoscape software, protein-protein interaction network and module analyses were carried out. Moreover, the integrated TF platform database and Cytoscape software were used for constructing transcriptional regulatory networks. Results A total of 1757, 354, and 396 differentially expressed genes were identified in aortic occlusive disease, large abdominal aortic aneurysm, and small abdominal aortic aneurysm samples, respectively. UBB was significantly enriched in proteolysis-related pathways with a high degree in all three groups. SPARCL1 was another gene shared by these groups and regulated by NFIA, which had a high degree in the transcriptional regulatory network. ACTB, a significantly upregulated gene in abdominal aortic aneurysm samples, could be regulated by CLIC4, which was significantly enriched in cell motions. ACLY and NFIB were identified in aortic occlusive disease and small abdominal aortic aneurysm samples, respectively, and were enriched in lipid metabolism and negative regulation of cell proliferation, respectively. Conclusions The downregulated UBB, NFIA, and SPARCL1 might play key roles in both aortic occlusive disease and abdominal aortic aneurysm, while the upregulated ACTB might be involved only in abdominal aortic aneurysm. ACLY and NFIB were specifically involved in aortic occlusive disease and small abdominal aortic aneurysm, respectively.
Database resources of the National Center for Biotechnology Information.

PubMed

Sayers, Eric W; Barrett, Tanya; Benson, Dennis A; Bolton, Evan; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; Dicuccio, Michael; Federhen, Scott; Feolo, Michael; Fingerman, Ian M; Geer, Lewis Y; Helmberg, Wolfgang; Kapustin, Yuri; Krasnov, Sergey; Landsman, David; Lipman, David J; Lu, Zhiyong; Madden, Thomas L; Madej, Tom; Maglott, Donna R; Marchler-Bauer, Aron; Miller, Vadim; Karsch-Mizrachi, Ilene; Ostell, James; Panchenko, Anna; Phan, Lon; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Stephen T; Shumway, Martin; Sirotkin, Karl; Slotta, Douglas; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A; Wagner, Lukas; Wang, Yanli; Wilbur, W John; Yaschenko, Eugene; Ye, Jian

2012-01-01

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Website. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Probe, Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
Database resources of the National Center for Biotechnology Information

PubMed Central

Acland, Abigail; Agarwala, Richa; Barrett, Tanya; Beck, Jeff; Benson, Dennis A.; Bollin, Colleen; Bolton, Evan; Bryant, Stephen H.; Canese, Kathi; Church, Deanna M.; Clark, Karen; DiCuccio, Michael; Dondoshansky, Ilya; Federhen, Scott; Feolo, Michael; Geer, Lewis Y.; Gorelenkov, Viatcheslav; Hoeppner, Marilu; Johnson, Mark; Kelly, Christopher; Khotomlianski, Viatcheslav; Kimchi, Avi; Kimelman, Michael; Kitts, Paul; Krasnov, Sergey; Kuznetsov, Anatoliy; Landsman, David; Lipman, David J.; Lu, Zhiyong; Madden, Thomas L.; Madej, Tom; Maglott, Donna R.; Marchler-Bauer, Aron; Karsch-Mizrachi, Ilene; Murphy, Terence; Ostell, James; O'Sullivan, Christopher; Panchenko, Anna; Phan, Lon; Preuss, Don; Pruitt, Kim D.; Rubinstein, Wendy; Sayers, Eric W.; Schneider, Valerie; Schuler, Gregory D.; Sequeira, Edwin; Sherry, Stephen T.; Shumway, Martin; Sirotkin, Karl; Siyan, Karanjit; Slotta, Douglas; Soboleva, Alexandra; Soussov, Vladimir; Starchenko, Grigory; Tatusova, Tatiana A.; Trawick, Bart W.; Vakatov, Denis; Wang, Yanli; Ward, Minghong; Wilbur, W. John; Yaschenko, Eugene; Zbicz, Kerry

2014-01-01

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, PubReader, Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link, Primer-BLAST, COBALT, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, the Genetic Testing Registry, Genome and related tools, the Map Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, ClinVar, MedGen, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Probe, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool, Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All these resources can be accessed through the NCBI home page. PMID:24259429
Sequencing and Characterization of the Invasive Sycamore Lace Bug Corythucha ciliata (Hemiptera: Tingidae) Transcriptome

PubMed Central

Qu, Cheng; Fu, Ningning; Xu, Yihua

2016-01-01

The sycamore lace bug, Corythucha ciliata (Hemiptera: Tingidae), is an invasive forestry pest rapidly expanding in many countries. This pest poses a considerable threat to the urban forestry ecosystem, especially to Platanus spp. However, its molecular biology and biochemistry are poorly understood. This study reports the first C. ciliata transcriptome, encompassing three different life stages (nymphs, adult females (AF) and adult males (AM)). In total, 26.53 GB of clean data and 60,879 unigenes were obtained from three RNA-seq libraries. These unigenes were annotated and classified by Nr (NCBI non-redundant protein sequences), Nt (NCBI non-redundant nucleotide sequences), Pfam (Protein family), KOG/COG (Clusters of Orthologous Groups of proteins), Swiss-Prot (a manually annotated and reviewed protein sequence database), and KO (KEGG Ortholog database). Pairwise comparisons among the three samples revealed a large number of differentially expressed genes. Dramatic differences in global gene expression profiles were found between distinct life stages (nymphs and AF, nymphs and AM) and between sexes (AF and AM), with some of the significantly differentially expressed genes (DEGs) being related to metamorphosis, digestion, immunity and sex differences. Differential expression was validated through quantitative Real-Time PCR (qRT-PCR) for 16 randomly selected unigenes. In addition, 17,462 potential simple sequence repeat molecular markers were identified in these transcriptome resources. This comprehensive C. ciliata transcriptomic information can be used to promote the development of environmentally friendly methodologies to disrupt the processes of metamorphosis, digestion, immunity and sex differences. PMID:27494615
Comprehensive data resources and analytical tools for pathological association of aminoacyl tRNA synthetases with cancer

PubMed Central

Lee, Ji-Hyun; You, Sungyong; Hyeon, Do Young; Kang, Byeongsoo; Kim, Hyerim; Park, Kyoung Mii; Han, Byungwoo; Hwang, Daehee; Kim, Sunghoon

2015-01-01

Mammalian cells have cytoplasmic and mitochondrial aminoacyl-tRNA synthetases (ARSs) that catalyze aminoacylation of tRNAs during protein synthesis. Despite their housekeeping functions in protein synthesis, ARSs and ARS-interacting multifunctional proteins (AIMPs) have recently been shown to play important roles in disease pathogenesis through their interactions with disease-related molecules. However, there is a lack of data resources and analytical tools that can be used to examine disease associations of ARS/AIMPs. Here, we developed an Integrated Database for ARSs (IDA), a resource database including cancer genomic/proteomic and interaction data of ARS/AIMPs. IDA includes mRNA expression, somatic mutation, copy number variation and phosphorylation data of ARS/AIMPs and their interacting proteins in various cancers. IDA further includes an array of analytical tools for exploration of disease association of ARS/AIMPs, identification of disease-associated ARS/AIMP interactors and reconstruction of ARS-dependent disease-perturbed network models. Therefore, IDA provides both comprehensive data resources and analytical tools for understanding potential roles of ARS/AIMPs in cancers. Database URL: http://ida.biocon.re.kr/, http://ars.biocon.re.kr/ PMID:25824651
Cloning, expression, purification, and activity assay of proteins related to D-lactic acid formation in Lactobacillus rhamnosus.

PubMed

Wang, Xiuwen; Zheng, Zhaojuan; Dou, Peipei; Qin, Jiayang; Wang, Xiaochen; Ma, Cuiqing; Tang, Hongzhi; Xu, Ping

2010-08-01

Two proteins that might be responsible for D-lactic acid (D-LA) formation were screened from the genome database of Lactobacillus rhamnosus GG. The coding genes of the two proteins in L. rhamnosus CASL, ldhD1 and ldhD2, were cloned and expressed in Escherichia coli Rosetta with an inducible expression vector pETDuet-1 (Novagen, Darmstadt, Germany), respectively. The two purified proteins, LdhD-1 and LdhD-2, migrated as a single protein band separately, both corresponding to an apparent molecular mass between 35 kDa and 45 kDa on sodium dodecyl sulfate polyacrylamide gel electrophoresis. The specific activities of LdhD-1 and LdhD-2 catalyzing pyruvate to LA were 0.02 U/mg and 0.21 U/mg, respectively. The configuration of LA converted from pyruvate was determined using high-performance liquid chromatography equipped with a chiral column. Only D-LA was detected when LdhD-1 and LdhD-2 were tested. In summary, the two proteins cloned and expressed in this study were most probably responsible for D-LA formation during fermentation of L. rhamnosus CASL.
Deciphering the complexities of the wheat flour proteome using quantitative two-dimensional electrophoresis, three proteases and tandem mass spectrometry

PubMed Central

2011-01-01

Background Wheat flour is one of the world's major food ingredients, in part because of the unique end-use qualities conferred by the abundant glutamine- and proline-rich gluten proteins. Many wheat flour proteins also present dietary problems for consumers with celiac disease or wheat allergies. Despite the importance of these proteins it has been particularly challenging to use MS/MS to distinguish the many proteins in a flour sample and relate them to gene sequences. Results Grain from the extensively characterized spring wheat cultivar Triticum aestivum 'Butte 86' was milled to white flour from which proteins were extracted, then separated and quantified by 2-DE. Protein spots were identified by separate digestions with three proteases, followed by tandem mass spectrometry analysis of the peptides. The spectra were used to interrogate an improved protein sequence database and results were integrated using the Scaffold program. Inclusion of cultivar specific sequences in the database greatly improved the results, and 233 spots were identified, accounting for 93.1% of normalized spot volume. Identified proteins were assigned to 157 wheat sequences, many for proteins unique to wheat and nearly 40% from Butte 86. Alpha-gliadins accounted for 20.4% of flour protein, low molecular weight glutenin subunits 18.0%, high molecular weight glutenin subunits 17.1%, gamma-gliadins 12.2%, omega-gliadins 10.5%, amylase/protease inhibitors 4.1%, triticins 1.6%, serpins 1.6%, purinins 0.9%, farinins 0.8%, beta-amylase 0.5%, globulins 0.4%, other enzymes and factors 1.9%, and all other 3%. Conclusions This is the first successful effort to identify the majority of abundant flour proteins for a single wheat cultivar, relate them to individual gene sequences and estimate their relative levels. Many genes for wheat flour proteins are not expressed, so this study represents further progress in describing the expressed wheat genome. Use of cultivar-specific contigs helped to overcome the difficulties of matching peptides to gene sequences for members of highly similar, rapidly evolving storage protein families. Prospects for simplifying this process for routine analyses are discussed. The ability to measure expression levels for individual flour protein genes complements information gained from efforts to sequence the wheat genome and is essential for studies of effects of environment on gene expression. PMID:21314956
Bioinformatic analysis for allergenicity assessment of Bacillus thuringiensis Cry proteins expressed in insect-resistant food crops.

PubMed

Randhawa, Gurinder Jit; Singh, Monika; Grover, Monendra

2011-02-01

The novel proteins introduced into the genetically modified (GM) crops need to be evaluated for the potential allergenicity before their introduction into the food chain to address the safety concerns of consumers. At present, there is no single definitive test that can be relied upon to predict allergic response in humans to a new protein; hence a composite approach to allergic response prediction is described in this study. The present study reports on the evaluation of the Cry proteins, encoded by cry1Ac, cry1Ab, cry2Ab, cry1Ca, cry1Fa/cry1Ca hybrid, being expressed in Bt food crops that are under field trials in India, for potential allergenic cross-reactivity using bioinformatics search tools. The sequence identity of amino acids was analyzed using FASTA3 of AllergenOnline version 10.0 and BLASTX of NCBI Entrez to identify any potential sequence matches to allergen proteins. As a step further in the detection of allergens, an independent database of domains in the allergens available in the AllergenOnline database was also developed. The results indicated no significant alignment and similarity of Cry proteins at domain level with any of the known allergens revealing that there is no potential risk of allergenic cross-reactivity. Copyright © 2010 Elsevier Ltd. All rights reserved.
Proteomic analysis of the Theileria annulata schizont

PubMed Central

Witschi, M.; Xia, D.; Sanderson, S.; Baumgartner, M.; Wastling, J.M.; Dobbelaere, D.A.E.

2013-01-01

The apicomplexan parasite, Theileria annulata, is the causative agent of tropical theileriosis, a devastating lymphoproliferative disease of cattle. The schizont stage transforms bovine leukocytes and provides an intriguing model to study host/pathogen interactions. The genome of T. annulata has been sequenced and transcriptomic data are rapidly accumulating. In contrast, little is known about the proteome of the schizont, the pathogenic, transforming life cycle stage of the parasite. Using one-dimensional (1-D) gel LC-MS/MS, a proteomic analysis of purified T. annulata schizonts was carried out. In whole parasite lysates, 645 proteins were identified. Proteins with transmembrane domains (TMDs) were under-represented and no proteins with more than four TMDs could be detected. To tackle this problem, Triton X-114 treatment was applied, which facilitates the extraction of membrane proteins, followed by 1-D gel LC-MS/MS. This resulted in the identification of an additional 153 proteins. Half of those had one or more TMD and 30 proteins with more than four TMDs were identified. This demonstrates that Triton X-114 treatment can provide a valuable additional tool for the identification of new membrane proteins in proteomic studies. With two exceptions, all proteins involved in glycolysis and the citric acid cycle were identified. For at least 29% of identified proteins, the corresponding transcripts were not present in the existing expressed sequence tag databases. The proteomics data were integrated into the publicly accessible database resource at EuPathDB (www.eupathdb.org) so that mass spectrometry-based protein expression evidence for T. annulata can be queried alongside transcriptional and other genomics data available for these parasites. PMID:23178997
The Zebrafish Model Organism Database: new support for human disease models, mutation details, gene expression phenotypes and searching

PubMed Central

Howe, Douglas G.; Bradford, Yvonne M.; Eagle, Anne; Fashena, David; Frazer, Ken; Kalita, Patrick; Mani, Prita; Martin, Ryan; Moxon, Sierra Taylor; Paddock, Holly; Pich, Christian; Ramachandran, Sridhar; Ruzicka, Leyla; Schaper, Kevin; Shao, Xiang; Singer, Amy; Toro, Sabrina; Van Slyke, Ceri; Westerfield, Monte

2017-01-01

The Zebrafish Model Organism Database (ZFIN; http://zfin.org) is the central resource for zebrafish (Danio rerio) genetic, genomic, phenotypic and developmental data. ZFIN curators provide expert manual curation and integration of comprehensive data involving zebrafish genes, mutants, transgenic constructs and lines, phenotypes, genotypes, gene expressions, morpholinos, TALENs, CRISPRs, antibodies, anatomical structures, models of human disease and publications. We integrate curated, directly submitted, and collaboratively generated data, making these available to the zebrafish research community. Among the vertebrate model organisms, zebrafish are superbly suited for rapid generation of sequence-targeted mutant lines, characterization of phenotypes including gene expression patterns, and generation of human disease models. The recent rapid adoption of zebrafish as human disease models is making management of these data particularly important to both the research and clinical communities. Here, we describe recent enhancements to ZFIN including use of the zebrafish experimental conditions ontology, ‘Fish’ records in the ZFIN database, support for gene expression phenotypes, models of human disease, mutation details at the DNA, RNA and protein levels, and updates to the ZFIN single box search. PMID:27899582
De novo sequencing and analysis of the transcriptome during the browning of fresh-cut Luffa cylindrica 'Fusi-3' fruits.

PubMed

Zhu, Haisheng; Liu, Jianting; Wen, Qingfang; Chen, Mindong; Wang, Bin; Zhang, Qianrong; Xue, Zhuzheng

2017-01-01

Fresh-cut luffa (Luffa cylindrica) fruits commonly undergo browning. However, little is known about the molecular mechanisms regulating this process. We used the RNA-seq technique to analyze the transcriptomic changes occurring during the browning of fresh-cut fruits from luffa cultivar 'Fusi-3'. Over 90 million high-quality reads were assembled into 58,073 Unigenes, and 60.86% of these were annotated based on sequences in four public databases. We detected 35,282 Unigenes with significant hits to sequences in the NCBInr database, and 24,427 Unigenes encoded proteins with sequences that were similar to those of known proteins in the Swiss-Prot database. Additionally, 20,546 and 13,021 Unigenes were similar to existing sequences in the Eukaryotic Orthologous Groups of proteins and Kyoto Encyclopedia of Genes and Genomes databases, respectively. Furthermore, 27,301 Unigenes were differentially expressed during the browning of fresh-cut luffa fruits (i.e., after 1-6 h). Moreover, 11 genes from five gene families (i.e., PPO, PAL, POD, CAT, and SOD) identified as potentially associated with enzymatic browning as well as four WRKY transcription factors were observed to be differentially regulated in fresh-cut luffa fruits. With the assistance of rapid amplification of cDNA ends technology, we obtained the full-length sequences of the 15 Unigenes. We also confirmed these Unigenes were expressed by quantitative real-time polymerase chain reaction analysis. This study provides a comprehensive transcriptome sequence resource, and may facilitate further studies aimed at identifying genes affecting luffa fruit browning for the exploitation of the underlying mechanism.
Integration of transcriptomic and proteomic data from a single wheat cultivar provides new tools for understanding the roles of individual alpha gliadin proteins in flour quality and celiac disease

USDA-ARS's Scientific Manuscript database

One-hundred-thirty-six expressed sequence tags (ESTs) encoding alpha gliadins from Triticum aestivum cv Butte 86 were identified in public databases and assembled into 19 contigs. Consensus sequences for 12 of the contigs encoded complete alpha gliadin proteins, but only two were identical to protei...
Global differential expression of genes located in the Down Syndrome Critical Region in normal human brain

PubMed Central

Montoya, Julio Cesar; Fajardo, Dianora; Peña, Angela; Sánchez, Adalberto; Domínguez, Martha C; Satizábal, José María

2014-01-01

Background: Gene expression information obtained from databases has made possible the extraction and analysis of data related to several molecular processes involved not only in brain homeostasis but also in its disruption in some neuropathologies, principally Down syndrome and Alzheimer disease. Objective: To correlate the levels of transcription of 19 genes located in the Down Syndrome Critical Region (DSCR) with their expression in several substructures of normal human brain. Methods: Expression profiles of 19 DSCR genes in 42 brain substructures were obtained from gene expression values available in the human brain database of the Brain Atlas of the Allen Institute for Brain Sciences (http://human.brain-map.org/). The co-expression patterns of DSCR genes in brain were calculated by using multivariate statistical methods. Results: The highest levels of gene expression were registered in the caudate nucleus, nucleus accumbens and putamen among central areas of cerebral cortex. Increased expression levels were recorded for RCAN1, which encodes a protein involved in signal transduction processes of the CNS; for PCP4, which participates in binding to calmodulin; and for TTC3, a protein that is associated with differentiation of neurons. These brain structures play a crucial role in the learning process, in different classes of memory and in motor skills. Conclusion: The precise regulation of DSCR gene expression is crucial to maintain brain homeostasis, especially in those areas with high levels of gene expression associated with a remarkable process of learning and cognition. PMID:25767303
Proteome analysis of the fungus Aspergillus carbonarius under ochratoxin A producing conditions.

PubMed

Crespo-Sempere, A; Gil, J V; Martínez-Culebras, P V

2011-06-30

Aspergillus carbonarius is an important ochratoxin A producing fungus that is responsible for mycotoxin contamination of grapes and wine. In this study, the proteomes of highly (W04-40) and weakly (W04-46) OTA-producing A. carbonarius strains were compared to identify proteins that may be involved in OTA biosynthesis. Protein samples were extracted from two biological replicates and subjected to two dimensional gel electrophoresis analysis and mass spectrometry. Expression profile comparison (PDQuest software), revealed 21 differential spots that were statistically significant and showed a two-fold change in expression, or greater. Among these, nine protein spots were identified by MALDI-MS/MS and MASCOT database and twelve remain unidentified. Of the identified proteins, seven showed a higher expression in strain W04-40 (high OTA producer) and two in strain W04-46 (low OTA producer). Some of the identified amino acid sequences shared homology with proteins involved in regulation, amino acid metabolism, oxidative stress and sporulation. It is worth noting the presence of a protein with 126.5 fold higher abundance in strain W04-40 showing homology with protein CipC, a protein with unknown function related with pathogenesis and mycotoxin production by some authors. Variations in protein expression were also further investigated at the mRNA level by real-time PCR analysis. The mRNA expression levels from three identified proteins including CipC showed correlation with protein expression levels. This study represents the first proteomic analysis for a comparison of two A. carbonarius strains with different OTA production and will contribute to a better understanding of the molecular events involved in OTA biosynthesis. Copyright © 2011 Elsevier B.V. All rights reserved.
iTRAQ Quantitative Proteomic Comparison of Metastatic and Non-Metastatic Uveal Melanoma Tumors

PubMed Central

Crabb, John W.; Hu, Bo; Crabb, John S.; Triozzi, Pierre; Saunthararajah, Yogen; Singh, Arun D.

2015-01-01

Background Uveal melanoma is the most common malignancy of the adult eye. The overall mortality rate is high because this aggressive cancer often metastasizes before ophthalmic diagnosis. Quantitative proteomic analysis of primary metastasizing and non-metastasizing tumors was pursued for insights into mechanisms and biomarkers of uveal melanoma metastasis. Methods Eight metastatic and 7 non-metastatic human primary uveal melanoma tumors were analyzed by LC MS/MS iTRAQ technology with Bruch’s membrane/choroid complex from normal postmortem eyes as control tissue. Tryptic peptides from tumor and control proteins were labeled with iTRAQ tags, fractionated by cation exchange chromatography, and analyzed by LC MS/MS. Protein identification utilized the Mascot search engine and the human Uni-Prot/Swiss-Protein database with false discovery ≤ 1%; protein quantitation utilized the Mascot weighted average method. Proteins designated differentially expressed exhibited quantitative differences (p ≤ 0.05, t-test) in a training set of five metastatic and five non-metastatic tumors. Logistic regression models developed from the training set were used to classify the metastatic status of five independent tumors. Results Of 1644 proteins identified and quantified in 5 metastatic and 5 non-metastatic tumors, 12 proteins were found uniquely in ≥ 3 metastatic tumors, 28 were found significantly elevated and 30 significantly decreased only in metastatic tumors, and 31 were designated differentially expressed between metastatic and non-metastatic tumors. Logistic regression modeling of differentially expressed collagen alpha-3(VI) and heat shock protein beta-1 allowed correct prediction of metastasis status for each of five independent tumor specimens. Conclusions The present data provide new clues to molecular differences in metastatic and non-metastatic uveal melanoma tumors. 
While sample size is limited and validation required, the results support collagen alpha-3(VI) and heat shock protein beta-1 as candidate biomarkers of uveal melanoma metastasis and establish a quantitative proteomic database for uveal melanoma primary tumors. PMID:26305875
Expression of the Long Intergenic Non-Protein Coding RNA 665 (LINC00665) Gene and the Cell Cycle in Hepatocellular Carcinoma Using The Cancer Genome Atlas, the Gene Expression Omnibus, and Quantitative Real-Time Polymerase Chain Reaction.

PubMed

Wen, Dong-Yue; Lin, Peng; Pang, Yu-Yan; Chen, Gang; He, Yun; Dang, Yi-Wu; Yang, Hong

2018-05-05

BACKGROUND Long non-coding RNAs (lncRNAs) have a role in physiological and pathological processes, including cancer. The aim of this study was to investigate the expression of the long intergenic non-protein coding RNA 665 (LINC00665) gene and the cell cycle in hepatocellular carcinoma (HCC) using database analysis including The Cancer Genome Atlas (TCGA), the Gene Expression Omnibus (GEO), and quantitative real-time polymerase chain reaction (qPCR). MATERIAL AND METHODS Expression levels of LINC00665 were compared between human tissue samples of HCC and adjacent normal liver, clinicopathological correlations were made using TCGA and the GEO, and qPCR was performed to validate the findings. Other public databases were searched for other genes associated with LINC00665 expression, including The Atlas of Noncoding RNAs in Cancer (TANRIC), the Multi Experiment Matrix (MEM), Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) and protein-protein interaction (PPI) networks. RESULTS Overexpression of LINC00665 in patients with HCC was significantly associated with gender, tumor grade, stage, and tumor cell type. Overexpression of LINC00665 in patients with HCC was significantly associated with overall survival (OS) (HR=1.477; 95% CI: 1.046-2.086). Bioinformatics analysis identified 469 related genes and further analysis supported a hypothesis that LINC00665 regulates pathways in the cell cycle to facilitate the development and progression of HCC through ten identified core genes: CDK1, BUB1B, BUB1, PLK1, CCNB2, CCNB1, CDC20, ESPL1, MAD2L1, and CCNA2. CONCLUSIONS Overexpression of the lncRNA LINC00665 may be involved in the regulation of cell cycle pathways in HCC through ten identified hub genes.

Verification of Ribosomal Proteins of Aspergillus fumigatus for Use as Biomarkers in MALDI-TOF MS Identification

PubMed Central

Nakamura, Sayaka; Sato, Hiroaki; Tanaka, Reiko; Yaguchi, Takashi

2016-01-01

We have previously proposed a rapid identification method for bacterial strains based on the profiles of their ribosomal subunit proteins (RSPs), observed using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). This method can perform phylogenetic characterization based on the mass of housekeeping RSP biomarkers, ideally calculated from amino acid sequence information registered in public protein databases. With the aim of extending its field of application to medical mycology, this study investigates the actual state of information of RSPs of eukaryotic fungi registered in public protein databases through the characterization of ribosomal protein fractions extracted from genome-sequenced Aspergillus fumigatus strains Af293 and A1163 as a model. In this process, we have found that the public protein databases harbor problems. The RSP names are in confusion, so we have provisionally unified them using the yeast naming system. The most serious problem is that many incorrect sequences are registered in the public protein databases. Surprisingly, more than half of the sequences are incorrect, due chiefly to mis-annotation of exon/intron structures. These errors could be corrected by a combination of in silico inspection by sequence homology analysis and MALDI-TOF MS measurements. We were also able to confirm conserved post-translational modifications in eleven RSPs. After these verifications, the masses of 31 expressed RSPs under 20,000 Da could be accurately confirmed. These RSPs have a potential to be useful biomarkers for identifying clinical isolates of A. fumigatus. PMID:27843740
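The abstract above relies on calculating the expected mass of each ribosomal protein from its database sequence and comparing it with the MALDI-TOF MS measurement. A minimal Python sketch of that calculation follows, assuming standard average residue masses and an unmodified chain. The function name `average_mass` is hypothetical, and the sketch deliberately ignores N-terminal methionine loss and the post-translational modifications that the authors report having to account for.

```python
# Average residue masses in Da (amino acid minus one water), standard values.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the free N- and C-termini


def average_mass(sequence: str) -> float:
    """Average mass (Da) of an unmodified, linear polypeptide (hypothetical helper)."""
    seq = sequence.strip().upper()
    unknown = set(seq) - RESIDUE_MASS.keys()
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


# Glycyl-alanine dipeptide as a small sanity check.
print(round(average_mass("GA"), 2))  # 146.15
```

A mismatch between such a calculated mass and the observed peak is the kind of signal that, per the abstract, pointed to mis-annotated exon/intron structures in the database entries.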
Web server to identify similarity of amino acid motifs to compounds (SAAMCO).

PubMed

Casey, Fergal P; Davey, Norman E; Baran, Ivan; Varekova, Radka Svobodova; Shields, Denis C

2008-07-01

Protein-protein interactions are fundamental in mediating biological processes including metabolism, cell growth, and signaling. To be able to selectively inhibit or induce protein activity or complex formation is a key feature in controlling disease. For those situations in which protein-protein interactions derive substantial affinity from short linear peptide sequences, or motifs, we can develop search algorithms for peptidomimetic compounds that resemble the short peptide's structure but are not compromised by poor pharmacological properties. SAAMCO is a Web service (http://bioware.ucd.ie/~saamco) that facilitates the screening of motifs with known structures against bioactive compound databases. It is built on an algorithm that defines compound similarity based on the presence of appropriate amino acid side chain fragments and a favorable Root Mean Squared Deviation (RMSD) between compound and motif structure. The methodology is efficient as the available compound databases are preprocessed and fast regular expression searches filter potential matches before time-intensive 3D superposition is performed. The required input information is minimal, and the compound databases have been selected to maximize the availability of information on biological activity. "Hits" are accompanied with a visualization window and links to source database entries. Motif matching can be defined on partial or full similarity which will increase or reduce respectively the number of potential mimetic compounds. The Web server provides the functionality for rapid screening of known or putative interaction motifs against prepared compound libraries using a novel search algorithm. The tabulated results can be analyzed by linking to appropriate databases and by visualization.
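The two-stage idea described in this abstract (a cheap regular-expression filter first, a structural RMSD comparison second) can be sketched in Python as below. This is not SAAMCO's implementation. The encoding of each compound as a string of one-letter side-chain fragment codes, and the function names `prefilter` and `rmsd`, are assumptions made for illustration. The RMSD helper also assumes the two coordinate sets are already superposed and paired atom by atom, which the real service would have to compute.

```python
import math
import re


def prefilter(fragment_strings, pattern):
    """Keep only compounds whose (hypothetical) fragment string matches the motif regex."""
    rx = re.compile(pattern)
    return [s for s in fragment_strings if rx.search(s)]


def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two paired lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return math.sqrt(squared / len(coords_a))


# Stage 1: cheap text filter over preprocessed compound descriptors.
candidates = prefilter(["FAW", "GGG", "FLW"], r"F.W")
print(candidates)  # ['FAW', 'FLW']

# Stage 2: structural comparison, run only on the surviving candidates.
print(rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (1, 0, 2)]))
```

Running the inexpensive filter first means the costly 3D step is applied only to compounds that already contain the required side-chain fragments, which is the efficiency argument the abstract makes.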
Metallopeptidases of Toxoplasma gondii: in silico identification and gene expression.

PubMed

Escotte-Binet, Sandie; Huguenin, Antoine; Aubert, Dominique; Martin, Anne-Pascaline; Kaltenbach, Matthieu; Florent, Isabelle; Villena, Isabelle

2018-01-01

Metallopeptidases are a family of proteins with domains that remain highly conserved throughout evolution. These hydrolases require divalent metal cation(s) to activate the water molecule in order to carry out their catalytic action on peptide bonds by nucleophilic attack. Metallopeptidases from parasitic protozoa, including Toxoplasma, are investigated because of their crucial role in parasite biology. In the present study, we screened the T. gondii database using PFAM motifs specific for metallopeptidases in association with the MEROPS peptidase Database (release 10.0). In all, 49 genes encoding proteins with metallopeptidase signatures were identified in the Toxoplasma genome. An Interpro Search enabled us to uncover their domain/motif organization, and orthologs with the highest similarity by BLAST were used for annotation. These 49 Toxoplasma metallopeptidases clustered into 15 families described in the MEROPS database. Experimental expression analysis of their genes in the tachyzoite stage revealed transcription for all genes studied. Further research on the role of these peptidases should increase our knowledge of basic Toxoplasma biology and provide opportunities to identify novel therapeutic targets. This type of study would also open a path towards the comparative biology of apicomplexans. © S. Escotte-Binet et al., published by EDP Sciences, 2018.
Key genes and pathways in measles and their interaction with environmental chemicals.

PubMed

Zhang, Rongqiang; Jiang, Hualin; Li, Fengying; Su, Ning; Ding, Yi; Mao, Xiang; Ren, Dan; Wang, Jing

2018-06-01

The aim of the present study was to explore key genes that may have a role in the pathology of measles virus infection and to clarify the interaction networks between environmental factors and differentially expressed genes (DEGs). After screening the database of the Gene Expression Omnibus of the National Center for Biotechnology Information, the dataset GSE5808 was downloaded and analyzed. A global normalization method was performed to minimize data inconsistencies and heterogeneity. DEGs during different stages of measles virus infection were explored using R software (v3.4.0). Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of the DEGs were performed using Cytoscape 3.4.0 software. A protein-protein interaction (PPI) network of the DEGs was obtained from the STRING database v9.05. A total of 43 DEGs were obtained from four analyzed sample groups, including 10 highly expressed genes and 33 genes with decreased expression. The most enriched pathways based on KEGG analysis were fatty acid elongation, cytokine-cytokine receptor interaction and RNA degradation. The genes mentioned in the PPI network were mainly associated with protein binding and chemokine activity. A total of 219 chemicals were identified that may, jointly or on their own, interact with the 6 DEGs between the control group and patients with measles (at hospital entry), including benzo(a)pyrene (BaP) and tetrachlorodibenzodioxin (TCDD). In conclusion, the present study revealed that chemokines and environmental chemicals, e.g. BaP and TCDD, may affect the development of measles.
Exposure to Nickel, Chromium, or Cadmium Causes Distinct Changes in the Gene Expression Patterns of Rat Liver-Derived Cell Lines

DTIC Science & Technology

2010-05-22

member B8 Blue 1370939_at Acsl1 acyl-CoA synthetase long-chain family member 1 Yellow 1372006_at --- --- Blue 1372101_at Ppap2b phosphatidic acid ...Stress L-ascorbic Acid Binding Cation Binding Identical Protein Binding Protein Dimerization Activity Dioxygenase Activity Oxidoreductase...Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts, and proteins. Nucleic Acid Research. 35: D61-65. Ryter SW
Cysteine-containing peptides having antioxidant properties

DOEpatents

Bielicki, John K [Castro Valley, CA]

2007-05-15

The term "homology" or "homologous" means an amino acid similarity measured by the program BLAST (Altschul et al (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402), and expressed as --(% identity n/n). In measuring homology between a peptide and a protein of greater size, homology is measured only in the corresponding region; that is, the protein is regarded as only having the same general length as the peptide, allowing for gaps and insertions.
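The "% identity n/n" notation defined above can be illustrated with a short Python sketch. It does not run BLAST. It assumes the peptide and the corresponding protein region have already been aligned to equal length with "-" marking gaps, and it takes the denominator to be the alignment length, which is how BLAST output reports identities. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_peptide: str, aligned_region: str):
    """Return (matches, alignment_length, percent) for two pre-aligned sequences."""
    if len(aligned_peptide) != len(aligned_region) or not aligned_peptide:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    # A column counts as identical only if both residues match and neither is a gap.
    matches = sum(
        1 for a, b in zip(aligned_peptide, aligned_region) if a == b and a != "-"
    )
    length = len(aligned_peptide)
    return matches, length, 100.0 * matches / length


# Peptide compared only against the matching stretch of a larger protein.
m, n, pct = percent_identity("ACDEFGHIK", "ACDQFGHIK")
print(f"{pct:.1f}% identity {m}/{n}")  # 88.9% identity 8/9
```

Restricting the comparison to the aligned region, rather than the full protein, mirrors the definition's statement that the protein is treated as having the same general length as the peptide.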
Clinical value of miR-452-5p expression in lung adenocarcinoma: A retrospective quantitative real-time polymerase chain reaction study and verification based on The Cancer Genome Atlas and Gene Expression Omnibus databases.

PubMed

Gan, Xiao-Ning; Luo, Jie; Tang, Rui-Xue; Wang, Han-Lin; Zhou, Hong; Qin, Hui; Gan, Ting-Qing; Chen, Gang

2017-05-01

The role and mechanism of miR-452-5p in lung adenocarcinoma remain unclear. In this study, we performed a systematic study to investigate the clinical value of miR-452-5p expression in lung adenocarcinoma. The expression of miR-452-5p in 101 lung adenocarcinoma patients was detected by quantitative real-time polymerase chain reaction. The Cancer Genome Atlas and Gene Expression Omnibus databases were joined to verify the expression level of miR-452-5p in lung adenocarcinoma. Via several online prediction databases and bioinformatics software, pathway and network analyses of miR-452-5p target genes were performed to explore its prospective molecular mechanism. The expression of miR-452-5p in lung adenocarcinoma in house was significantly lower than that in adjacent tissues (p < 0.001). Additionally, the expression level of miR-452-5p was negatively correlated with several clinicopathological parameters including the tumor size (p = 0.014), lymph node metastasis (p = 0.032), and tumor-node-metastasis stage (p = 0.036). Data from The Cancer Genome Atlas also confirmed the low expression of miR-452 in lung adenocarcinoma (p < 0.001). Furthermore, reduced expression of miR-452-5p in lung adenocarcinoma (standard mean deviations = -0.393, 95% confidence interval: -0.774 to -0.011, p = 0.044) was validated by a meta-analysis. Five hub genes targeted by miR-452-5p, including SMAD family member 4, SMAD family member 2, cyclin-dependent kinase inhibitor 1B, tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein epsilon, and tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein beta, were significantly enriched in the cell-cycle pathway. In conclusion, low expression of miR-452-5p tends to play an essential role in lung adenocarcinoma. Bioinformatics analysis might be beneficial to reveal the potential mechanism of miR-452-5p in lung adenocarcinoma.
Proteomic Analysis and qRT-PCR Verification of Temperature Response to Arthrospira (Spirulina) platensis

PubMed Central

Huili, Wang; Xiaokai, Zhao; Meili, Lin; Dahlgren, Randy A.; Wei, Chen; Jaiopeng, Zhou; Chengyang, Xu; Chunlei, Jin; Yi, Xu; Xuedong, Wang; Li, Ding; Qiyu, Bao

2013-01-01

Arthrospira (Spirulina) platensis (ASP) is a representative filamentous, non-N2-fixing cyanobacterium that has great potential to enhance the food supply and possesses several valuable physiological features. ASP tolerates high and low temperatures along with highly alkaline and salty environments, and can strongly resist oxidation and irradiation. Based on genomic sequencing of ASP, we compared the protein expression profiles of this organism under different temperature conditions (15°C, 35°C and 45°C) using 2-DE and peptide mass fingerprinting techniques. A total of 122 proteins having a significant differential expression response to temperature were retrieved. Of the positively expressed proteins, the homologies of 116 ASP proteins were found in Arthrospira (81 proteins in Arthrospira platensis str. Paraca and 35 in Arthrospira maxima CS-328). The other 6 proteins have high homology with other microorganisms. We classified the 122 differentially expressed positive proteins into 14 functions using the COG database, and characterized their respective KEGG metabolism pathways. The results demonstrated that these differentially expressed proteins are mainly involved in post-translational modification (protein turnover, chaperones), energy metabolism (photosynthesis, respiratory electron transport), translation (ribosomal structure and biogenesis) and carbohydrate transport and metabolism. Other proteins were related to amino acid transport and metabolism, cell envelope biogenesis, coenzyme metabolism and signal transduction mechanisms. Results implied that these proteins can perform predictable roles in rendering ASP resistance against low and high temperatures. Subsequently, we determined the transcription level of 38 genes in vivo in response to temperature and identified them by qRT-PCR. We found that the 26 differentially expressed proteins, representing 68.4% of the total target genes, maintained consistency between transcription and translation levels. 
The remaining 12 genes showed inconsistent protein expression with transcription level and accounted for 31.6% of the total target genes. PMID:24349519
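The percentages quoted at the end of this abstract follow directly from the gene counts, as this short Python check shows. The variable names are chosen here for illustration; the counts come from the abstract.

```python
total_genes = 38   # genes assayed by qRT-PCR
consistent = 26    # protein and transcript levels agree
inconsistent = total_genes - consistent  # 12 genes that do not agree

pct_consistent = round(100 * consistent / total_genes, 1)
pct_inconsistent = round(100 * inconsistent / total_genes, 1)
print(pct_consistent, pct_inconsistent)  # 68.4 31.6
```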
Identification and prognostic value of anterior gradient protein 2 expression in breast cancer based on tissue microarray.

PubMed

Guo, Jilong; Gong, Guohua; Zhang, Bin

2017-07-01

Breast cancer has attracted substantial attention as one of the major cancers causing death in women. It is crucial to find potential biomarkers of prognostic value in breast cancer. In this study, the expression pattern of anterior gradient protein 2 in breast cancer was identified based on the main molecular subgroups. Through analysis of 69 samples from the Gene Expression Omnibus database, we found that anterior gradient protein 2 expression was significantly higher in non-triple-negative breast cancer tissues compared with normal tissues and triple-negative breast cancer tissues (p < 0.05). The data from a total of 622 patients from The Cancer Genome Atlas were analysed. The data from The Cancer Genome Atlas and results from quantitative reverse transcription polymerase chain reaction also verified the anterior gradient protein 2 expression pattern. Furthermore, we performed immunohistochemical analysis. The quantification results revealed that anterior gradient protein 2 is highly expressed in non-triple-negative breast cancer (grade 3 excluded) and grade 1 + 2 (triple-negative breast cancer excluded) tumours compared with normal tissues. Anterior gradient protein 2 was significantly highly expressed in non-triple-negative breast cancer (grade 3 excluded) and non-triple-negative breast cancer tissues compared with triple-negative breast cancer tissues (p < 0.01). In addition, anterior gradient protein 2 was significantly highly expressed in grade 1 + 2 (triple-negative breast cancer excluded) and grade 1 + 2 tissues compared with grade 3 tissues (p < 0.05). Analysis by Fisher's exact test revealed that anterior gradient protein 2 expression was significantly associated with histologic type, histological grade, oestrogen status and progesterone status. 
Univariate analysis of clinicopathological variables showed that anterior gradient protein 2 expression, tumour size and lymph node status were significantly correlated with overall survival in patients with grade 1 and 2 tumours. Cox multivariate analysis revealed anterior gradient protein 2 as a putative independent indicator of unfavourable outcomes (p = 0.031). All these data clearly showed that anterior gradient protein 2 is highly expressed in breast cancer and can be regarded as a putative biomarker for breast cancer prognosis.
Proteomic profiling of fetal esophageal epithelium, esophageal cancer, and tumor-adjacent esophageal epithelium and immunohistochemical characterization of a representative differential protein, PRX6

PubMed Central

Guo, Jun-Hui; Xing, Guo-Lan; Fang, Xin-Hui; Wu, Hui-Fang; Zhang, Bo; Yu, Jin-Zhong; Fan, Zong-Min; Wang, Li-Dong

2017-01-01

AIM To understand the molecular mechanism of esophageal cancer development and provide molecular markers for screening high-risk populations and early diagnosis. METHODS Two-dimensional electrophoresis combined with mass spectrometry were adopted to screen differentially expressed proteins in nine cases of fetal esophageal epithelium, eight cases of esophageal cancer, and eight cases of tumor-adjacent normal esophageal epithelium collected from fetuses of different gestational age, or esophageal cancer patients from a high-risk area of esophageal cancer in China. Immunohistochemistry (avidin-biotin-horseradish peroxidase complex method) was used to detect the expression of peroxiredoxin (PRX)6 in 91 cases of esophageal cancer, tumor-adjacent normal esophageal tissue, basal cell hyperplasia, dysplasia, and carcinoma in situ, as well as 65 cases of esophageal epithelium from fetuses at a gestational age of 3-9 mo. RESULTS After peptide mass fingerprint analysis and search of protein databases, 21 differential proteins were identified; some of which represent a protein isoform. Varying degrees of expression of PRX6 protein, which was localized mainly in the cytoplasm, were detected in adult and fetal normal esophageal tissues, precancerous lesions, and esophageal cancer. With the progression of esophageal lesions, PRX6 protein expression showed a declining trend (P < 0.05). In fetal epithelium from fetuses at gestational age 3-6 mo, PRX6 protein expression showed a declining trend with age (P < 0.05). PRX6 protein expression was significantly higher in well-differentiated esophageal cancer tissues than in poorly differentiated esophageal cancer tissues (P < 0.05). CONCLUSION Development and progression of esophageal cancer result from interactions of genetic changes (accumulation or superposition). PRX6 protein is associated with fetal esophageal development and cancer differentiation. PMID:28293090
Proteomic profiling of fetal esophageal epithelium, esophageal cancer, and tumor-adjacent esophageal epithelium and immunohistochemical characterization of a representative differential protein, PRX6.

PubMed

Guo, Jun-Hui; Xing, Guo-Lan; Fang, Xin-Hui; Wu, Hui-Fang; Zhang, Bo; Yu, Jin-Zhong; Fan, Zong-Min; Wang, Li-Dong

2017-02-28

To understand the molecular mechanism of esophageal cancer development and provide molecular markers for screening high-risk populations and early diagnosis. Two-dimensional electrophoresis combined with mass spectrometry were adopted to screen differentially expressed proteins in nine cases of fetal esophageal epithelium, eight cases of esophageal cancer, and eight cases of tumor-adjacent normal esophageal epithelium collected from fetuses of different gestational age, or esophageal cancer patients from a high-risk area of esophageal cancer in China. Immunohistochemistry (avidin-biotin-horseradish peroxidase complex method) was used to detect the expression of peroxiredoxin (PRX)6 in 91 cases of esophageal cancer, tumor-adjacent normal esophageal tissue, basal cell hyperplasia, dysplasia, and carcinoma in situ, as well as 65 cases of esophageal epithelium from fetuses at a gestational age of 3-9 mo. After peptide mass fingerprint analysis and search of protein databases, 21 differential proteins were identified; some of which represent a protein isoform. Varying degrees of expression of PRX6 protein, which was localized mainly in the cytoplasm, were detected in adult and fetal normal esophageal tissues, precancerous lesions, and esophageal cancer. With the progression of esophageal lesions, PRX6 protein expression showed a declining trend (P < 0.05). In fetal epithelium from fetuses at gestational age 3-6 mo, PRX6 protein expression showed a declining trend with age (P < 0.05). PRX6 protein expression was significantly higher in well-differentiated esophageal cancer tissues than in poorly differentiated esophageal cancer tissues (P < 0.05). Development and progression of esophageal cancer result from interactions of genetic changes (accumulation or superposition). PRX6 protein is associated with fetal esophageal development and cancer differentiation.
Prognostic value of decreased FOXP1 protein expression in various tumors: a systematic review and meta-analysis

PubMed Central

Xiao, Jian; He, Bixiu; Zou, Yong; Chen, Xi; Lu, Xiaoxiao; Xie, Mingxuan; Li, Wei; He, Shuya; You, Shaojin; Chen, Qiong

2016-01-01

The prognostic value of forkhead box protein P1 (FOXP1) protein expression in tumors remains controversial. Therefore, we conducted a systematic review and meta-analysis, searching the PubMed, Embase and Web of Science databases to identify eligible studies. In total, we analyzed 22 articles that examined 9 tumor types and included 2468 patients. Overall, decreased expression of FOXP1 protein was associated with favorable overall survival (OS) in lymphoma patients (HR = 0.38, 95%CI: 0.30–0.48, p < 0.001). In patients with solid tumors, decreased FOXP1 expression correlated with unfavorable OS (HR = 1.82, 95%CI: 1.18–2.83, p = 0.007). However, when FOXP1 protein expression was nuclear, decreased expression was also associated with favorable OS (HR = 0.53, 95%CI: 0.32–0.86, p = 0.011). Furthermore, decreased FOXP1 expression resulted in the best OS in patients with mucosa-associated lymphoid tissue (MALT) lymphomas (HR = 0.26, 95%CI: 0.11–0.59, p = 0.001), but the worst OS was observed in non-small cell lung cancer (NSCLC) patients (HR = 3.11, 95%CI: 1.87–5.17, p < 0.001). In addition, decreased FOXP1 expression was significantly correlated with an unfavorable relapse-free survival (RFS) in breast cancer patients (HR = 1.93, 95%CI: 1.33–2.80, p = 0.001). PMID:27457567
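The hazard ratios above are reported with 95% confidence intervals, and a reader can check how an interval relates to its p-value. The following is a minimal sketch, assuming the intervals are symmetric on the log scale and that a normal (Wald) approximation applies; the abstract does not state which method the authors used, and the function name `hr_summary` is illustrative only.

```python
import math


def hr_summary(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover the log-scale standard error, z statistic and two-sided
    p-value from a hazard ratio and its 95% confidence interval.

    Assumes the interval was built as exp(log(HR) +/- z_crit * SE).
    """
    log_hr = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Solid tumors, overall survival: HR = 1.82, 95% CI 1.18-2.83 (from the abstract).
se, z, p = hr_summary(1.82, 1.18, 2.83)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

Under these assumptions the recovered p-value for the solid-tumor estimate comes out close to the reported p = 0.007, which suggests the interval and the p-value are mutually consistent.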
Protein-protein interaction network of gene expression in the hydrocortisone-treated keloid.

PubMed

Chen, Rui; Zhang, Zhiliang; Xue, Zhujia; Wang, Lin; Fu, Mingang; Lu, Yi; Bai, Ling; Zhang, Ping; Fan, Zhihong

2015-01-01

In order to explore the molecular mechanism of hydrocortisone in keloid tissue, the gene expression profiles of keloid samples treated with hydrocortisone were subjected to bioinformatics analysis. Firstly, the gene expression profiles (GSE7890) of five samples of keloid treated with hydrocortisone and five untreated keloid samples were downloaded from the Gene Expression Omnibus (GEO) database. Secondly, data were preprocessed using packages in R language and differentially expressed genes (DEGs) were screened using a significance analysis of microarrays (SAM) protocol. Thirdly, the DEGs were subjected to gene ontology (GO) function and KEGG pathway enrichment analysis. Finally, the interactions of DEGs in samples of keloid treated with hydrocortisone were explored in a human protein-protein interaction (PPI) network, and sub-modules of the DEGs interaction network were analyzed using Cytoscape software. Based on the analysis, 572 DEGs in the hydrocortisone-treated samples were screened; most of these were involved in the signal transduction and cell cycle. Furthermore, three critical genes in the module, including COL1A1, NID1, and PRELP, were screened in the PPI network analysis. These findings enhance understanding of the pathogenesis of the keloid and provide references for keloid therapy. © 2015 The International Society of Dermatology.
Computer applications making rapid advances in high throughput microbial proteomics (HTMP).

PubMed

Anandkumar, Balakrishna; Haga, Steve W; Wu, Hui-Fen

2014-02-01

The last few decades have seen the rise of widely available proteomics tools. From new data acquisition devices, such as MALDI-MS and 2DE, to new database-searching software, these new products have paved the way for high throughput microbial proteomics (HTMP). These tools are enabling researchers to gain new insights into microbial metabolism, and are opening up new areas of study, such as the discovery of protein-protein interactions (interactomics). Computer software is a key part of these emerging fields. This review considers: 1) software tools for identifying the proteome, such as MASCOT or PDQuest, 2) online databases of proteomes, such as SWISS-PROT, Proteome Web, or the Proteomics Facility of the Pathogen Functional Genomics Resource Center, and 3) software tools for applying proteomic data, such as PSI-BLAST or VESPA. These tools allow for research in network biology, protein identification, functional annotation, target identification/validation, protein expression, protein structural analysis, metabolic pathway engineering and drug discovery.
Combined protein construct and synthetic gene engineering for heterologous protein expression and crystallization using Gene Composer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Raymond, Amy; Lovell, Scott; Lorimer, Don

2009-12-01

With the goal of improving yield and success rates of heterologous protein production for structural studies we have developed the database and algorithm software package Gene Composer. This freely available electronic tool facilitates the information-rich design of protein constructs and their engineered synthetic gene sequences, as detailed in the accompanying manuscript. In this report, we compare heterologous protein expression levels from native sequences with those of codon engineered synthetic gene constructs designed by Gene Composer. A test set of proteins including a human kinase (p38α), viral polymerase (HCV NS5B), and bacterial structural protein (FtsZ) were expressed in both E. coli and a cell-free wheat germ translation system. We also compare the protein expression levels in E. coli for a set of 11 different proteins with greatly varied G:C content and codon bias. The results consistently demonstrate that protein yields from codon engineered Gene Composer designs are as good as or better than those achieved from the synonymous native genes. Moreover, structure guided N- and C-terminal deletion constructs designed with the aid of Gene Composer can lead to greater success in gene to structure work as exemplified by the X-ray crystallographic structure determination of FtsZ from Bacillus subtilis. These results validate the Gene Composer algorithms, and suggest that using a combination of synthetic gene and protein construct engineering tools can improve the economics of gene to structure research.
Immunohistochemistry in the Diagnosis of Mucinous Neoplasms Involving the Ovary: The Added Value of SATB2 and Biomarker Discovery Through Protein Expression Database Mining.

PubMed

Strickland, Sarah; Wasserman, Jason K; Giassi, Ana; Djordjevic, Bojana; Parra-Herran, Carlos

2016-05-01

Immunohistochemistry is frequently used to identify ovarian mucinous neoplasms as primary or metastatic; however, there is significant overlap in expression patterns. We compared traditional markers (CK7, CK20, CDX2, PAX8, estrogen receptor, β-catenin, MUC1, MUC2, and MUC5AC) to 2 novel proteins identified through mining of the Human Protein Atlas expression database: SATB2 and POF1B. The study cohort included 49 primary gastrointestinal (GI) mucinous adenocarcinomas (19 colorectal, 15 gastric, 15 pancreatobiliary), 60 primary ovarian mucinous neoplasms (19 cystadenomas, 21 borderline tumors, 20 adenocarcinomas), and 19 metastatic carcinomas to the ovary (14 lower and 5 upper GI primaries). Immunohistochemistry was performed on tissue microarrays, scored and interpreted as negative (absent or focal/weak) or positive. Metastatic tumors were frequently unilateral (42.8% of tumors from lower and 40% of tumors from upper tract) and ≥10 cm (85.7% of tumors from lower and 80% of tumors from upper tract). CK7 was positive in 88.5% upper GI and 88.3% primary ovarian compared with 24.3% lower GI neoplasms. CK20 and CDX2 were positive in 84.8% and 100% of lower GI tumors, respectively; however, expression was also common in upper GI (CK20 42.8%, CDX2 50%) and primary ovarian neoplasms (CK20 65.7%, CDX2 38.3%). Conversely, SATB2 was more specific for lower GI origin, being positive in 78.8% lower GI but only 11.5% upper GI and 1.7% primary ovarian neoplasms. PAX8 expression was common in primary ovarian neoplasms (75% of all neoplasms, 65% of carcinomas); only 1 (1.5%) GI tumor was positive. MUC2 and β-catenin were frequently positive in lower GI tumors (96.9% and 51.5%, respectively). Estrogen receptor expression was only seen in primary ovarian neoplasms (13.3%). Nuclear premature ovarian failure 1B (POF1B) expression was seen in malignant tumors regardless of their origin. 
A panel including CK7, SATB2, and PAX8 separated primary from secondary GI neoplasms with up to 77.1% sensitivity and 99% specificity, outperforming tumor laterality and size. Second-line markers such as CDX2, MUC2, estrogen receptor, MUC1, and β-catenin increased the sensitivity of immunohistochemistry in excluding lower GI origin. Biomarker search using proteomic databases has a value in diagnostic pathology, as shown with SATB2; however, as seen with POF1B, expression profiles in these databases are not always reproduced in larger cohorts.
Identification of Biological Targets of Therapeutic Intervention for Hepatocellular Carcinoma by Integrated Bioinformatical Analysis.

PubMed

Hu, Wei Qi; Wang, Wei; Fang, Di Long; Yin, Xue Feng

2018-05-24

BACKGROUND We screened the potential molecular targets and investigated the molecular mechanisms of hepatocellular carcinoma (HCC). MATERIAL AND METHODS Microarray data of GSE47786, comprising samples of the HepG2 human hepatoma cell line treated with 40 μM berberine and control cells treated with 0.08% DMSO, were downloaded from the GEO database. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed; the protein-protein interaction (PPI) networks were constructed using the STRING database and Cytoscape; the genetic alteration, neighboring genes networks, and survival analysis of hub genes were explored by cBio portal; and the mRNA expression levels of hub genes were obtained from the Oncomine databases. RESULTS A total of 56 upregulated and 8 downregulated DEGs were identified. The GO analysis results were significantly enriched in cell-cycle arrest, regulation of transcription, DNA-dependent, protein amino acid phosphorylation, cell cycle, and apoptosis. The KEGG pathway analysis showed that DEGs were enriched in MAPK signaling pathway, ErbB signaling pathway, and p53 signaling pathway. JUN, EGR1, MYC, and CDKN1A were identified as hub genes in PPI networks. The genetic alteration of hub genes was mainly concentrated in amplification. TP53, NDRG1, and MAPK15 were found in neighboring genes networks. Altered genes had worse overall survival and disease-free survival than unaltered genes. The expressions of EGR1, MYC, and CDKN1A were significantly increased, but expression of JUN was not, in the Roessler Liver datasets. CONCLUSIONS We found that JUN, EGR1, MYC, and CDKN1A might be used as diagnostic and therapeutic molecular biomarkers and broaden our understanding of the molecular mechanisms of HCC.
Proteomic sensitivity to dietary manipulations in rainbow trout.

PubMed

Martin, S A M; Vilhelmsson, O; Médale, F; Watt, P; Kaushik, S; Houlihan, D F

2003-09-23

Changes in dietary protein sources due to substitution of fish meal by other protein sources can have metabolic consequences in farmed fish. A proteomics approach was used to study the protein profiles of livers of rainbow trout that had been fed two diets containing different proportions of plant ingredients. Both diets, control (C) and soy (S), contained fish meal, plant ingredients and synthetic amino acids, but diet S had a greater proportion of soybean meal. A feeding trial was performed for 12 weeks, at the end of which growth and protein metabolism parameters were measured. Protein growth rates were not different in fish fed different diets; however, protein consumption and protein synthesis rates were higher in the fish fed diet S. Fish fed diet S had lower efficiency of retention of synthesised protein. Ammonia excretion was increased as well as the activities of hepatic glutamate dehydrogenase and aspartate amino transferase (ASAT). No differences were found in free amino acid pools in either liver or muscle between diets. Protein extraction followed by high-resolution two-dimensional electrophoresis, coupled with gel image analysis, allowed identification and expression analysis of hundreds of proteins. Individual proteins of interest were then subjected to further analysis leading to protein identification by trypsin digest fingerprinting. During this study, approximately 800 liver proteins were analysed for expression pattern, of which 33 were found to be differentially expressed between diets C and S. Seventeen proteins were positively identified after database searching. Proteins were identified from diverse metabolic pathways, demonstrating the complex nature of gene expression responses to dietary manipulation revealed by proteomic characterisation.
Cyclebase 3.0: a multi-organism database on cell-cycle regulation and phenotypes.

PubMed

Santos, Alberto; Wernersson, Rasmus; Jensen, Lars Juhl

2015-01-01

The eukaryotic cell division cycle is a highly regulated process that consists of a complex series of events and involves thousands of proteins. Researchers have studied the regulation of the cell cycle in several organisms, employing a wide range of high-throughput technologies, such as microarray-based mRNA expression profiling and quantitative proteomics. Due to its complexity, the cell cycle can also fail or otherwise change in many different ways if important genes are knocked out, which has been studied in several microscopy-based knockdown screens. The data from these many large-scale efforts are not easily accessed, analyzed and combined due to their inherent heterogeneity. To address this, we have created Cyclebase--available at http://www.cyclebase.org--an online database that allows users to easily visualize and download results from genome-wide cell-cycle-related experiments. In Cyclebase version 3.0, we have updated the content of the database to reflect changes to genome annotation, added new mRNA and protein expression data, and integrated cell-cycle phenotype information from high-content screens and model-organism databases. The new version of Cyclebase also features a new web interface, designed around an overview figure that summarizes all the cell-cycle-related data for a gene. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Construction of a Lotus japonicus late nodulin expressed sequence tag library and identification of novel nodule-specific genes.

PubMed Central

Szczyglowski, K; Hamburger, D; Kapranov, P; de Bruijn, F J

1997-01-01

A range of novel expressed sequence tags (ESTs) associated with late developmental events during nodule organogenesis in the legume Lotus japonicus were identified using mRNA differential display; 110 differentially displayed polymerase chain reaction products were cloned and analyzed. Of 88 unique cDNAs obtained, 22 shared significant homology to DNA/protein sequences in the respective databases. This group comprises, among others, a nodule-specific homolog of protein phosphatase 2C, a peptide transporter protein, and a nodule-specific form of cytochrome P450. RNA gel-blot analysis of 16 differentially displayed ESTs confirmed their nodule-specific expression pattern. The kinetics of mRNA accumulation of the majority of the ESTs analyzed were found to resemble the expression pattern observed for the L. japonicus leghemoglobin gene. These results indicate that the newly isolated molecular markers correspond to genes induced during late developmental stages of L. japonicus nodule organogenesis and provide important, novel tools for the study of nodulation. PMID:9276951

Broad host range vectors for expression of proteins with (Twin-) Strep-tag, His-tag and engineered, export optimized yellow fluorescent protein

PubMed Central

2013-01-01

Background In current protein research, a limitation still is the production of active recombinant proteins or native protein associations to assess their function. Especially the localization and analysis of protein-complexes or the identification of modifications and small molecule interaction partners by co-purification experiments requires a controllable expression of affinity- and/or fluorescence tagged variants of a protein of interest in its native cellular background. Advantages of periplasmic and/or homologous expressions can frequently not be realized due to a lack of suitable tools. Instead, experiments are often limited to the heterologous production in one of the few well established expression strains. Results Here, we introduce a series of new RK2 based broad host range expression plasmids for inducible production of affinity- and fluorescence tagged proteins in the cytoplasm and periplasm of a wide range of Gram negative hosts which are designed to match the recently suggested modular Standard European Vector Architecture and database. The vectors are equipped with a yellow fluorescent protein variant which is engineered to fold and brightly fluoresce in the bacterial periplasm following Sec-mediated export, as shown from fractionation and imaging studies. Expression of Strep-tag®II and Twin-Strep-tag® fusion proteins in Pseudomonas putida KT2440 is demonstrated for various ORFs. Conclusion The broad host range constructs we have produced enable good and controlled expression of affinity tagged protein variants for single-step purification and qualify for complex co-purification experiments. 
Periplasmic export variants enable production of affinity tagged proteins and generation of fusion proteins with a novel engineered Aequorea-based yellow fluorescent reporter protein variant with activity in the periplasm of the tested Gram-negative model bacteria Pseudomonas putida KT2440 and Escherichia coli K12 for production, localization or co-localization studies. In addition, the new tools facilitate metabolic engineering and yield assessment for cytoplasmic or periplasmic protein production in a number of different expression hosts when yields in one initially selected are insufficient. PMID:23687945
Discovery of novel cold-induced CISP genes encoding small RNA-binding proteins related to cold adaptation in barley.

PubMed

Ying, Mengchao; Kidou, Shin-Ichiro

2017-07-01

To adapt to cold conditions, barley plants rely on specific mechanisms, which are not yet fully understood. In this study, we characterized a novel barley cold-induced gene identified using a PCR-based high coverage gene expression profiling method. The identified gene encodes a small protein that we named CISP1 (Cold-induced Small Protein 1). Homology searches of sequence databases revealed that CISP1 homologs (CISP2 and CISP3) exist in the barley genome. Further database analyses showed that the CISP1 homologs were widely distributed in cold-tolerant plants such as wheat and rye. Quantitative reverse transcription PCR analyses indicated that the expression of barley CISP genes was markedly increased in roots exposed to cold conditions. In situ hybridization analyses showed that the CISP1 transcripts were localized in the root tip and lateral root primordium. We also demonstrated that the CISP1 protein bound to RNA. Taken together, these findings indicate that CISP1 and its homologs encoding small RNA-binding proteins may serve as RNA chaperones playing a vital role in the cold adaptation of barley root. This is the first report describing the likely close relationship between root-specific genes and the cold adaptation process, as well as the potential function of the identified genes. Copyright © 2017 Elsevier B.V. All rights reserved.
Database resources of the National Center for Biotechnology Information

PubMed Central

Sayers, Eric W.; Barrett, Tanya; Benson, Dennis A.; Bolton, Evan; Bryant, Stephen H.; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M.; DiCuccio, Michael; Federhen, Scott; Feolo, Michael; Fingerman, Ian M.; Geer, Lewis Y.; Helmberg, Wolfgang; Kapustin, Yuri; Krasnov, Sergey; Landsman, David; Lipman, David J.; Lu, Zhiyong; Madden, Thomas L.; Madej, Tom; Maglott, Donna R.; Marchler-Bauer, Aron; Miller, Vadim; Karsch-Mizrachi, Ilene; Ostell, James; Panchenko, Anna; Phan, Lon; Pruitt, Kim D.; Schuler, Gregory D.; Sequeira, Edwin; Sherry, Stephen T.; Shumway, Martin; Sirotkin, Karl; Slotta, Douglas; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A.; Wagner, Lukas; Wang, Yanli; Wilbur, W. John; Yaschenko, Eugene; Ye, Jian

2012-01-01

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Website. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Probe, Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. PMID:22140104
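The Entrez Programming Utilities mentioned above are reached through plain HTTP requests. As a minimal sketch, the snippet below only builds an ESummary request URL for a PubMed record and does not contact the server; the helper name `esummary_url` is illustrative, and the example uses the PMID of this record.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esummary_url(db, ids, retmode="json"):
    """Build an ESummary request URL for one or more record identifiers."""
    query = urlencode({
        "db": db,                                # Entrez database name, e.g. "pubmed"
        "id": ",".join(str(i) for i in ids),     # comma-separated UIDs
        "retmode": retmode,                      # requested response format
    })
    return f"{EUTILS_BASE}/esummary.fcgi?{query}"


url = esummary_url("pubmed", [22140104])
print(url)
```

Fetching the URL (for example with `urllib.request.urlopen`) is left out so that the sketch stays runnable offline; NCBI's usage guidelines on request rates and API keys would apply to any real client.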
Database resources of the National Center for Biotechnology Information

PubMed Central

2013-01-01

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, the Genetic Testing Registry, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Probe, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool, Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page. PMID:23193264
Discovery of Possible Gene Relationships through the Application of Self-Organizing Maps to DNA Microarray Databases

PubMed Central

Chavez-Alvarez, Rocio; Chavoya, Arturo; Mendez-Vazquez, Andres

2014-01-01

DNA microarrays and cell cycle synchronization experiments have made possible the study of the mechanisms of cell cycle regulation of Saccharomyces cerevisiae by simultaneously monitoring the expression levels of thousands of genes at specific time points. On the other hand, pattern recognition techniques can contribute to the analysis of such massive measurements, providing a model of gene expression level evolution through the cell cycle process. In this paper, we propose the use of one such technique, an unsupervised artificial neural network called a Self-Organizing Map (SOM), which has been successfully applied to processes involving very noisy signals, classifying and organizing them, and assisting in the discovery of behavior patterns without requiring prior knowledge about the process under analysis. As a test bed for the use of SOMs in finding possible relationships among genes and their possible contribution in some biological processes, we selected 282 S. cerevisiae genes that have been shown through biological experiments to have an activity during the cell cycle. The expression level of these genes was analyzed in five of the most cited time series DNA microarray databases used in the study of the cell cycle of this organism. With the use of SOM, it was possible to find clusters of genes with similar behavior in the five databases along two cell cycles. This result suggested that some of these genes might be biologically related or might have a regulatory relationship, as was corroborated by comparing some of the clusters obtained with SOMs against a previously reported regulatory network that was generated using biological knowledge, such as protein-protein interactions, gene expression levels, metabolism dynamics, promoter binding, and modification, regulation and transport of proteins. The methodology described in this paper could be applied to the study of gene relationships of other biological processes in different organisms. PMID:24699245
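To make the clustering idea concrete, here is a minimal pure-Python sketch of a Self-Organizing Map applied to time-series profiles. It is not the authors' implementation: the grid size, learning-rate schedule, neighborhood function and the synthetic two-group data (two cycles sampled at 12 time points, in opposite phase) are all assumptions chosen for illustration.

```python
import math
import random


def make_profiles(n_per_group=10, n_time=12, seed=1):
    """Synthetic expression profiles: two groups oscillating in opposite phase."""
    rng = random.Random(seed)
    profiles, labels = [], []
    for label, phase in (("A", 0.0), ("B", math.pi)):
        for _ in range(n_per_group):
            profiles.append([
                math.sin(4 * math.pi * t / n_time + phase) + rng.gauss(0, 0.1)
                for t in range(n_time)
            ])
            labels.append(label)
    return profiles, labels


def best_matching_unit(weights, x):
    """Grid node whose weight vector is closest (squared Euclidean) to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(profiles, rows=3, cols=3, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a rows x cols SOM with linearly decaying rate and neighborhood."""
    rng = random.Random(seed)
    # One weight vector per grid node, initialised from randomly chosen profiles.
    weights = {(r, c): list(rng.choice(profiles))
               for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(profiles)
    step = 0
    for _ in range(epochs):
        order = profiles[:]
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)
            sigma = max(sigma0 * (1.0 - frac), 0.3)
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                # Gaussian neighborhood around the best-matching unit on the grid.
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2 * sigma * sigma))
                for i in range(len(w)):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


profiles, labels = make_profiles()
weights = train_som(profiles)
clusters = {}
for profile, label in zip(profiles, labels):
    clusters.setdefault(best_matching_unit(weights, profile), []).append(label)
for node in sorted(clusters):
    print(node, "".join(clusters[node]))
```

Profiles that land on the same or neighboring grid nodes behave similarly over time, which is the sense in which the map groups genes without prior knowledge of the process.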
AbMiner: a bioinformatic resource on available monoclonal antibodies and corresponding gene identifiers for genomic, proteomic, and immunologic studies.

PubMed

Major, Sylvia M; Nishizuka, Satoshi; Morita, Daisaku; Rowland, Rick; Sunshine, Margot; Shankavaram, Uma; Washburn, Frank; Asin, Daniel; Kouros-Mehr, Hosein; Kane, David; Weinstein, John N

2006-04-06

Monoclonal antibodies are used extensively throughout the biomedical sciences for detection of antigens, either in vitro or in vivo. We, for example, have used them for quantitation of proteins on "reverse-phase" protein lysate arrays. For those studies, we quality-controlled > 600 available monoclonal antibodies and also needed to develop precise information on the genes that encode their antigens. Translation among the various protein and gene identifier types proved non-trivial because of one-to-many and many-to-one relationships. To organize the antibody, protein, and gene information, we initially developed a relational database in Filemaker for our own use. When it became apparent that the information would be useful to many other researchers faced with the need to choose or characterize antibodies, we developed it further as AbMiner, a fully relational web-based database under MySQL, programmed in Java. AbMiner is a user-friendly, web-based relational database of information on > 600 commercially available antibodies that we validated by Western blot for protein microarray studies. It includes many types of information on the antibody, the immunogen, the vendor, the antigen, and the antigen's gene. Multiple gene and protein identifier types provide links to corresponding entries in a variety of other public databases, including resources for phosphorylation-specific antibodies. AbMiner also includes our quality-control data against a pool of 60 diverse cancer cell types (the NCI-60) and also protein expression levels for the NCI-60 cells measured using our high-density "reverse-phase" protein lysate microarrays for a selection of the listed antibodies. Some other available database resources give information on antibody specificity for one or a couple of cell types. In contrast, the data in AbMiner indicate specificity with respect to the antigens in a pool of 60 diverse cell types from nine different tissues of origin. 
AbMiner is a relational database that provides extensive information from our own laboratory and other sources on more than 600 available antibodies and the genes that encode the antibodies' antigens. The data will be made freely available at http://discover.nci.nih.gov/abminer.
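The one-to-many and many-to-one relationships between gene and protein identifiers described above are what make a relational design necessary. The sketch below shows the idea with an in-memory many-to-many index instead of the MySQL schema AbMiner actually uses; the class name and every identifier in it are hypothetical placeholders, not real database accessions.

```python
from collections import defaultdict


class IdentifierMap:
    """Bidirectional many-to-many index between gene and protein identifiers."""

    def __init__(self):
        self._gene_to_proteins = defaultdict(set)
        self._protein_to_genes = defaultdict(set)

    def link(self, gene_id, protein_id):
        """Record that gene_id encodes protein_id (stored in both directions)."""
        self._gene_to_proteins[gene_id].add(protein_id)
        self._protein_to_genes[protein_id].add(gene_id)

    def proteins_for(self, gene_id):
        return sorted(self._gene_to_proteins.get(gene_id, ()))

    def genes_for(self, protein_id):
        return sorted(self._protein_to_genes.get(protein_id, ()))


# Hypothetical identifiers, for illustration only.
idmap = IdentifierMap()
idmap.link("GENE_A", "PROT_1")   # one gene, two protein records (one-to-many)
idmap.link("GENE_A", "PROT_2")
idmap.link("GENE_B", "PROT_2")   # two genes, one protein record (many-to-one)

print(idmap.proteins_for("GENE_A"))
print(idmap.genes_for("PROT_2"))
```

Storing each link in both directions keeps lookups simple in either direction, which mirrors what a join table provides in a relational database.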
Differential protein expression patterns between planktonic and biofilm cells of Salmonella enterica serovar Enteritidis PT4 on stainless steel surface.

PubMed

Giaouris, Efstathios; Samoilis, Georgios; Chorianopoulos, Nikos; Ercolini, Danilo; Nychas, George-John

2013-03-01

In the present study, the proteome of a strain of S. enterica serovar Enteritidis PT4, grown either as biofilm on stainless steel surface or as free-floating (planktonic) in Brain Heart (BH) broth, was investigated in order to detect the strong differences in whole-cell protein expression patterns between the two growth styles. The proteins extracted from both types of cells were subjected to 2-D PAGE, followed by in-gel tryptic digestion, extraction, subsequent MALDI-TOF mass spectrometry (MS) analysis and finally database searches for protein identification. Using this approach, 30 proteins were identified as differentially expressed between the two growth modes on an "on-off" basis, that is, proteins that were detected in one case but not in the other. In particular, 20 and 10 proteins were identified in biofilm and planktonic-grown cells, respectively. The group of proteins whose expression was visible only during biofilm growth included proteins involved in global regulation and stress response (ArcA, BtuE, Dps, OsmY, SspA, TrxA, YbbN and YhbO), nutrient transport (Crr, DppA, Fur and SufC), degradation and energy metabolism (GcvT, GpmA, RibB), detoxification (SseA and YibF), DNA metabolism (SSB), curli production (CsgF), and murein synthesis (MipA). To summarize, this study demonstrates that biofilm growth of S. Enteritidis causes distinct changes in protein expression and offers valuable new data regarding some of the proteins presumably involved in this process. The putative role of these proteins in the maintenance of a biofilm community in Salmonella and other bacteria is discussed. Copyright © 2012 Elsevier B.V. All rights reserved.
Characterization of the Proteome of Theobroma cacao Beans by Nano-UHPLC-ESI MS/MS.

PubMed

Scollo, Emanuele; Neville, David; Oruna-Concha, M Jose; Trotin, Martine; Cramer, Rainer

2018-02-01

Cocoa seed storage proteins play an important role in flavour development as aroma precursors are formed from their degradation during fermentation. Major proteins in the beans of Theobroma cacao are the storage proteins belonging to the vicilin and albumin classes. Although both these classes of proteins have been extensively characterized, there is still limited information on the expression and abundance of other proteins present in cocoa beans. This work is the first attempt to characterize the whole cocoa bean proteome by nano-UHPLC-ESI MS/MS analysis using tryptic digests of cocoa bean protein extracts. The results of this analysis show that >1000 proteins could be identified using a species-specific Theobroma cacao database. The majority of the identified proteins were involved with metabolism and energy. Additionally, a significant number of the identified proteins were linked to protein synthesis and processing. Several proteins were also involved with plant response to stress conditions and defence. Albumin and vicilin storage proteins showed the highest intensity values among all detected proteins, although only seven entries were identified as storage proteins. A comparison of MS/MS data searches carried out against larger non-specific databases confirmed that using a species-specific database can increase the number of identified proteins, and at the same time reduce the number of false positives. The results of this work will be useful in developing tools that can allow the comparison of the proteomic profile of cocoa beans from different genotypes and geographic origins. Data are available via ProteomeXchange with identifier PXD005586. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
[Screening and identification of apolipoprotein A-I as a potential marker for hepatoblastoma in children].

PubMed

Guo, Li-Hua; Zhao, Wei; Zhang, Jun-Jie; Zhang, Qian; Fan, Ying-Zhong; Wang, Jia-Xiang

2016-12-01

To screen and identify serum biomarkers for childhood hepatoblastoma (HB). The serum samples from 30 children with hepatoblastoma (HB), 20 children with systemic inflammatory response syndrome, and 20 normal children were treated with magnetic bead-based weak cation exchange chromatography. The platform of surface-enhanced laser desorption/ionization-time of flight-mass spectrometry (SELDI-TOF-MS) was used to eliminate the interference of inflammatory factors and to screen out the differentially expressed proteins in serum between the tumor group and the normal group. After the purification and separation of target proteins were performed using sodium dodecyl sulfate-polyacrylamide gel electrophoresis, matrix-assisted laser desorption/ionization-time of flight-mass spectrometry was used to determine their amino acid sequences. The SwissProt database was searched for matched proteins. Finally, real-time PCR and ELISA were used to verify and measure the expression of target proteins. After SELDI-TOF-MS was used for screening and elimination of the interference of inflammatory factors, a differentially expressed protein with a mass-to-charge ratio of 9 348 Da was found in serum between the HB group and the normal group, and the HB group had significantly lower expression of this protein than the normal group (p<0.05). This protein was identified as apolipoprotein A-I (Apo A-I). Real-time PCR and ELISA verified the low mRNA and protein expression of Apo A-I in serum in the HB group and high expression in serum in the normal group. Apo A-I can be used as a non-inflammatory protein marker for HB and has a certain value in the early diagnosis of HB.
ProteinInferencer: Confident protein identification and multiple experiment comparison for large scale proteomics projects.

PubMed

Zhang, Yaoyang; Xu, Tao; Shan, Bing; Hart, Jonathan; Aslanian, Aaron; Han, Xuemei; Zong, Nobel; Li, Haomin; Choi, Howard; Wang, Dong; Acharya, Lipi; Du, Lisa; Vogt, Peter K; Ping, Peipei; Yates, John R

2015-11-03

Shotgun proteomics generates valuable information from large-scale and target protein characterizations, including protein expression, protein quantification, protein post-translational modifications (PTMs), protein localization, and protein-protein interactions. Typically, peptides derived from proteolytic digestion, rather than intact proteins, are analyzed by mass spectrometers because peptides are more readily separated, ionized and fragmented. The amino acid sequences of peptides can be interpreted by matching the observed tandem mass spectra to theoretical spectra derived from a protein sequence database. Identified peptides serve as surrogates for their proteins and are often used to establish what proteins were present in the original mixture and to quantify protein abundance. Two major issues exist for assigning peptides to their originating protein. The first issue is maintaining a desired false discovery rate (FDR) when comparing or combining multiple large datasets generated by shotgun analysis, and the second issue is properly assigning peptides to proteins when homologous proteins are present in the database. Herein we demonstrate a new computational tool, ProteinInferencer, which can be used for protein inference with either small- or large-scale data sets to produce a well-controlled protein FDR. In addition, ProteinInferencer introduces confidence scoring for individual proteins, which makes protein identifications evaluable. This article is part of a Special Issue entitled: Computational Proteomics. Copyright © 2015. Published by Elsevier B.V.
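To make the idea of holding a protein list to a desired FDR concrete, here is a minimal sketch of the widely used target-decoy estimate. The abstract does not say how ProteinInferencer computes its FDR, so this is an assumption about the general technique rather than a description of that tool; the function name and the input layout are hypothetical.

```python
def threshold_at_fdr(scored, max_fdr):
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    scored  : list of (score, is_decoy) pairs, higher score = better match.
    max_fdr : allowed ratio of decoy hits to target hits above the cutoff.
    Returns None if no cutoff satisfies the limit. Tied scores are not
    treated specially in this sketch.
    """
    ranked = sorted(scored, key=lambda item: item[0], reverse=True)
    targets = 0
    decoys = 0
    best_cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among everything accepted so far: decoys / targets.
        if targets > 0 and decoys / targets <= max_fdr:
            best_cutoff = score
    return best_cutoff
```

The estimate is recomputed as the list grows, which is why combining several large data sets can let decoy hits accumulate and push the cutoff upward.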
Kangaroo – A pattern-matching program for biological sequences

PubMed Central

2002-01-01

Background: Biologists are often interested in performing a simple database search to identify proteins or genes that contain a well-defined sequence pattern. Many databases do not provide straightforward or readily available query tools to perform simple searches, such as identifying transcription binding sites, protein motifs, or repetitive DNA sequences. However, in many cases simple pattern-matching searches can reveal a wealth of information. We present in this paper a regular expression pattern-matching tool that was used to identify short repetitive DNA sequences in human coding regions for the purpose of identifying potential mutation sites in mismatch repair deficient cells. Results: Kangaroo is a web-based regular expression pattern-matching program that can search for patterns in DNA, protein, or coding region sequences in ten different organisms. The program is implemented to facilitate a wide range of queries with no restriction on the length or complexity of the query expression. The program is accessible on the web at http://bioinfo.mshri.on.ca/kangaroo/ and the source code is freely distributed at http://sourceforge.net/projects/slritools/. Conclusion: A low-level simple pattern-matching application can prove to be a useful tool in many research settings. For example, Kangaroo was used to identify potential genetic targets in a human colorectal cancer variant that is characterized by a high frequency of mutations in coding regions containing mononucleotide repeats. PMID:12150718
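As an illustration of the kind of query described here, the sketch below uses a regular expression with a back-reference to find mononucleotide repeats in a DNA string. It relies on Python's standard `re` module and is not Kangaroo's own code or query syntax; the function name and the default minimum length of 8 are assumptions chosen for the example.

```python
import re


def find_mononucleotide_repeats(seq, min_len=8):
    """Return (start, base, length) for each run of one base >= min_len.

    The pattern captures a single base and then requires at least
    (min_len - 1) further copies of that same base via the \\1 back-reference.
    """
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [
        (match.start(), match.group(1), len(match.group(0)))
        for match in pattern.finditer(seq.upper())
    ]
```

For example, calling `find_mononucleotide_repeats(coding_sequence, min_len=8)` on a coding sequence lists candidate repeat tracts by position, which could then be checked against known mutation sites.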
Aureolib — A Proteome Signature Library: Towards an Understanding of Staphylococcus aureus Pathophysiology

PubMed Central

Pané-Farré, Jan; Kusch, Harald; Wolf, Carmen; Reiß, Swantje; Binh, Le Thi Nguyen; Albrecht, Dirk; Riedel, Katharina; Hecker, Michael; Engelmann, Susanne

2013-01-01

Gel-based proteomics is a powerful approach to study the physiology of Staphylococcus aureus under various growth-restricting conditions. We analyzed 679 protein spots from a reference 2-dimensional gel of cytosolic proteins of S. aureus COL by mass spectrometry, resulting in 521 different proteins. 4,692 time-dependent protein synthesis profiles were generated by exposing S. aureus to nine infection-related stress and starvation stimuli (H2O2, diamide, paraquat, NO, fermentation, nitrate respiration, heat shock, puromycin, mupirocin). These expression profiles are stored in an online resource called Aureolib (http://www.aureolib.de). Moreover, information on target genes of 75 regulators and regulatory elements was included in the database. Cross-comparisons of this extensive data collection of protein synthesis profiles using the tools implemented in Aureolib led to the identification of stress- and starvation-specific marker proteins. Altogether, 226 protein synthesis profiles showed induction ratios of 2.5-fold or higher under at least one of the tested conditions, with 157 protein synthesis profiles specifically induced in response to a single stimulus. The respective proteins might serve as marker proteins for the corresponding stimulus. By contrast, proteins whose synthesis was increased or repressed in response to more than four stimuli are rather exceptional. The only protein that was induced by six stimuli is the universal stress protein SACOL1759. Most strikingly, cluster analyses of synthesis profiles of proteins differentially synthesized under at least one condition revealed only in rare cases a grouping that correlated with known regulon structures. The most prominent examples are the GapR, Rex, and CtsR regulons. In contrast, protein synthesis profiles of proteins belonging to the CodY and σB regulons are widely distributed. In summary, Aureolib is by far the most comprehensive protein expression database for S. aureus and provides an essential tool to decipher more complex adaptation processes in S. aureus during host-pathogen interaction. PMID:23967085
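The selection of stimulus-specific marker proteins described in this abstract (induction ratio of 2.5-fold or higher under exactly one stimulus) can be sketched as a simple filter. This is only an illustration of that criterion: the data layout, function name, and protein labels are hypothetical, and it is not the tooling implemented in Aureolib.

```python
def single_stimulus_markers(ratios, threshold=2.5):
    """Map each protein induced under exactly one stimulus to that stimulus.

    ratios    : dict of protein -> dict of stimulus -> induction ratio.
    threshold : minimum fold induction to count a protein as induced.
    """
    markers = {}
    for protein, by_stimulus in ratios.items():
        induced = [s for s, ratio in by_stimulus.items() if ratio >= threshold]
        # Keep only proteins that respond to a single stimulus.
        if len(induced) == 1:
            markers[protein] = induced[0]
    return markers
```

Proteins induced under several stimuli are dropped by the length check, which mirrors the distinction the abstract draws between the 226 induced profiles and the 157 single-stimulus ones.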
RICD: a rice indica cDNA database resource for rice functional genomics.

PubMed

Lu, Tingting; Huang, Xuehui; Zhu, Chuanrang; Huang, Tao; Zhao, Qiang; Xie, Kabing; Xiong, Lizhong; Zhang, Qifa; Han, Bin

2008-11-26

The Oryza sativa L. indica subspecies is the most widely cultivated rice. During the last few years, we have collected over 20,000 putative full-length cDNAs and over 40,000 ESTs isolated from various cDNA libraries of two indica varieties, Guangluai 4 and Minghui 63. A database of the rice indica cDNAs was therefore built to provide a comprehensive web data source for searching and retrieving the indica cDNA clones. Rice Indica cDNA Database (RICD) is an online MySQL-PHP driven database with a user-friendly web interface. It allows investigators to query the cDNA clones by keyword, genome position, nucleotide or protein sequence, and putative function. For each cDNA, it also provides a range of information, including sequences, protein domain annotations, similarity search results, SNP and InDel information, and hyperlinks to the gene annotation in both The Rice Annotation Project Database (RAP-DB) and The TIGR Rice Genome Annotation Resource, the expression atlas in RiceGE, and the variation report in Gramene. The online rice indica cDNA database provides a cDNA resource with comprehensive information to researchers for functional analysis of the indica subspecies and for comparative genomics. RICD is available through our website http://www.ncgr.ac.cn/ricd.
Expression of psoriasis-associated fatty acid-binding protein in senescent human dermal microvascular endothelial cells.

PubMed

Ha, Moon Kyung; Chung, Kee Yang; Lee, Ju Hee; Bang, Dongsik; Park, Yoon Kee; Lee, Kwang Hoon

2004-09-01

Aging is associated with the progressive pathophysiologic modification of endothelial cells. In vitro endothelial cell senescence is accompanied by proliferative activity failure and by perturbations in gene and protein expression. Moreover, this cellular senescence in culture has been proposed to reflect processes that occur in aging organisms. In order to observe the changing patterns of protein expression in senescent human dermal microvascular endothelial cells (HDMECs), proteins obtained from both early- and late-passaged HDMECs were separated by two-dimensional electrophoresis, visualized by silver staining, and quantified by image processing. Proteins of interest were extracted by in-gel digestion with trypsin and identified by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS) and by searching the National Center for Biotechnology Information protein-sequence database. More than 2000 spots were detected by 2D electrophoresis within a linear pH range of 3-10. Twenty-two major differentially expressed spots were observed in serially passaged HDMECs and identified with high confidence by MALDI-TOF-MS. One of these spots was found to be a 14-15 kDa psoriasis-associated fatty acid-binding protein (PA-FABP) with high affinity for long-chain fatty acids. The expression of PA-FABP was confirmed to be elevated in senescent HDMECs (passage 20) by fluorescence-activated cell sorting (FACS), confocal laser microscopy, and by immunohistochemistry in aged human skin tissue. Our results suggest that the overexpression of FABP in cultured senescent HDMECs is closely related to skin aging.
De novo sequencing and analysis of the transcriptome during the browning of fresh-cut Luffa cylindrica 'Fusi-3' fruits

PubMed Central

Chen, Mindong; Wang, Bin; Zhang, Qianrong; Xue, Zhuzheng

2017-01-01

Fresh-cut luffa (Luffa cylindrica) fruits commonly undergo browning. However, little is known about the molecular mechanisms regulating this process. We used the RNA-seq technique to analyze the transcriptomic changes occurring during the browning of fresh-cut fruits from luffa cultivar ‘Fusi-3’. Over 90 million high-quality reads were assembled into 58,073 Unigenes, and 60.86% of these were annotated based on sequences in four public databases. We detected 35,282 Unigenes with significant hits to sequences in the NCBInr database, and 24,427 Unigenes encoded proteins with sequences that were similar to those of known proteins in the Swiss-Prot database. Additionally, 20,546 and 13,021 Unigenes were similar to existing sequences in the Eukaryotic Orthologous Groups of proteins and Kyoto Encyclopedia of Genes and Genomes databases, respectively. Furthermore, 27,301 Unigenes were differentially expressed during the browning of fresh-cut luffa fruits (i.e., after 1–6 h). Moreover, 11 genes from five gene families (i.e., PPO, PAL, POD, CAT, and SOD) identified as potentially associated with enzymatic browning as well as four WRKY transcription factors were observed to be differentially regulated in fresh-cut luffa fruits. With the assistance of rapid amplification of cDNA ends technology, we obtained the full-length sequences of the 15 Unigenes. We also confirmed these Unigenes were expressed by quantitative real-time polymerase chain reaction analysis. This study provides a comprehensive transcriptome sequence resource, and may facilitate further studies aimed at identifying genes affecting luffa fruit browning for the exploitation of the underlying mechanism. PMID:29145430
The Web-Based DNA Vaccine Database DNAVaxDB and Its Usage for Rational DNA Vaccine Design.

PubMed

Racz, Rebecca; He, Yongqun

2016-01-01

A DNA vaccine is a vaccine that uses a mammalian expression vector to express one or more protein antigens and is administered in vivo to induce an adaptive immune response. Since the 1990s, a significant amount of research has been performed on DNA vaccines and the mechanisms behind them. To meet the needs of the DNA vaccine research community, we created DNAVaxDB (http://www.violinet.org/dnavaxdb), the first Web-based database and analysis resource of experimentally verified DNA vaccines. All the data in DNAVaxDB, which include plasmids, antigens, vaccines, and sources, are manually curated and experimentally verified. This chapter describes the DNAVaxDB system in detail and shows how the DNA vaccine database, combined with the Vaxign vaccine design tool, can be used for rational design of a DNA vaccine against a pathogen, such as Mycobacterium bovis.
Microarray analysis of genes associated with cell surface NIS protein levels in breast cancer.

PubMed

Beyer, Sasha J; Zhang, Xiaoli; Jimenez, Rafael E; Lee, Mei-Ling T; Richardson, Andrea L; Huang, Kun; Jhiang, Sissy M

2011-10-11

Na+/I- symporter (NIS)-mediated iodide uptake allows radioiodine therapy for thyroid cancer. NIS is also expressed in breast tumors, raising potential for radionuclide therapy of breast cancer. However, NIS expression in most breast cancers is low and may not be sufficient for radionuclide therapy. We aimed to identify biomarkers associated with NIS expression such that mechanisms underlying NIS modulation in human breast tumors may be elucidated. Published oligonucleotide microarray data within the National Center for Biotechnology Information Gene Expression Omnibus database were analyzed to identify gene expression tightly correlated with NIS mRNA level among human breast tumors. NIS immunostaining was performed in a tissue microarray composed of 28 human breast tumors which had corresponding oligonucleotide microarray data available for each tumor such that gene expression associated with cell surface NIS protein level could be identified. NIS mRNA levels do not vary among breast tumors or when compared to normal breast tissues when detected by Affymetrix oligonucleotide microarray platforms. Cell surface NIS protein levels are much more variable than their corresponding NIS mRNA levels. Despite a limited number of breast tumors examined, our analysis identified cysteinyl-tRNA synthetase as a biomarker that is highly associated with cell surface NIS protein levels in the ER-positive breast cancer subtype. Further investigation of genes associated with cell surface NIS protein levels within each breast cancer molecular subtype may lead to novel targets for selectively increasing NIS expression/function in a subset of breast cancer patients.
Clinical value of miR-182-5p in lung squamous cell carcinoma: a study combining data from TCGA, GEO, and RT-qPCR validation.

PubMed

Luo, Jie; Shi, Ke; Yin, Shu-Ya; Tang, Rui-Xue; Chen, Wen-Jie; Huang, Lin-Zhen; Gan, Ting-Qing; Cai, Zheng-Wen; Chen, Gang

2018-04-10

MiR-182-5p, a member of the miRNA family, can be detected in lung cancer and plays an important role in it. This study aimed to explore the clinical value of miR-182-5p in lung squamous cell carcinoma (LUSC) and to unveil the molecular mechanism of LUSC. The clinical value of miR-182-5p in LUSC was investigated by collecting and calculating data from The Cancer Genome Atlas (TCGA) database, the Gene Expression Omnibus (GEO) database, and real-time quantitative polymerase chain reaction (RT-qPCR). Twelve prediction platforms were used to predict the target genes of miR-182-5p. Protein-protein interaction (PPI) networks, gene ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were used to explore the molecular mechanism of LUSC. The expression of miR-182-5p was significantly higher in LUSC than in non-cancerous tissues, as evidenced by various approaches, including the TCGA database, GEO microarrays, RT-qPCR, and a comprehensive meta-analysis of 501 LUSC cases and 148 non-cancerous cases. Furthermore, a total of 81 potential target genes were chosen from the union of predicted genes and the TCGA database. GO and KEGG analyses demonstrated that the target genes are involved in pathways related to biological processes. PPIs revealed the relationships between these genes, with EPAS1, PRKCE, NR3C1, and RHOB being located in the center of the PPI network. MiR-182-5p upregulation greatly contributes to LUSC and may serve as a biomarker in LUSC.
CellFinder: a cell data repository

PubMed Central

Stachelscheid, Harald; Seltmann, Stefanie; Lekschas, Fritz; Fontaine, Jean-Fred; Mah, Nancy; Neves, Mariana; Andrade-Navarro, Miguel A.; Leser, Ulf; Kurtz, Andreas

2014-01-01

CellFinder (http://www.cellfinder.org) is a comprehensive one-stop resource for molecular data characterizing mammalian cells in different tissues and in different development stages. It is built from carefully selected data sets stemming from other curated databases and the biomedical literature. To date, CellFinder describes 3394 cell types and 50 951 cell lines. The database currently contains 3055 microscopic and anatomical images, 205 whole-genome expression profiles of 194 cell/tissue types from RNA-seq and microarrays and 553 905 protein expressions for 535 cells/tissues. Text mining of a corpus of >2000 publications followed by manual curation confirmed expression information on ∼900 proteins and genes. CellFinder’s data model can seamlessly represent entities from single cells to the organ level, incorporate mappings between homologous entities in different species and describe processes of cell development and differentiation. Its ontological backbone currently consists of 204 741 ontology terms incorporated from 10 different ontologies unified under the novel CELDA ontology. CellFinder’s web portal allows searching, browsing and comparing the stored data, interactive construction of developmental trees and navigating the partonomic hierarchy of cells and tissues through a unique body browser designed for life scientists and clinicians. PMID:24304896
Proteomic analysis by iTRAQ in red claw crayfish, Cherax quadricarinatus, hematopoietic tissue cells post white spot syndrome virus infection.

PubMed

Jeswin, Joseph; Xie, Xiao-lu; Ji, Qiao-lin; Wang, Ke-jian; Liu, Hai-peng

2016-03-01

To elucidate proteomic changes of Hpt cells from red claw crayfish, Cherax quadricarinatus, we carried out isobaric tags for relative and absolute quantitation (iTRAQ) analysis of cellular proteins at both early (1 hpi) and late (12 hpi) stages post white spot syndrome virus (WSSV) infection. Protein database search revealed 594 protein hits by Mascot, in which 17 and 30 proteins were present as differentially expressed proteins at early and late viral infection, respectively. Generally, these differentially expressed proteins include: 1) the metabolic process related proteins in glycolysis and glucogenesis, DNA replication, nucleotide/amino acid/fatty acid metabolism and protein biosynthesis; 2) the signal transduction related proteins like small GTPases, G-protein-alpha stimulatory subunit, proteins bearing PDZ- or 14-3-3-domains that help hold together and organize signaling complexes, casein kinase I and proteins of the MAP-kinase signal transduction pathway; 3) the immune defense related proteins such as α-2 macroglobulin, transglutaminase and trans-activation response RNA-binding protein 1. Taken together, this protein information sheds new light on the host cellular response against WSSV infection in a crustacean cell culture. Copyright © 2016 Elsevier Ltd. All rights reserved.

Expression of proteins in serum, synovial fluid, synovial membrane, and articular cartilage samples obtained from dogs with stifle joint osteoarthritis secondary to cranial cruciate ligament disease and dogs without stifle joint arthritis.

PubMed

Garner, Bridget C; Kuroki, Keiichi; Stoker, Aaron M; Cook, Cristi R; Cook, James L

2013-03-01

To identify proteins with differential expression between healthy dogs and dogs with stifle joint osteoarthritis secondary to cranial cruciate ligament (CCL) disease. Serum and synovial fluid samples obtained from dogs with stifle joint osteoarthritis before (n = 10) and after (8) surgery and control dogs without osteoarthritis (9) and archived synovial membrane and articular cartilage samples obtained from dogs with stifle joint osteoarthritis (5) and dogs without arthritis (5). Serum and synovial fluid samples were analyzed via liquid chromatography-tandem mass spectrometry; results were compared against a nonredundant protein database. Expression of complement component 3 in archived tissue samples was determined via immunohistochemical methods. No proteins had significantly different expression between serum samples of control dogs versus those of dogs with stifle joint osteoarthritis. Eleven proteins (complement component 3 precursor, complement factor I precursor, apolipoprotein B-100 precursor, serum paraoxonase and arylesterase 1, zinc-alpha-2-glycoprotein precursor, serum amyloid A, transthyretin precursor, retinol-binding protein 4 precursor, alpha-2-macroglobulin precursor, angiotensinogen precursor, and fibronectin 1 isoform 1 preproprotein) had significantly different expression (> 2.0-fold) between synovial fluid samples obtained before surgery from dogs with stifle joint osteoarthritis versus those obtained from control dogs. Complement component 3 was strongly expressed in all (5/5) synovial membrane samples of dogs with stifle joint osteoarthritis and weakly expressed in 3 of 5 synovial membrane samples of dogs without stifle joint arthritis. Findings suggested that the complement system and proteins involved in lipid and cholesterol metabolism may have a role in stifle joint osteoarthritis, CCL disease, or both.
Gene Expression Differences in Infected and Noninfected Middle Ear Complementary DNA Libraries

PubMed Central

Kerschner, Joseph E.; Horsey, Edward; Ahmed, Azad; Erbe, Christy; Khampang, Pawjai; Cioffi, Joseph; Hu, Fen Ze; Post, James Christopher; Ehrlich, Garth D.

2010-01-01

Objectives: To investigate genetic differences in middle ear mucosa (MEM) with nontypeable Haemophilus influenzae (NTHi) infection. Genetic upregulation and downregulation occur in MEM during otitis media (OM) pathogenesis. A comprehensive assessment of these genetic differences using the techniques of complementary DNA (cDNA) library creation has not been performed. Design: The cDNA libraries were constructed from NTHi-infected and noninfected chinchilla MEM. Random clones were picked, sequenced bidirectionally, and submitted to the National Center for Biotechnology Information (NCBI) Expressed Sequence Tags database, where they were assigned accession numbers. These numbers were used with the basic local alignment search tool (BLAST) to align clones against the nonredundant nucleotide database at NCBI. Results: Analysis with the Web-based statistical program FatiGO identified several biological processes with significant differences in numbers of represented genes. Processes involved in immune, stress, and wound responses were more prevalent in the NTHi-infected library. S100 calcium-binding protein A9 (S100A9); secretory leukoprotease inhibitor (SLPI); β2-microglobulin (B2M); ferritin, heavy-chain polypeptide 1 (FTH1); and S100 calcium-binding protein A8 (S100A8) were expressed at significantly higher levels in the NTHi-infected library. Calcium-binding proteins S100A9 and S100A8 serve as markers for inflammation and have antibacterial effects. Secretory leukoprotease inhibitor is an antibacterial protein that inhibits stimuli-induced MUC1, MUC2, and MUC5AC production. Conclusions: A number of genes demonstrate changes during the pathogenesis of OM, including SLPI, which has an impact on mucin gene expression; this expression is known to be an important regulator in OM. The techniques described herein provide a framework for future investigations to more thoroughly understand molecular changes in the middle ear, which will likely be important in developing new therapeutic and intervention strategies. PMID:19153305
Metabolic, Genomic, and Biochemical Analyses of Glandular Trichomes from the Wild Tomato Species Lycopersicon hirsutum Identify a Key Enzyme in the Biosynthesis of Methylketones

PubMed Central

Fridman, Eyal; Wang, Jihong; Iijima, Yoko; Froehlich, John E.; Gang, David R.; Ohlrogge, John; Pichersky, Eran

2005-01-01

Medium-length methylketones (C7-C15) are highly effective in protecting plants from numerous pests. We used a biochemical genomics approach to elucidate the pathway leading to synthesis of methylketones in the glandular trichomes of the wild tomato Lycopersicon hirsutum f glabratum (accession PI126449). A comparison of gland EST databases from accession PI126449 and a second L. hirsutum accession, LA1777, whose glands do not contain methylketones, showed that the expression of genes for fatty acid biosynthesis is elevated in PI126449 glands, suggesting de novo biosynthesis of methylketones. A cDNA abundant in the PI126449 gland EST database but rare in the LA1777 database was similar in sequence to plant esterases. This cDNA, designated Methylketone Synthase 1 (MKS1), was expressed in Escherichia coli and the purified protein used to catalyze in vitro reactions in which C12, C14, and C16 β-ketoacyl–acyl-carrier-proteins (intermediates in fatty acid biosynthesis) were hydrolyzed and decarboxylated to give C11, C13, and C15 methylketones, respectively. Although MKS1 does not contain a classical transit peptide, in vitro import assays showed that it was targeted to the stroma of plastids, where fatty acid biosynthesis occurs. Levels of MKS1 transcript, protein, and enzymatic activity were correlated with levels of methylketones and gland density in a variety of tomato accessions and in different plant organs. PMID:15772286
PSI:Biology-Materials Repository: A Biologist’s Resource for Protein Expression Plasmids

PubMed Central

Cormier, Catherine Y.; Park, Jin G.; Fiacco, Michael; Steel, Jason; Hunter, Preston; Kramer, Jason; Singla, Rajeev; LaBaer, Joshua

2011-01-01

The Protein Structure Initiative:Biology-Materials Repository (PSI:Biology-MR; MR; http://psimr.asu.edu) sequence-verifies, annotates, stores, and distributes the protein expression plasmids and vectors created by the Protein Structure Initiative (PSI). The MR has developed an informatics and sample processing pipeline that manages this process for thousands of samples per month from nearly a dozen PSI centers. DNASU (http://dnasu.asu.edu), a freely searchable database, stores the plasmid annotations, which include the full-length sequence, vector information, and associated publications for over 130,000 plasmids created by our laboratory, by the PSI and other consortia, and by individual laboratories for distribution to researchers worldwide. Each plasmid links to external resources, including the PSI Structural Biology Knowledgebase (http://sbkb.org), which facilitates cross-referencing of a particular plasmid to additional protein annotations and experimental data. To expedite and simplify plasmid requests, the MR uses an expedited material transfer agreement (EP-MTA) network, where researchers from network institutions can order and receive PSI plasmids without institutional delays. Currently over 39,000 protein expression plasmids and 78 empty vectors from the PSI are available upon request from DNASU. Overall, the MR’s repository of expression-ready plasmids, its automated pipeline, and the rapid process for receiving and distributing these plasmids more effectively allows the research community to dissect the biological function of proteins whose structures have been studied by the PSI. PMID:21360289
Cassava root membrane proteome reveals activities during storage root maturation.

PubMed

Naconsie, Maliwan; Lertpanyasampatha, Manassawe; Viboonjun, Unchera; Netrphan, Supatcharee; Kuwano, Masayoshi; Ogasawara, Naotake; Narangajavana, Jarunya

2016-01-01

Cassava (Manihot esculenta Crantz) is one of the most important crops of Thailand. Its storage roots are used for food, feed, and starch production, and are an important source for biofuel and biodegradable plastic production. Despite the importance of cassava storage roots, little is known about the mechanisms involved in their formation. The present study compared the expression profiles of the cassava root proteome at various developmental stages using two-dimensional gel electrophoresis and LC-MS/MS. Based on an anatomical study using Toluidine Blue, secondary growth was confirmed to be essential during the development of the cassava storage root. To investigate biochemical processes occurring during storage root maturation, soluble and membrane proteins were isolated from storage roots harvested from 3-, 6-, 9-, and 12-month-old cassava plants. The proteins with differential expression patterns were analysed and found to be associated with 8 functional groups: protein folding and degradation, energy, metabolism, secondary metabolism, stress response, transport facilitation, cytoskeleton, and unclassified function. The expression profiling of membrane proteins revealed that proteins involved in protein folding and degradation, energy, and cell structure were highly expressed during early stages of development. Integration of these data with the information available in genome and transcriptome databases is critical to expand knowledge obtained solely from the field of proteomics. Possible roles of the identified proteins are discussed in relation to the activities during storage root maturation in cassava.
Proteomic analysis of strawberry leaves infected with Colletotrichum fragariae.

PubMed

Fang, Xianping; Chen, Wenyue; Xin, Ya; Zhang, Hengmu; Yan, Chengqi; Yu, Hong; Liu, Hui; Xiao, Wenfei; Wang, Shuzhen; Zheng, Guizhen; Liu, Hongbo; Jin, Liang; Ma, Huasheng; Ruan, Songlin

2012-07-16

Understanding the defense mechanisms used by anthracnose-resistant strawberries against Colletotrichum infection is important for breeding purposes. To characterize cell responses to Colletotrichum infection, proteomes from strawberry seedling leaves that had or had not been infected with Colletotrichum fragariae were characterized at different time points post infection by 2-DE and by MALDI-TOF/TOF MS/MS and database-searching protein identification. Mass spectrometry identified 49 differentially expressed proteins with significant intensity differences (>1.5-fold, p<0.05) in mock- and C. fragariae-infected leaves at least at one time point. Notably, 2-DE analysis revealed that C. fragariae infection increased the expression of well-known and novel pathogen-responsive proteins whose expression patterns tended to correlate with physiological changes in the leaves. Quantitative real-time PCR was used to examine the transcriptional profiles of infected and uninfected strawberry leaves, and western blotting confirmed the induction of β-1,3-glucanase and a low-molecular-weight heat shock protein in response to C. fragariae infection. During the late phase of infection, proteins involved in the Calvin cycle and glycolysis pathway had suppressed expression. The abundance changes, putative functions, and participation in physiological reactions for the identified proteins produce a pathogen-responsive protein network in C. fragariae-infected strawberry leaves. Together, these findings increase our knowledge of pathogen resistance mechanisms, especially those found in non-model plant species. Copyright © 2012 Elsevier B.V. All rights reserved.
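The selection rule quoted in the abstract above (>1.5-fold, p<0.05) can be illustrated with a short sketch. The spot names and values below are invented placeholders, not data from the study, and the symmetric treatment of up- and down-regulation is an assumption about how the threshold was applied.

```python
def is_differential(fold_change, p_value, min_fold=1.5, alpha=0.05):
    """Return True if a spot passes a fold-change and p-value cutoff.

    fold_change is the infected/mock intensity ratio. Treating a ratio
    below 1 as down-regulation of magnitude 1/ratio is an assumption.
    """
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    magnitude = max(fold_change, 1.0 / fold_change)
    return magnitude > min_fold and p_value < alpha


# Hypothetical (name, fold change, p-value) records for illustration only.
spots = [
    ("spot_01", 2.10, 0.010),  # up-regulated and significant
    ("spot_02", 0.50, 0.030),  # down-regulated 2-fold and significant
    ("spot_03", 1.20, 0.001),  # significant, but the change is too small
    ("spot_04", 3.00, 0.200),  # large change, but not significant
]

selected = [name for name, fc, p in spots if is_differential(fc, p)]
print(selected)
```

Only the first two placeholder spots pass both cutoffs; a real analysis would also need the "at least at one time point" condition described in the abstract.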
Intrinsic and extrinsic approaches for detecting genes in a bacterial genome.

PubMed Central

Borodovsky, M; Rudd, K E; Koonin, E V

1994-01-01

The unannotated regions of the Escherichia coli genome DNA sequence from the EcoSeq6 database, totaling 1,278 'intergenic' sequences of the combined length of 359,279 basepairs, were analyzed using computer-assisted methods with the aim of identifying putative unknown genes. The proposed strategy for finding new genes includes two key elements: i) prediction of expressed open reading frames (ORFs) using the GeneMark method based on Markov chain models for coding and non-coding regions of Escherichia coli DNA, and ii) search for protein sequence similarities using programs based on the BLAST algorithm and programs for motif identification. A total of 354 putative expressed ORFs were predicted by GeneMark. Using the BLASTX and TBLASTN programs, it was shown that 208 ORFs located in the unannotated regions of the E. coli chromosome are significantly similar to other protein sequences. Identification of 182 ORFs as probable genes was supported by GeneMark and BLAST, comprising 51.4% of the GeneMark 'hits' and 87.5% of the BLAST 'hits'. 73 putative new genes, comprising 20.6% of the GeneMark predictions, belong to ancient conserved protein families that include both eubacterial and eukaryotic members. This value is close to the overall proportion of highly conserved sequences among eubacterial proteins, indicating that the majority of the putative expressed ORFs that are predicted by GeneMark, but have no significant BLAST hits, nevertheless are likely to be real genes. The majority of the putative genes identified by BLAST search have been described since the release of the EcoSeq6 database, but about 70 genes have not been detected so far. Among these new identifications are genes encoding proteins with a variety of predicted functions including dehydrogenases, kinases, several other metabolic enzymes, ATPases, rRNA methyltransferases, membrane proteins, and different types of regulatory proteins. PMID:7984428
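The percentages quoted in the abstract above follow directly from the reported counts. A minimal arithmetic check, using only the numbers given in the abstract (the variable names are illustrative):

```python
def percent(part, whole):
    """Share of `part` in `whole`, as a percentage rounded to one decimal."""
    return round(100.0 * part / whole, 1)


genemark_orfs = 354   # putative expressed ORFs predicted by GeneMark
blast_orfs = 208      # ORFs with significant BLAST similarity
both = 182            # ORFs supported by both GeneMark and BLAST
conserved = 73        # ORFs in ancient conserved protein families

shares = {
    "both_of_genemark": percent(both, genemark_orfs),
    "both_of_blast": percent(both, blast_orfs),
    "conserved_of_genemark": percent(conserved, genemark_orfs),
}
print(shares)

# ORFs supported by only one of the two methods.
genemark_only = genemark_orfs - both
blast_only = blast_orfs - both
print(genemark_only, blast_only)
```

The three shares reproduce the 51.4%, 87.5%, and 20.6% figures, and the subtraction shows that 172 GeneMark predictions lack a significant BLAST hit.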
Investigation of Glandular Trichome Proteins in Artemisia annua L. Using Comparative Proteomics

PubMed Central

Wu, Ting; Wang, Yejun; Guo, Dianjing

2012-01-01

Glandular secreting trichomes (GSTs) are called biofactories because they are active in synthesizing, storing and secreting various types of plant secondary metabolites. Artemisinin, a sesquiterpene lactone and the most effective drug against malaria, is derived from GSTs of Artemisia annua. However, low artemisinin content (0.001%∼1.54% of dry weight) has hindered its wide application. We investigate the GST-expressed proteins in Artemisia annua using a comparative proteomics approach, aiming for a better understanding of the trichome proteome and artemisinin metabolism. 2D-electrophoresis was employed to compare the protein profiles of GSTs and leaves. More than 700 spots were resolved for GSTs, of which ∼93 non-redundant proteins were confidently identified by searching NCBI and Artemisia EST databases. Over 70% of these proteins were highly expressed in GSTs. Functional classification of these GST-enriched proteins revealed that many of them participate in major plant metabolic processes such as electron transport, transcription and translation. PMID:22905110
SPIRE: Systematic protein investigative research environment.

PubMed

Kolker, Eugene; Higdon, Roger; Morgan, Phil; Sedensky, Margaret; Welch, Dean; Bauman, Andrew; Stewart, Elizabeth; Haynes, Winston; Broomall, William; Kolker, Natali

2011-12-10

The SPIRE (Systematic Protein Investigative Research Environment) provides web-based experiment-specific mass spectrometry (MS) proteomics analysis (https://www.proteinspire.org). Its emphasis is on usability and integration of the best analytic tools. SPIRE provides an easy-to-use web interface and generates results in both interactive and simple data formats. In contrast to run-based approaches, SPIRE conducts the analysis based on the experimental design. It employs novel methods to generate false discovery rates and local false discovery rates (FDR, LFDR) and integrates the best and complementary open-source search and data analysis methods. The SPIRE approach of integrating X!Tandem, OMSSA and SpectraST can produce an increase in protein IDs (52-88%) over current combinations of scoring and single search engines while also providing accurate multi-faceted error estimation. One of SPIRE's primary assets is combining the results with data on protein function, pathways and protein expression from model organisms. We demonstrate some of SPIRE's capabilities by analyzing mitochondrial proteins from the wild type and 3 mutants of C. elegans. SPIRE also connects results to publicly available proteomics data through its Model Organism Protein Expression Database (MOPED). SPIRE can also provide analysis and annotation for user-supplied protein ID and expression data. Copyright © 2011. Published by Elsevier B.V.
Characterization and comparison of proteomes of albino sea cucumber Apostichopus japonicus (Selenka) by iTRAQ analysis.

PubMed

Xia, Chang-Ge; Zhang, Dijun; Ma, Chengnv; Zhou, Jun; He, Shan; Su, Xiu-Rong

2016-04-01

Sea cucumber is a commercially important marine organism in China. Of the different colored varieties sold in China, albino sea cucumber has the greatest appeal among consumers. Identification of factors contributing to albinism in sea cucumber is therefore likely to provide a scientific basis for improving the cultivability of these strains. In this study, two-dimensional liquid chromatography-tandem mass spectrometry coupled with isobaric tags for relative and absolute quantification labeling was used for the first time to quantitatively define the proteome of sea cucumbers and reveal proteomic characteristics unique to albino sea cucumbers. A total of 549 proteins were identified and quantified in albino sea cucumber, and functional annotations for 485 proteins were obtained from the COG database. Compared with green sea cucumber, 12 proteins were identified as differentially expressed in the intestine and 16 proteins in the body wall of albino sea cucumber. Among them, 5 proteins were up-regulated in the intestine and 8 proteins were down-regulated in the body wall. Gene ontology annotations of these differentially expressed proteins consisted mostly of 'biological process'. The large number of differentially expressed proteins identified here should be highly useful in further elucidating the mechanisms underlying albinism in sea cucumber. Copyright © 2015 Elsevier Ltd. All rights reserved.
Role of PELP1 in EGFR-ER Signaling Crosstalk in Ovarian Cancer Cells

DTIC Science & Technology

2009-04-01

expression of genes involved in metastasis using a focused microarray approach. We have used Human Tumor Metastasis Microarray (Oligo GE array from...ovarian cancer progression. Analysis of human genome databases and SAGE data suggested deregulation of PELP1 expression in ovarian cancer cells...PI3K, and STAT3 in the cytosol. PELP1/MNAR regulates meiosis via its interactions with heterotimeric Gbc protein, androgen receptor (AR), and by
Key genes and pathways in measles and their interaction with environmental chemicals

PubMed Central

Zhang, Rongqiang; Jiang, Hualin; Li, Fengying; Su, Ning; Ding, Yi; Mao, Xiang; Ren, Dan; Wang, Jing

2018-01-01

The aim of the present study was to explore key genes that may have a role in the pathology of measles virus infection and to clarify the interaction networks between environmental factors and differentially expressed genes (DEGs). After screening the database of the Gene Expression Omnibus of the National Center for Biotechnology Information, the dataset GSE5808 was downloaded and analyzed. A global normalization method was performed to minimize data inconsistencies and heterogeneity. DEGs during different stages of measles virus infection were explored using R software (v3.4.0). Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of the DEGs were performed using Cytoscape 3.4.0 software. A protein-protein interaction (PPI) network of the DEGs was obtained from the STRING database v9.05. A total of 43 DEGs were obtained from four analyzed sample groups, including 10 highly expressed genes and 33 genes with decreased expression. The most enriched pathways based on KEGG analysis were fatty acid elongation, cytokine-cytokine receptor interaction and RNA degradation. The genes mentioned in the PPI network were mainly associated with protein binding and chemokine activity. A total of 219 chemicals were identified that may, jointly or on their own, interact with the 6 DEGs between the control group and patients with measles (at hospital entry), including benzo(a)pyrene (BaP) and tetrachlorodibenzodioxin (TCDD). In conclusion, the present study revealed that chemokines and environmental chemicals, e.g. BaP and TCDD, may affect the development of measles. PMID:29805511
Screening and identification of apolipoprotein A-I as a potential hepatoblastoma biomarker in children, excluding inflammatory factors

PubMed Central

ZHAO, WEI; LI, JUAN; ZHANG, YILIN; GAO, PENGFEI; ZHANG, JUNJIE; GUO, FEI; YU, JIEKAI; ZHENG, SHU; WANG, JIAXIANG

2015-01-01

The aim of the present study was to identify a child hepatoblastoma serum biomarker that is unaffected by inflammatory factors, with the ultimate aim of finding an effective method for the early diagnosis of hepatoblastoma. The magnetic bead-based weak cation exchange chromatography technique was used to process serum harvested from 30 children with hepatoblastoma, 20 children with systemic inflammatory response syndrome (SIRS) and 20 healthy children. Proteins differentially expressed in SIRS were excluded from consideration as biomarkers for hepatoblastoma. Proteins differentially expressed in hepatoblastoma and healthy controls were screened using surface-enhanced laser desorption/ionization-time of flight-mass spectrometry (SELDI-TOF-MS). Target proteins were purified by SDS-PAGE, and matrix-assisted laser desorption/ionization (MALDI)-TOF-MS was used to determine their amino acid sequences. Protein matches were searched in the SwissProt database. Quantitative polymerase chain reaction (qPCR) and ELISA were employed to confirm the expression of target proteins. Following screening to exclude inflammatory factors, SELDI-TOF-MS revealed a protein with a mass-to-charge ratio of 9,348 Da that was expressed at significantly lower levels in the serum of children with hepatoblastoma compared with healthy controls (P<0.01). Sequence analysis identified this protein as apolipoprotein A-1 (Apo A-I). qPCR and ELISA confirmed that the expression of Apo A-I mRNA and protein were significantly lower in children with hepatoblastoma compared with healthy controls (P<0.05). These results indicate that Apo A-I is a non-inflammatory protein marker for hepatoblastoma with the potential for use in early diagnosis of hepatoblastoma. In addition, the present study demonstrates the feasibility of proteomic screening for the identification of proteins that can serve as markers for a specific tumor. PMID:26171005
Screening and identification of apolipoprotein A-I as a potential hepatoblastoma biomarker in children, excluding inflammatory factors.

PubMed

Zhao, Wei; Li, Juan; Zhang, Yilin; Gao, Pengfei; Zhang, Junjie; Guo, Fei; Yu, Jiekai; Zheng, Shu; Wang, Jiaxiang

2015-07-01

The aim of the present study was to identify a child hepatoblastoma serum biomarker that is unaffected by inflammatory factors, with the ultimate aim of finding an effective method for the early diagnosis of hepatoblastoma. The magnetic bead-based weak cation exchange chromatography technique was used to process serum harvested from 30 children with hepatoblastoma, 20 children with systemic inflammatory response syndrome (SIRS) and 20 healthy children. Proteins differentially expressed in SIRS were excluded from consideration as biomarkers for hepatoblastoma. Proteins differentially expressed in hepatoblastoma and healthy controls were screened using surface-enhanced laser desorption/ionization-time of flight-mass spectrometry (SELDI-TOF-MS). Target proteins were purified by SDS-PAGE, and matrix-assisted laser desorption/ionization (MALDI)-TOF-MS was used to determine their amino acid sequences. Protein matches were searched in the SwissProt database. Quantitative polymerase chain reaction (qPCR) and ELISA were employed to confirm the expression of target proteins. Following screening to exclude inflammatory factors, SELDI-TOF-MS revealed a protein with a mass-to-charge ratio of 9,348 Da that was expressed at significantly lower levels in the serum of children with hepatoblastoma compared with healthy controls (P<0.01). Sequence analysis identified this protein as apolipoprotein A-1 (Apo A-I). qPCR and ELISA confirmed that the expression of Apo A-I mRNA and protein were significantly lower in children with hepatoblastoma compared with healthy controls (P<0.05). These results indicate that Apo A-I is a non-inflammatory protein marker for hepatoblastoma with the potential for use in early diagnosis of hepatoblastoma. In addition, the present study demonstrates the feasibility of proteomic screening for the identification of proteins that can serve as markers for a specific tumor.
In silico re-identification of properties of drug target proteins.

PubMed

Kim, Baeksoo; Jo, Jihoon; Han, Jonghyun; Park, Chungoo; Lee, Hyunju

2017-05-31

Computational approaches in the identification of drug targets are expected to reduce time and effort in drug development. Advances in genomics and proteomics provide the opportunity to uncover properties of druggable genomes. Although several studies have been conducted for distinguishing drug targets from non-drug targets, they mainly focus on the sequences and functional roles of proteins. Many other properties of proteins have not been fully investigated. Using the DrugBank (version 3.0) database containing nearly 6,816 drug entries including 760 FDA-approved drugs and 1822 of their targets and human UniProt/Swiss-Prot databases, we defined 1578 non-redundant drug target and 17,575 non-drug target proteins. To select these non-redundant protein datasets, we built four datasets (A, B, C, and D) by considering clustering of paralogous proteins. We first reassessed the widely used properties of drug target proteins. We confirmed and extended the findings that drug target proteins (1) are likely to be more hydrophobic and less polar, with fewer PEST sequences and more signal peptide sequences, and (2) are more involved in enzyme catalysis, oxidation and reduction in cellular respiration, and operational genes. In this study, we proposed new properties (essentiality, expression pattern, PTMs, and solvent accessibility) for effectively identifying drug target proteins. We found that (1) drug targetability and protein essentiality are decoupled, (2) druggable proteins tend to have high expression levels and tissue specificity, and (3) functional post-translational modification residues are enriched in drug target proteins. In addition, to predict the drug targetability of proteins, we exploited two machine learning methods (Support Vector Machine and Random Forest). When we predicted drug targets by combining previously known protein properties and proposed new properties, an F-score of 0.8307 was obtained. When the newly proposed properties are integrated, the prediction performance is improved and these properties are related to drug targets. We believe that our study will provide a new aspect in inferring drug-target interactions.
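For readers unfamiliar with the F-score reported in the abstract above, the following sketch shows how the balanced F-score (F1) is computed from classification counts. The abstract does not say which F-score variant or which counts were used, so the F1 form and the example numbers here are assumptions for illustration only.

```python
def f_score(true_pos, false_pos, false_neg):
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    if true_pos == 0:
        return 0.0  # no correct positive predictions
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 drug targets predicted correctly,
# 2 non-targets predicted as targets, 2 targets missed.
example = f_score(true_pos=8, false_pos=2, false_neg=2)
print(round(example, 4))
```

With these placeholder counts, precision and recall are both 0.8, so the F-score is also 0.8.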
Shewregdb: Database and visualization environment for experimental and predicted regulatory information in Shewanella oneidensis mr-1

PubMed Central

Syed, Mustafa H; Karpinets, Tatiana V; Leuze, Michael R; Kora, Guruprasad H; Romine, Margaret R; Uberbacher, Edward C

2009-01-01

Shewanella oneidensis MR-1 is an important model organism for environmental research as it has an exceptional metabolic and respiratory versatility regulated by a complex regulatory network. We have developed a database to collect experimental and computational data relating to regulation of gene and protein expression, and a visualization environment that enables integration of these data types. The regulatory information in the database includes predictions of DNA regulator binding sites, sigma factor binding sites, transcription units, operons, promoters, and RNA regulators including non-coding RNAs, riboswitches, and different types of terminators. Availability: http://shewanella-knowledgebase.org:8080/Shewanella/gbrowserLanding.jsp PMID:20198195
Application of Ferulic Acid for Alzheimer’s Disease: Combination of Text Mining and Experimental Validation

PubMed Central

Meng, Guilin; Meng, Xiulin; Ma, Xiaoye; Zhang, Gengping; Hu, Xiaolin; Jin, Aiping; Liu, Xueyuan

2018-01-01

Alzheimer’s disease (AD) is an increasing concern in human health. Despite significant research, highly effective drugs to treat AD are lacking. The present study describes the text mining process to identify drug candidates from a traditional Chinese medicine (TCM) database, along with associated protein target mechanisms. We carried out text mining to identify literature that referenced both AD and TCM and focused on identifying compounds and protein targets of interest. After targeting one potential TCM candidate, corresponding protein-protein interaction (PPI) networks were assembled in STRING to decipher the most likely mechanism of action. This was followed by validation using Western blot and co-immunoprecipitation in an AD cell model. The text mining strategy using a vast amount of AD-related literature and the TCM database identified curcumin, whose major component was ferulic acid (FA). This was used as a key candidate compound for further study. Using the top calculated interaction score in STRING, BACE1 and MMP2 were implicated in the activity of FA in AD. Exposure of SHSY5Y-APP cells to FA resulted in a decrease in expression levels of BACE-1 and APP, while the expression of MMP-2 and MMP-9 increased in a dose-dependent manner. This suggests that FA-induced BACE1 and MMP2 pathways may be novel potential mechanisms involved in AD. The text mining of literature and the TCM database related to AD suggested FA as a promising TCM ingredient for the treatment of AD. Potential mechanisms interconnected and integrated with Aβ aggregation inhibition and extracellular matrix remodeling underlying the activity of FA were identified using in vitro studies. PMID:29896095
Application of Ferulic Acid for Alzheimer's Disease: Combination of Text Mining and Experimental Validation.

PubMed

Meng, Guilin; Meng, Xiulin; Ma, Xiaoye; Zhang, Gengping; Hu, Xiaolin; Jin, Aiping; Zhao, Yanxin; Liu, Xueyuan

2018-01-01

Alzheimer's disease (AD) is an increasing concern in human health. Despite significant research, highly effective drugs to treat AD are lacking. The present study describes the text mining process to identify drug candidates from a traditional Chinese medicine (TCM) database, along with associated protein target mechanisms. We carried out text mining to identify literature that referenced both AD and TCM and focused on identifying compounds and protein targets of interest. After targeting one potential TCM candidate, corresponding protein-protein interaction (PPI) networks were assembled in STRING to decipher the most likely mechanism of action. This was followed by validation using Western blot and co-immunoprecipitation in an AD cell model. The text mining strategy using a vast amount of AD-related literature and the TCM database identified curcumin, whose major component was ferulic acid (FA). This was used as a key candidate compound for further study. Using the top calculated interaction score in STRING, BACE1 and MMP2 were implicated in the activity of FA in AD. Exposure of SHSY5Y-APP cells to FA resulted in a decrease in expression levels of BACE-1 and APP, while the expression of MMP-2 and MMP-9 increased in a dose-dependent manner. This suggests that FA-induced BACE1 and MMP2 pathways may be novel potential mechanisms involved in AD. The text mining of literature and the TCM database related to AD suggested FA as a promising TCM ingredient for the treatment of AD. Potential mechanisms interconnected and integrated with Aβ aggregation inhibition and extracellular matrix remodeling underlying the activity of FA were identified using in vitro studies.
In silico database screening of potential targets and pathways of compounds contained in plants used for psoriasis vulgaris.

PubMed

May, Brian H; Deng, Shiqiang; Zhang, Anthony L; Lu, Chuanjian; Xue, Charlie C L

2015-09-01

Reviews and meta-analyses of clinical trials identified plants used as traditional medicines (TMs) that show promise for psoriasis. These include Rehmannia glutinosa, Camptotheca acuminata, Indigo naturalis and Salvia miltiorrhiza. Compounds contained in these TMs have shown activities of relevance to psoriasis in experimental models. To further investigate the likely mechanisms of action of the multiple compounds in these TMs, we undertook a computer-based in silico investigation of the proteins known to be regulated by these compounds and their associated biological pathways. The proteins reportedly regulated by compounds in these four TMs were identified using the HIT (Herbal Ingredients' Targets) database. The resultant data were entered into the PANTHER (Protein ANnotation THrough Evolutionary Relationship) database to identify the pathways in which the proteins could be involved. The study identified 237 compounds in the TMs and these retrieved 287 proteins from HIT. These proteins identified 59 pathways in PANTHER with most proteins being located in the Apoptosis, Angiogenesis, Inflammation mediated by chemokine and cytokine, Gonadotropin releasing hormone receptor, and/or Interleukin signaling pathways. All four TMs contained compounds that had regulating effects on Apoptosis regulator BAX, Apoptosis regulator Bcl-2, Caspase-3, Tumor necrosis factor (TNF) or Prostaglandin G/H synthase 2 (COX2). The main proteins and pathways are primarily related to inflammation, proliferation and angiogenesis which are all processes involved in psoriasis. Experimental studies have reported that certain compounds from these TMs can regulate the expression of proteins involved in each of these pathways.
Deregulated HOXB7 expression predicts poor prognosis of patients with malignancies of digestive system.

PubMed

Liu, Fang-Teng; Chen, Han-Min; Xiong, Ying; Zhu, Zheng-Ming

2017-07-26

Numerous studies have investigated the relationship between deregulated HOXB7 expression and the clinical outcome in patients with digestive system cancers; HOXB7 has shown negative impacts, but at varying levels. We aimed to comprehensively evaluate the predictive and prognostic value of HOXB7 in digestive system cancers. Electronic databases updated to December 1, 2016 were searched to collect relevant eligible studies to quantitatively explore the potential roles of HOXB7 as a prognostic indicator in digestive system cancers. A total of 9 studies (n = 1298 patients) were included in this meta-analysis. The pooled hazard ratios suggested that high expression of HOXB7 protein was associated with poor prognosis of OS in patients with digestive system cancers (HR = 1.97, 95% CI: 1.65-2.28, p = 0.000), and HOXB7 protein could act as an independent prognostic factor for predicting OS of patients with digestive system cancers (HR: 2.02, 95% CI: 1.69-2.36, p = 0.000). Statistical significance was also observed in subgroup meta-analysis based on the cancer type, histology type, country, sample size and publication date. Furthermore, we examined the correlations between HOXB7 protein and clinicopathological features. Altered expression of HOXB7 protein was correlated with tumor invasion (p = 0.000), lymph node status (p = 0.000), distant metastasis (p = 0.001) and TNM stage (p = 0.000). However, the expression of HOXB7 protein was not associated with age (p = 0.64), gender (p = 0.40) or levels of differentiation (p = 0.19). High expression of HOXB7 protein was associated with poor prognosis of patients with digestive system cancers, as well as clinicopathologic characteristics, including the tumor invasion, lymph node status, distant metastasis and TNM stage. The expression of HOXB7 protein was not associated with age, gender or levels of differentiation. HOXB7 protein expression level in tumor tissue might serve as a novel prognostic marker for digestive system cancers.

BloodChIP: a database of comparative genome-wide transcription factor binding profiles in human blood cells.

PubMed

Chacon, Diego; Beck, Dominik; Perera, Dilmi; Wong, Jason W H; Pimanda, John E

2014-01-01

The BloodChIP database (http://www.med.unsw.edu.au/CRCWeb.nsf/page/BloodChIP) supports exploration and visualization of combinatorial transcription factor (TF) binding at a particular locus in human CD34-positive and other normal and leukaemic cells or retrieval of target gene sets for user-defined combinations of TFs across one or more cell types. Increasing numbers of genome-wide TF binding profiles are being added to public repositories, and this trend is likely to continue. For the power of these data sets to be fully harnessed by experimental scientists, there is a need for these data to be placed in context and easily accessible for downstream applications. To this end, we have built a user-friendly database that has at its core the genome-wide binding profiles of seven key haematopoietic TFs in human stem/progenitor cells. These binding profiles are compared with binding profiles in normal differentiated and leukaemic cells. We have integrated these TF binding profiles with chromatin marks and expression data in normal and leukaemic cell fractions. All queries can be exported into external sites to construct TF-gene and protein-protein networks and to evaluate the association of genes with cellular processes and tissue expression.
Time Series Proteome Profiling

PubMed Central

Formolo, Catherine A.; Mintz, Michelle; Takanohashi, Asako; Brown, Kristy J.; Vanderver, Adeline; Halligan, Brian; Hathout, Yetrib

2014-01-01

This chapter provides a detailed description of a method used to study temporal changes in the endoplasmic reticulum (ER) proteome of fibroblast cells exposed to ER stress agents (tunicamycin and thapsigargin). Differential stable isotope labeling by amino acids in cell culture (SILAC) is used in combination with crude ER fractionation, SDS–PAGE and LC-MS/MS to define altered protein expression in tunicamycin or thapsigargin treated cells versus untreated cells. Treated and untreated cells are harvested at different time points, mixed at a 1:1 ratio and processed for ER fractionation. Samples containing labeled and unlabeled proteins are separated by SDS–PAGE, bands are digested with trypsin and the resulting peptides analyzed by LC-MS/MS. Proteins are identified using Bioworks software and the Swiss-Prot database, whereas ratios of protein expression between treated and untreated cells are quantified using ZoomQuant software. Data visualization is facilitated by GeneSpring software. PMID:21082445
Systematic Identification and Characterization of Novel Human Skin-Associated Genes Encoding Membrane and Secreted Proteins

PubMed Central

Buhren, Bettina Alexandra; Martinez, Cynthia; Schrumpf, Holger; Gasis, Marcia; Grether-Beck, Susanne; Krutmann, Jean

2013-01-01

Through bioinformatics analyses of a human gene expression database representing 105 different tissues and cell types, we identified 687 skin-associated genes that are selectively and highly expressed in human skin. Over 50 of these represent uncharacterized genes not previously associated with skin and include a subset that encode novel secreted and plasma membrane proteins. The high levels of skin-associated expression for eight of these novel therapeutic target genes were confirmed by semi-quantitative real time PCR, western blot and immunohistochemical analyses of normal skin and skin-derived cell lines. Four of these are expressed specifically by epidermal keratinocytes; two that encode G-protein-coupled receptors (GPR87 and GPR115), and two that encode secreted proteins (WFDC5 and SERPINB7). Further analyses using cytokine-activated and terminally differentiated human primary keratinocytes or a panel of common inflammatory, autoimmune or malignant skin diseases revealed distinct patterns of regulation as well as disease associations that point to important roles in cutaneous homeostasis and disease. Some of these novel uncharacterized skin genes may represent potential biomarkers or drug targets for the development of future diagnostics or therapeutics. PMID:23840300
Comparative proteomics of leaves found at different stem positions of maize seedlings.

PubMed

Chen, Yi-Bo; Wang, Dan; Ge, Xuan-Liang; Zhao, Biligen-Gaowa; Wang, Xu-Chu; Wang, Bai-Chen

2016-07-01

To better understand the roles of leaves at different stem positions during plant development, we measured the physiological properties of leaves 1-4 on maize seedling stems, and performed a proteomics study to investigate the differences in protein expression in the four leaves using two-dimensional difference gel electrophoresis and tandem mass spectrometry in conjunction with database searching. A total of 167 significantly differentially expressed protein spots were found and identified. Of these, 35% are involved in photosynthesis. By further analysis of the data, we speculated that in leaf 1 the seedling has started to transition from a heterotroph to an autotroph, development of leaf 2 is the time at which the seedling fully transitions from a heterotroph to an autotroph, and leaf maturity was reached only with fully expanded leaves 3 and 4, although there were still some protein expression differences in the two leaves. These results suggest that the different leaves make different contributions to maize seedling growth via modulation of the expression of the photosynthetic proteins. Together, these results provide insight into the roles of the different maize leaves as the plant develops from a heterotroph to an autotroph. Copyright © 2016 Elsevier GmbH. All rights reserved.
Abnormally high expression of POLD1, MCM2, and PLK4 promotes relapse of acute lymphoblastic leukemia.

PubMed

Li, Sheng; Wang, Chengzhong; Wang, Weikai; Liu, Weidong; Zhang, Guiqin

2018-05-01

This study aimed to explore the underlying mechanism of relapsed acute lymphoblastic leukemia (ALL). Datasets GSE28460 and GSE18497 were downloaded from Gene Expression Omnibus (GEO). Differentially expressed genes (DEGs) between diagnostic and relapsed ALL samples were identified using the Limma package in R, and a Venn diagram was drawn. Next, functional enrichment analyses of co-regulated DEGs were performed. Based on the String database, protein-protein interaction network and module analyses were also conducted. Moreover, transcription factors and miRNAs targeting co-regulated DEGs were predicted using the WebGestalt online tool. A total of 71 co-regulated DEGs were identified, including 56 co-upregulated genes and 15 co-downregulated genes. Functional enrichment analyses showed that upregulated DEGs were significantly enriched in cell cycle, DNA replication, and repair-related pathways. POLD1, MCM2, and PLK4 were hub proteins in both the protein-protein interaction network and module, and might be potential targets of E2F. Additionally, POLD1 and MCM2 were found to be regulated by miR-520H via E2F1. High expression of POLD1, MCM2, and PLK4 might play positive roles in the recurrence of ALL, and could serve as potential therapeutic targets for the treatment of relapsed ALL.
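The Venn-diagram step in this record amounts to intersecting the DEG lists of the two datasets and keeping genes that change in the same direction. A minimal sketch follows; the gene names, fold-change values and dictionary layout are hypothetical, and the upstream Limma significance filtering is assumed to have been done already.

```python
def co_regulated(degs_a, degs_b):
    """Split genes shared by two DEG sets into co-up and co-down lists.

    Each argument maps a gene symbol to its log2 fold change and is assumed
    to contain only genes that already passed the significance filter.
    """
    shared = set(degs_a) & set(degs_b)
    # Keep only genes whose direction of change agrees in both datasets.
    up = sorted(g for g in shared if degs_a[g] > 0 and degs_b[g] > 0)
    down = sorted(g for g in shared if degs_a[g] < 0 and degs_b[g] < 0)
    return up, down


# Hypothetical DEG tables standing in for the two GEO datasets.
set_a = {"GENE_A": 1.4, "GENE_B": -0.9, "GENE_C": 2.1, "GENE_D": 0.8}
set_b = {"GENE_A": 1.1, "GENE_B": -1.3, "GENE_D": -0.5, "GENE_E": 1.7}
up, down = co_regulated(set_a, set_b)
```

Genes that are significant in both datasets but change in opposite directions (GENE_D in this toy input) fall into neither list.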
GEneSTATION 1.0: a synthetic resource of diverse evolutionary and functional genomic data for studying the evolution of pregnancy-associated tissues and phenotypes

PubMed Central

Kim, Mara; Cooper, Brian A.; Venkat, Rohit; Phillips, Julie B.; Eidem, Haley R.; Hirbo, Jibril; Nutakki, Sashank; Williams, Scott M.; Muglia, Louis J.; Capra, J. Anthony; Petren, Kenneth; Abbot, Patrick; Rokas, Antonis; McGary, Kriston L.

2016-01-01

Mammalian gestation and pregnancy are fast evolving processes that involve the interaction of the fetal, maternal and paternal genomes. Version 1.0 of the GEneSTATION database (http://genestation.org) integrates diverse types of omics data across mammals to advance understanding of the genetic basis of gestation and pregnancy-associated phenotypes and to accelerate the translation of discoveries from model organisms to humans. GEneSTATION is built using tools from the Generic Model Organism Database project, including the biology-aware database CHADO, new tools for rapid data integration, and algorithms that streamline synthesis and user access. GEneSTATION contains curated life history information on pregnancy and reproduction from 23 high-quality mammalian genomes. For every human gene, GEneSTATION contains diverse evolutionary (e.g. gene age, population genetic and molecular evolutionary statistics), organismal (e.g. tissue-specific gene and protein expression, differential gene expression, disease phenotype), and molecular data types (e.g. Gene Ontology Annotation, protein interactions), as well as links to many general (e.g. Entrez, PubMed) and pregnancy disease-specific (e.g. PTBgene, dbPTB) databases. By facilitating the synthesis of diverse functional and evolutionary data in pregnancy-associated tissues and phenotypes and enabling their quick, intuitive, accurate and customized meta-analysis, GEneSTATION provides a novel platform for comprehensive investigation of the function and evolution of mammalian pregnancy. PMID:26567549
The plant energy-dissipating mitochondrial systems: depicting the genomic structure and the expression profiles of the gene families of uncoupling protein and alternative oxidase in monocots and dicots.

PubMed

Borecky, Jirí; Nogueira, Fábio T S; de Oliveira, Kívia A P; Maia, Ivan G; Vercesi, Aníbal E; Arruda, Paulo

2006-01-01

The simultaneous existence of alternative oxidases and uncoupling proteins in plants has raised the question as to why plants need two energy-dissipating systems with apparently similar physiological functions. A probably complete plant uncoupling protein gene family is described and the expression profiles of this family compared with the multigene family of alternative oxidases in Arabidopsis thaliana and sugarcane (Saccharum sp.) employed as dicot and monocot models, respectively. In total, six uncoupling protein genes, AtPUMP1-6, were recognized within the Arabidopsis genome and five (SsPUMP1-5) in a sugarcane EST database. The recombinant AtPUMP5 protein displayed similar biochemical properties as AtPUMP1. Sugarcane possessed four Arabidopsis AOx1-type orthologues (SsAOx1a-1d); no sugarcane orthologue corresponding to Arabidopsis AOx2-type genes was identified. Phylogenetic and expression analyses suggested that AtAOx1d does not belong to the AOx1-type family but forms a new (AOx3-type) family. Tissue-enriched expression profiling revealed that uncoupling protein genes were expressed more ubiquitously than the alternative oxidase genes. Distinct expression patterns among gene family members were observed between monocots and dicots and during chilling stress. These findings suggest that the members of each energy-dissipating system are subject to different cell or tissue/organ transcriptional regulation. As a result, plants may respond more flexibly to adverse biotic and abiotic conditions, in which oxidative stress is involved.
Lack of Detectable Allergenicity in Genetically Modified Maize Containing “Cry” Proteins as Compared to Native Maize Based on In Silico & In Vitro Analysis

PubMed Central

Mathur, Chandni; Kathuria, Pooran C.; Dahiya, Pushpa; Singh, Anand B.

2015-01-01

Background Genetically modified (GM) crops with potential allergens must be evaluated for safety and for their endogenous IgE binding pattern compared with the native variety prior to market release. Objective To compare endogenous IgE binding proteins of three GM maize seeds containing Cry 1Ab, 1Ac, and 1C transgenic proteins with non-GM maize. Methods An integrated approach of in silico and in vitro methods was employed. Cry proteins were tested for the presence of allergen sequences by FASTA in allergen databases. Biochemical assays of maize extracts were performed. Specific IgE (sIgE) assays and immunoblots using sera from food-sensitized patients (n = 39) against non-GM and GM maize antigens were performed. Results In silico approaches confirmed no sequence similarity of the stated transgenic proteins in allergen databases. An insignificant (p > 0.05) variation in protein content between GM and non-GM maize was observed. Simulated Gastric Fluid (SGF) digestion revealed fewer stable protein fractions in GM than in non-GM maize, which might be due to a shift in constituent protein expression. Specific IgE values from patients showed an insignificant difference between non-GM and GM maize extracts. Five maize-sensitized cases recognized the same 7 protein fractions of 88-28 kD as IgE binding in both GM and non-GM maize, signifying absence of variation. Four of the reported IgE binding proteins were also found to be stable in SGF. Conclusion Cry proteins did not show any significant similarity of >35% in allergen databases. Immunoassays also did not identify appreciable differences in endogenous IgE binding between GM and non-GM maize. PMID:25706412
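The >35% similarity threshold mentioned in this record can be illustrated with a simplified sliding-window identity check. This sketch is not FASTA: it compares ungapped windows only, and the 80-residue default window is an assumption taken from commonly cited FAO/WHO allergenicity guidance rather than from the abstract. The sequences in the example are toy strings.

```python
def max_window_identity(query, allergen, window=80):
    """Return the highest percent identity over any pair of ungapped windows.

    Simplified illustration: a real FASTA search uses gapped local alignment.
    The default window of 80 residues is an assumption, not from the abstract.
    """
    best = 0.0
    if len(query) < window or len(allergen) < window:
        return best
    for i in range(len(query) - window + 1):
        q = query[i:i + window]
        for j in range(len(allergen) - window + 1):
            a = allergen[j:j + window]
            # Count positions where the two windows carry the same residue.
            matches = sum(1 for x, y in zip(q, a) if x == y)
            best = max(best, 100.0 * matches / window)
    return best


# Toy sequences with a short window so the arithmetic is easy to follow.
identity = max_window_identity("AAAAAAAAAA", "AAAACCCCCC", window=5)
flagged = identity > 35.0
```

A sequence would be flagged for closer review only when some window exceeds the threshold; the abstract reports that no Cry protein did.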
A calmodulin binding protein from Arabidopsis is induced by ethylene and contains a DNA-binding motif

NASA Technical Reports Server (NTRS)

Reddy, A. S.; Reddy, V. S.; Golovkin, M.

2000-01-01

Calmodulin (CaM), a key calcium sensor in all eukaryotes, regulates diverse cellular processes by interacting with other proteins. To isolate CaM binding proteins involved in ethylene signal transduction, we screened an expression library prepared from ethylene-treated Arabidopsis seedlings with 35S-labeled CaM. A cDNA clone, EICBP (Ethylene-Induced CaM Binding Protein), encoding a protein that interacts with activated CaM was isolated in this screening. The CaM binding domain in EICBP was mapped to the C-terminus of the protein. These results indicate that calcium, through CaM, could regulate the activity of EICBP. The EICBP is expressed in different tissues and its expression in seedlings is induced by ethylene. The EICBP contains, in addition to a CaM binding domain, several features that are typical of transcription factors. These include a DNA-binding domain at the N terminus, an acidic region at the C terminus, and nuclear localization signals. In database searches a partial cDNA (CG-1) encoding a DNA-binding motif from parsley and an ethylene up-regulated partial cDNA from tomato (ER66) showed significant similarity to EICBP. In addition, five hypothetical proteins in the Arabidopsis genome also showed a very high sequence similarity with EICBP, indicating that there are several EICBP-related proteins in Arabidopsis. The structural features of EICBP are conserved in all EICBP-related proteins in Arabidopsis, suggesting that they may constitute a new family of DNA binding proteins and are likely to be involved in modulating gene expression in the presence of ethylene.
Identification of differentially expressed proteins of Arthrospira (Spirulina) plantensis-YZ under salt-stress conditions by proteomics and qRT-PCR analysis

PubMed Central

2013-01-01

Arthrospira (Spirulina) platensis, a representative species of cyanobacteria, is recognized and used worldwide as a source of protein in food and possesses some unusual and valuable physiological characteristics, such as alkali and salt tolerance. Based on complete genome sequencing of Arthrospira (Spirulina) plantensis-YZ, we compared the protein expression profiles of this organism under different salt-stress conditions (i.e. 0.02 M, 0.5 M and 1.0 M NaCl, respectively), using 2-D electrophoresis and peptide mass fingerprinting, and retrieved 141 proteins showing significantly differential expression in response to salt-stress. Of the 141 proteins, 114 Arthrospira (Spirulina) plantensis-YZ proteins were found with significant homology to those found in Arthrospira (76 proteins in Arthrospira platensis str. Paraca and 38 in Arthrospira maxima CS-328). The remaining 27 proteins belong to other bacteria. Subsequently, we determined the transcriptional level of 29 genes in vivo in response to NaCl treatments and verified them by qRT-PCR. We found that 12 genes were consistent at both the transcription and protein levels, and that transcription of all but one of them was up-regulated. We classified the 141 differentially expressed proteins into 18 types of function categories using the COG database, and linked them to their respective KEGG metabolism pathways. These proteins are involved in 31 metabolism pathways, such as photosynthesis, glucose metabolism, cysteine and methionine metabolism, lysine synthesis, fatty acid metabolism, and glutathione metabolism. Additionally, SRPs, heat shock proteins and ABC transporter proteins were identified, which probably contribute to the resistance of Arthrospira (Spirulina) plantensis to high salt stress. PMID:23363438
Plant Reactome: a resource for plant pathways and comparative analysis

PubMed Central

Naithani, Sushma; Preece, Justin; D'Eustachio, Peter; Gupta, Parul; Amarasinghe, Vindhya; Dharmawardhana, Palitha D.; Wu, Guanming; Fabregat, Antonio; Elser, Justin L.; Weiser, Joel; Keays, Maria; Fuentes, Alfonso Munoz-Pomer; Petryszak, Robert; Stein, Lincoln D.; Ware, Doreen; Jaiswal, Pankaj

2017-01-01

Plant Reactome (http://plantreactome.gramene.org/) is a free, open-source, curated plant pathway database portal, provided as part of the Gramene project. The database provides intuitive bioinformatics tools for the visualization, analysis and interpretation of pathway knowledge to support genome annotation, genome analysis, modeling, systems biology, basic research and education. Plant Reactome employs the structural framework of a plant cell to show metabolic, transport, genetic, developmental and signaling pathways. We manually curate molecular details of pathways in these domains for reference species Oryza sativa (rice) supported by published literature and annotation of well-characterized genes. Two hundred twenty-two rice pathways, 1025 reactions associated with 1173 proteins, 907 small molecules and 256 literature references have been curated to date. These reference annotations were used to project pathways for 62 model, crop and evolutionarily significant plant species based on gene homology. Database users can search and browse various components of the database, visualize curated baseline expression of pathway-associated genes provided by the Expression Atlas and upload and analyze their Omics datasets. The database also offers data access via Application Programming Interfaces (APIs) and in various standardized pathway formats, such as SBML and BioPAX. PMID:27799469
Database resources of the National Center for Biotechnology Information.

PubMed

Sayers, Eric W; Barrett, Tanya; Benson, Dennis A; Bolton, Evan; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; DiCuccio, Michael; Federhen, Scott; Feolo, Michael; Fingerman, Ian M; Geer, Lewis Y; Helmberg, Wolfgang; Kapustin, Yuri; Landsman, David; Lipman, David J; Lu, Zhiyong; Madden, Thomas L; Madej, Tom; Maglott, Donna R; Marchler-Bauer, Aron; Miller, Vadim; Mizrachi, Ilene; Ostell, James; Panchenko, Anna; Phan, Lon; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Stephen T; Shumway, Martin; Sirotkin, Karl; Slotta, Douglas; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A; Wagner, Lukas; Wang, Yanli; Wilbur, W John; Yaschenko, Eugene; Ye, Jian

2011-01-01

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Electronic PCR, OrfFinder, Splign, ProSplign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), IBIS, Biosystems, Peptidome, OMSSA, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
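The Entrez Programming Utilities listed in this record are reachable over plain HTTP. As a minimal sketch, the function below only builds an `efetch` request URL for a PubMed abstract and performs no network call; the helper name `pubmed_abstract_url` is hypothetical, and the example PMID is one that appears elsewhere in this listing.

```python
from urllib.parse import urlencode

# Base address of the E-utilities efetch endpoint.
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def pubmed_abstract_url(pmid):
    """Build an efetch URL that requests a PubMed abstract as plain text.

    The helper name is hypothetical; only the URL is constructed here.
    """
    params = {
        "db": "pubmed",          # target Entrez database
        "id": str(pmid),         # PubMed identifier
        "rettype": "abstract",   # return the abstract view
        "retmode": "text",       # as plain text
    }
    return EFETCH_BASE + "?" + urlencode(params)


url = pubmed_abstract_url(27799469)
```

The resulting URL can be opened with any HTTP client; heavier use would normally follow NCBI's usage guidelines on request rates and API keys.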
Protein Expression Profile of Twenty-Week-Old Diabetic db/db and Non-Diabetic Mice Livers: A Proteomic and Bioinformatic Analysis.

PubMed

Guzmán-Flores, Juan Manuel; Flores-Pérez, Elsa Cristina; Hernández-Ortiz, Magdalena; Vargas-Ortiz, Katya; Ramírez-Emiliano, Joel; Encarnación-Guevara, Sergio; Pérez-Vázquez, Victoriano

2018-06-01

Type 2 diabetes mellitus is characterized by insulin resistance in the liver. Insulin is not only involved in carbohydrate metabolism, it also regulates protein synthesis. This work describes the expression of proteins in the liver of a diabetic mouse and identifies the metabolic pathways involved. Twenty-week-old diabetic db/db mice were hepatectomized, after which proteins were separated by 2D-Polyacrylamide Gel Electrophoresis (2D-PAGE). Spots varying in intensity were analyzed using mass spectrometry, and biological function was assigned by the Database for Annotation, Visualization and Integrated Discovery (DAVID) software. A differential expression of 26 proteins was identified; among these were arginase-1, pyruvate carboxylase, peroxiredoxin-1, regucalcin, and sorbitol dehydrogenase. Bioinformatics analysis indicated that many of these proteins are mitochondrial and participate in metabolic pathways, such as the citrate cycle, the fructose and mannose metabolism, and glycolysis or gluconeogenesis. In addition, these proteins are related to oxidation-reduction reactions and molecular function of vitamin binding and amino acid metabolism. In conclusion, the proteomic profile of the liver of diabetic mouse db/db exhibited mainly alterations in the metabolism of carbohydrates and nitrogen. These differences illustrate the heterogeneity of diabetes in its different stages and under different conditions and highlight the need to improve treatments for this disease.
Structural Genomics of Bacterial Virulence Factors

DTIC Science & Technology

2005-05-01

is deficient to mammals and unique to bacteria, the enzymes involved in the pathway may be useful for antibiotic design. Recent genome sequence...the SARS S1 spike protein with a high affinity antibody (80R) (Sui et al., 2004). Both the S1 protein and antibody have been expressed and purified in... Streptococcus group are now in preparation. Key Research Accomplishments * Development of the VirFact database of virulence
Insights into salicylic acid responses in cucumber (Cucumis sativus L.) cotyledons based on a comparative proteomic analysis.

PubMed

Hao, J H; Dong, C J; Zhang, Z G; Wang, X L; Shang, Q M

2012-05-01

To investigate the response of cucumber seedlings to exogenous salicylic acid (SA) and gain a better understanding of SA action mechanism, we generated a proteomic profile of cucumber (Cucumis sativus L.) cotyledons treated with exogenous SA. Analysis of 1500 protein spots from each gel revealed 63 differentially expressed proteins, 59 of which were identified successfully. Of the identified proteins, 97% matched cucumber proteins using a whole cucumber protein database based on the newly completed genome established by our laboratory. The identified proteins were involved in various cellular responses and metabolic processes, including antioxidative reactions, cell defense, photosynthesis, carbohydrate metabolism, respiration and energy homeostasis, protein folding and biosynthesis. The two largest functional categories included proteins involved in antioxidative reactions (23.7%) and photosynthesis (18.6%). Furthermore, the SA-responsive protein interaction network revealed 13 key proteins, suggesting that the expression changes of these proteins could be critical for SA-induced resistance. An analysis of these changes suggested that SA-induced resistance and seedling growth might be regulated in part through pathways involving antioxidative reactions and photosynthesis. © 2012 Elsevier Ireland Ltd. All rights reserved.
Comparative proteomic analysis of differentially expressed proteins between peripheral sensory and motor nerves.

PubMed

He, Qianru; Man, Lili; Ji, Yuhua; Zhang, Shuqiang; Jiang, Maorong; Ding, Fei; Gu, Xiaosong

2012-06-01

Peripheral sensory and motor nerves have different functions and different approaches to regeneration, especially in their distinct ability to accurately reinnervate terminal nerve pathways. To understand the molecular aspects underlying these differences, a proteomics technique coupling isobaric tags for relative and absolute quantitation (iTRAQ) with online two-dimensional liquid chromatography tandem mass spectrometry (2D LC-MS/MS) was used to investigate the protein profile of sensory and motor nerve samples from rats. A total of 1472 proteins were identified in either sensory or motor nerve. Of them, 100 proteins showed differential expression between the two nerves, and some of them were validated by quantitative real time RT-PCR, Western blot analysis, and immunohistochemistry. In the light of functional categorization, the differentially expressed proteins in sensory and motor nerves, belonging to a broad range of classes, were related to a diverse array of biological functions, which included cell adhesion, cytoskeleton, neuronal plasticity, neurotrophic activity, calcium-binding, signal transduction, transport, enzyme catalysis, lipid metabolism, DNA-binding, synaptosome function, actin-binding, ATP-binding, extracellular matrix, and commitment to other lineages. The proteins expressed at relatively higher levels in either sensory or motor nerve were tentatively discussed in combination with their specific molecular characteristics. It is anticipated that the database generated in this study will provide a solid foundation for further comprehensive investigation of functional differences between sensory and motor nerves, including the specificity of their regeneration.
[Adrenal protein expressions after Pinggan Qianyang Formula treatment in hypertensive rats with liver-yang hyperactivity: a comparative proteomic analysis].

PubMed

Zhang, Ying; Chen, Ze-qi; Zhong, Guang-wei

2008-07-01

To explore the pathogenic mechanism of the liver-yang hyperactivity type of hypertension and to observe the effects of Pinggan Qianyang Formula (PGQYF), a compound of traditional Chinese herbals for calming the liver and suppressing yang, so as to provide experimental evidence for new marker proteins of drug therapy. A rat model of liver-yang hyperactivity was prepared with spontaneously hypertensive rats (SHRs) by administration of Aconiti Praeparatae Decoction. Adrenal proteins were separated by 2D gel electrophoresis (2-DE). The differentially expressed proteins were identified by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS) and database analysis. The rat model of liver-yang hyperactivity was successfully reproduced, and the PGQYF could decrease the grades of irritability, conjunctival congestion and systolic blood pressure of the rats (P<0.05, P<0.01). After analysis, twelve obviously differentially expressed proteins were found, eight of which were identified. The expression levels of isocitrate dehydrogenase and steroidogenic acute regulatory protein in the untreated group were up-regulated as compared with those in the normal control group, and down-regulated in the treatment group. The expression levels of ferritin light chain, elongation factor Tu, Rho GDP dissociation inhibitor 1, flavin reductase and basic transcription factor 3 in the untreated group were down-regulated as compared with those in the normal control group, and up-regulated in the treatment group. Differentially expressed adrenal proteins in SHRs with liver-yang hyperactivity were successfully identified. This approach may lay a foundation for the further investigation of pathogenic mechanisms in hypertension with liver-yang hyperactivity and the mechanisms of PGQYF treatment.
Expression of uncharacterized male germ cell-specific genes and discovery of novel sperm-tail proteins in mice.

PubMed

Kwon, Jun Tae; Ham, Sera; Jeon, Suyeon; Kim, Youil; Oh, Seungmin; Cho, Chunghee

2017-01-01

The identification and characterization of germ cell-specific genes are essential if we hope to comprehensively understand the mechanisms of spermatogenesis and fertilization. Here, we searched the mouse UniGene databases and identified 13 novel genes as being putatively testis-specific or -predominant. Our in silico and in vitro analyses revealed that the expressions of these genes are testis- and germ cell-specific, and that they are regulated in a stage-specific manner during spermatogenesis. We generated antibodies against the proteins encoded by seven of the genes to facilitate their characterization in male germ cells. Immunoblotting and immunofluorescence analyses revealed that one of these proteins was expressed only in testicular germ cells, three were expressed in both testicular germ cells and testicular sperm, and the remaining three were expressed in sperm of the testicular stages and in mature sperm from the epididymis. Further analysis of the latter three proteins showed that they were all associated with cytoskeletal structures in the sperm flagellum. Among them, MORN5, which is predicted to contain three MORN motifs, is conserved between mouse and human sperm. In conclusion, we herein identify 13 authentic genes with male germ cell-specific expression, and provide comprehensive information about these genes and their encoded products. Our finding will facilitate future investigations into the functional roles of these novel genes in spermatogenesis and sperm functions.
Proteomic analysis identifies insulin-like growth factor-binding protein-related protein-1 as a podocyte product.

PubMed

Matsumoto, Takayuki; Hess, Sonja; Kajiyama, Hiroshi; Sakairi, Toru; Saleem, Moin A; Mathieson, Peter W; Nojima, Yoshihisa; Kopp, Jeffrey B

2010-10-01

The podocyte secretory proteome may influence the phenotype of adjacent podocytes, endothelial cells, parietal epithelial cells, and tubular epithelial cells but has not been systematically characterized. We have initiated studies to characterize this proteome, with the goal of further understanding the podocyte cell biology. We cultured differentiated conditionally immortalized human podocytes and subjected the proteins in conditioned medium to mass spectrometry. At a false discovery rate of <3%, we identified 111 candidates from conditioned medium, including 44 proteins that have signal peptides or are described as secreted proteins in the UniProt database. As validation, we confirmed that one of these proteins, insulin-like growth factor-binding protein-related protein-1 (IGFBP-rP1), was expressed in mRNA and protein of cultured podocytes. In addition, transforming growth factor-β1 stimulation increased IGFBP-rP1 in conditioned medium. We analyzed IGFBP-rP1 glomerular expression in a mouse model of human immunodeficiency virus-associated nephropathy. IGFBP-rP1 was absent from podocytes of normal mice and was expressed in podocytes and pseudocrescents of transgenic mice, where it was coexpressed with desmin, a podocyte injury marker. We conclude that IGFBP-rP1 may be a product of injured podocytes. Further analysis of the podocyte secretory proteome may identify biomarkers of podocyte injury.
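The false discovery rate threshold quoted in this record can be illustrated with a target-decoy estimate. The abstract does not state how its FDR was computed, so the target-decoy approach, the (score, is_decoy) input layout and the function name are all assumptions made for this sketch.

```python
def score_cutoff_at_fdr(matches, max_fdr=0.03):
    """Return the lowest score at which estimated FDR stays within max_fdr.

    `matches` is an iterable of (score, is_decoy) pairs. FDR is estimated as
    decoys / targets among all matches scoring at or above the cutoff.
    Target-decoy estimation is an assumption; the abstract names no method.
    """
    targets = decoys = 0
    cutoff = None
    # Walk from the best score downward, updating the running FDR estimate.
    for score, is_decoy in sorted(matches, key=lambda m: m[0], reverse=True):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


# Hypothetical scores: 40 targets, one decoy, one target, then another decoy.
matches = [(100 - i, False) for i in range(40)]
matches += [(60, True), (59, False), (58, True)]
cutoff = score_cutoff_at_fdr(matches)
```

In this toy input the first decoy is tolerated (1 decoy against 41 targets is about 2.4%), while accepting the second would push the estimate to roughly 4.9%, so the cutoff stops at the score just above it.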
Comparative and quantitative proteomic analysis of normal and degenerated human annulus fibrosus cells.

PubMed

Ye, Dongping; Liang, Weiguo; Dai, Libing; Zhou, Longqiang; Yao, Yicun; Zhong, Xin; Chen, Honghui; Xu, Jiake

2015-05-01

Degeneration of the intervertebral disc (IVD) is a major chronic medical condition associated with back pain. To better understand the pathogenesis of IVD degeneration, we performed comparative and quantitative proteomic analyses of normal and degenerated human annulus fibrosus (AF) cells and identified proteins that are differentially expressed between them. Annulus fibrosus cells were isolated and cultured from patients with lumbar disc herniation (the experimental group, degenerated AF cells) and scoliosis patients who underwent orthopaedic surgery (the control group, normal AF cells). Comparative proteomic analyses of normal and degenerated cultured AF cells were carried out using 2-D electrophoresis, mass spectrometric analyses, and database searching. Quantitative analyses of silver-stained 2-D electrophoresis gels of normal and degenerated cultured AF cells identified 10 protein spots that showed the most altered differential expression levels between the two groups. Among these, three proteins were decreased, including heat shock cognate 71-kDa protein, glucose-6-phosphate 1-dehydrogenase, and protocadherin-23, whereas seven proteins were increased, including guanine nucleotide-binding protein G(i) subunit α-2, superoxide dismutase, transmembrane protein 51, adenosine receptor A3, 26S protease regulatory subunit 8, lipid phosphate phosphatase-related protein, and fatty acyl-crotonic acid reductase 1. These differentially expressed proteins might be involved in the pathophysiological process of IVD degeneration and have potential values as biomarkers of the degeneration of IVD. © 2015 Wiley Publishing Asia Pty Ltd.

Bombyx mori Serpin6 regulates prophenoloxidase activity and the expression of antimicrobial proteins.

PubMed

Li, Bing; Yu, Hai-Zhong; Ye, Chong-Jun; Ma, Yan; Li, Xing; Fan, Tao; Chen, Fu-Sheng; Xu, Jia-Ping

2017-04-30

Serpins are a family of serine protease inhibitors that are found widely in insects. They play an important role in insect physiological responses, such as innate immunity and development. In this study, we obtained the Bombyx mori serpin6 (BmSerpin6) sequence from National Center for Biotechnology Information (NCBI) and the silkworm genome database (SilkDB). Reverse transcription PCR (RT-PCR) results showed that BmSerpin6 was expressed highly in hemocytes, the midgut, and the fat body. After challenging with Micrococcus luteus (Mi) and Serratia marcescens (Sm), the BmSerpin6 expression level was induced significantly. Transcript levels of gloverin2 and prophenoloxidase (PPO) activity were reduced significantly in the fat body and hemocytes after injecting the recombinant BmSerpin6 protein into silkworm larvae. A BmSerpin6 recombinant plasmid (BmSerpin6-pAC 5.1) was constructed successfully and transfected into Drosophila S2 cells, which resulted in significantly reduced expression of the drosomycin protein. These results indicated that BmSerpin6 might regulate silkworm immune responses. Copyright © 2017 Elsevier B.V. All rights reserved.
Differential display of abundantly expressed genes of Trichoderma harzianum during colonization of tomato-germinating seeds and roots.

PubMed

Mehrabi-Koushki, Mehdi; Rouhani, Hamid; Mahdikhani-Moghaddam, Esmat

2012-11-01

The identification of Trichoderma genes whose expression is altered during early stages of interaction with developing roots of germinated seeds is an important step toward understanding the rhizosphere competency of Trichoderma spp. The potential of 13 Trichoderma strains to colonize tomato roots and promote plant growth was evaluated. All strains used successfully propagated in the spermosphere and continued their growth in the rhizoplane as the roots enlarged, while strains T6 and T7 were the most abundant in the apical segment of roots. Root colonization by most strains was associated with promotion of root and shoot growth, with significant increases of up to 43 and 40 % in root and shoot dry weights, respectively. Differential display reverse transcriptase-PCR (DDRT-PCR) was developed to detect differentially expressed genes in the previously selected strain, Trichoderma harzianum T7, during colonization stages of tomato-germinating seeds and roots. Amplified DDRT-PCR products were analyzed on agarose gel, and 62 differential bands were excised, purified, cloned, and sequenced. The ESTs obtained were queried against the NCBI database by BLASTx search and classified by gene ontology hierarchy. Most of the transcripts (29 ESTs) correspond to known and hypothetical proteins such as secretion-related small GTPase, 40S ribosomal protein S3a, 3-hydroxybutyryl-CoA dehydrogenase, DNA repair protein rad50, lipid phosphate phosphatase-related protein type 3, nuclear essential protein, phospholipase A2, fatty acid desaturase, nuclear pore complex subunit Nup133, ubiquitin-activating enzyme, and 60S ribosomal protein L40. Also, 13 of these sequences showed no homology (E > 0.05) with public databases and were considered novel genes. Some of these ESTs corresponded to genes encoding enzymes potentially involved in nutritional support of microorganisms, which have obvious importance in the establishment of Trichoderma in the spermosphere and rhizosphere, potentially functioning in the acquisition of nutrients from energy-rich carbon compounds leaked from the germinating seeds and roots.
Proteomic analysis of Chromobacterium violaceum and its adaptability to stress.

PubMed

Castro, Diogo; Cordeiro, Isabelle Bezerra; Taquita, Paula; Eberlin, Marcos Nogueira; Garcia, Jerusa Simone; Souza, Gustavo Henrique M F; Arruda, Marco Aurélio Zezzi; Andrade, Edmar V; Filho, Spartaco A; Crainey, J Lee; Lozano, Luis Lopez; Nogueira, Paulo A; Orlandi, Patrícia P

2015-12-01

Chromobacterium violaceum (C. violaceum) occurs abundantly in a variety of ecosystems, including ecosystems that place the bacterium under stress. This study assessed the adaptability of C. violaceum by submitting it to nutritional and pH stresses and then analyzing protein expression using two-dimensional electrophoresis (2-DE) and MALDI mass spectrometry. Chromobacterium violaceum grew best in pH-neutral, nutrient-rich medium (reference conditions); however, the total protein mass recovered from stressed bacterial cultures was always higher than the total protein mass recovered from our reference culture. The diversity of proteins expressed (represented by the number of identifiable 2-DE spots) was seen to be highest in the reference cultures, suggesting that stress reduces the overall range of proteins expressed by C. violaceum. Database comparisons allowed 43 of the 55 spots subjected to MALDI mass spectrometry to be characterized as containing a single identifiable protein. Stress-related expression changes were noted for C. violaceum proteins related to the previously characterized bacterial proteins DnaK, GroEL-2, Rhs, EF-Tu, EF-P, MCP, homogentisate 1,2-dioxygenase, arginine deiminase and the ATP synthase β-subunit protein, as well as for the ribosomal protein subunits L1, L3, L5 and L6. The ability of C. violaceum to adapt its cellular mechanics to sub-optimal growth and protein production conditions was well illustrated by its regulation of ribosomal protein subunits. With the exception of the ribosomal subunit L3, which plays a role in protein folding and may therefore be more useful in stressful conditions, all the other ribosomal subunit proteins were seen to have reduced expression in stressed cultures. Curiously, C. violaceum cultures were also observed to lose their violet color under stress, which suggests that the violacein pigment biosynthetic pathway is affected by stress. Analysis of the proteomic signatures of stressed C. violaceum indicates that nutrient starvation and pH stress can cause changes in the expression of C. violaceum receptors, transporters, and proteins involved in biosynthetic pathways, molecule recycling, and energy production. Our findings complement the recent publication of the C. violaceum genome sequence and could help with the future commercial exploitation of C. violaceum.
Mining disease genes using integrated protein-protein interaction and gene-gene co-regulation information.

PubMed

Li, Jin; Wang, Limei; Guo, Maozu; Zhang, Ruijie; Dai, Qiguo; Liu, Xiaoyan; Wang, Chunyu; Teng, Zhixia; Xuan, Ping; Zhang, Mingming

2015-01-01

In humans, despite the rapid increase in disease-associated gene discovery, a large proportion of disease-associated genes are still unknown. Many network-based approaches have been used to prioritize disease genes. Many networks, such as the protein-protein interaction (PPI), KEGG, and gene co-expression networks, have been used. Expression quantitative trait loci (eQTLs) have been successfully applied for the determination of genes associated with several diseases. In this study, we constructed an eQTL-based gene-gene co-regulation network (GGCRN) and used it to mine for disease genes. We adopted the random walk with restart (RWR) algorithm to mine for genes associated with Alzheimer disease. Compared to the Human Protein Reference Database (HPRD) PPI network alone, the integrated HPRD PPI and GGCRN networks provided faster convergence and revealed new disease-related genes. Therefore, using the RWR algorithm for integrated PPI and GGCRN is an effective method for disease-associated gene mining.
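The random walk with restart described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the gene names (G1 to G5), the toy edge lists, the restart probability of 0.7, and the choice to integrate the two networks by taking the union of their edges. It is not the authors' implementation.

```python
def build_network(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from edge pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def merge_networks(first, second):
    """Integrate two networks by taking the union of their nodes and edges."""
    merged = {node: set(neigh) for node, neigh in first.items()}
    for node, neigh in second.items():
        merged.setdefault(node, set()).update(neigh)
    return merged


def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until the L1 change < tol.

    Assumes every node has at least one neighbour, so probability mass is conserved.
    Returns the steady-state scores and the number of iterations used.
    """
    nodes = sorted(adjacency)
    # Restart vector: uniform over the known (seed) disease genes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for iteration in range(1, max_iter + 1):
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            # Each node spreads its non-restarting mass evenly over its neighbours.
            share = (1.0 - restart) * p[n] / len(adjacency[n])
            for m in adjacency[n]:
                nxt[m] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p, iteration


# Hypothetical toy data: a PPI network and a co-regulation network (GGCRN).
ppi = build_network([("G1", "G2"), ("G2", "G3"), ("G3", "G4")])
ggcrn = build_network([("G1", "G4"), ("G4", "G5")])
merged = merge_networks(ppi, ggcrn)

scores, n_iter = random_walk_with_restart(merged, seeds={"G1"})
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking, n_iter)
```

In this toy example the gene G5 is reachable only through the co-regulation edges, which mirrors in miniature how an integrated network can surface candidates that the PPI network alone cannot rank.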
Treatment with captopril abrogates the altered expression of alpha1 macroglobulin and alpha1 antiproteinase in sera of spontaneously hypertensive rats

PubMed Central

2012-01-01

Background Proteins that are associated with hypertension may be identified by comparing the 2-dimensional gel electrophoresis (2-DE) profiles of the sera of spontaneously hypertensive rats (SHR) with those generated from normotensive Sprague-Dawley rats (SDR). Results Five proteins of high abundance were found to be significantly altered when the 2-DE serum profiles of the SHR were compared to those that were similarly generated from the SDR. Analysis by mass spectrometry and database search identified the proteins as retinol binding protein 4, complement C3, albumin (19.9 kDa fragment), alpha1 macroglobulin and alpha1 antiproteinase, which are all known to be associated with hypertension. The altered expression of the latter two proteins was found to be abrogated when a similar analysis was performed on sera of SHR that were treated with captopril. Conclusion Our data suggest that serum alpha1 macroglobulin and alpha1 antiproteinase are potentially useful complementary biomolecular indicators for monitoring of hypertension. PMID:22416803
A gel-free proteomic-based method for the characterization of Bordetella pertussis clinical isolates

PubMed Central

Williamson, Yulanda M.; Moura, Hercules; Simmons, Kaneatra; Whitmon, Jennifer; Melnick, Nikkol; Rees, Jon; Woolfitt, Adrian; Schieltz, David M.; Tondella, Maria L.; Ades, Edwin; Sampson, Jacquelyn; Carlone, George; Barr, John R.

2017-01-01

Bordetella pertussis (Bp) is the etiologic agent of pertussis or whooping cough, a highly contagious respiratory disease occurring primarily in infants and young children. Although the disease is vaccine preventable, pertussis cases have increased over the years, leading researchers to re-evaluate vaccine control strategies. Since bacterial outer membrane proteins, comprising the surfaceome, often play roles in pathogenesis and antibody-mediated immunity, three recent Bp circulating isolates were examined using proteomics to identify any potential changes in surface protein expression. Fractions enriched for outer membrane proteins were digested with trypsin and the peptides analyzed by nano liquid chromatography-electrospray ionization-mass spectrometry (nLC-ESI-MS), followed by database analysis to elucidate the surfaceomes of our three Bp isolates. Furthermore, a less labor-intensive, non-gel-based antibody affinity capture technology in conjunction with MS was employed to assess each Bp strain's immunogenic outer membrane proteins. This novel technique is generally applicable, allowing for the identification of immunogenic surface-expressed proteins on pertussis and other pathogenic bacteria. PMID:22537821
Proteomic analysis of early phase of conidia germination in Aspergillus nidulans.

PubMed

Oh, Young Taek; Ahn, Chun-Seob; Kim, Jeong Geun; Ro, Hyeon-Su; Lee, Chang-Won; Kim, Jae Won

2010-03-01

In order to investigate proteins involved in the early phase of conidia germination, proteomic analysis was performed using two-dimensional gel electrophoresis (2D-GE) in conjunction with MALDI-TOF mass spectrometry (MS). The expression levels of 241 proteins varied quantitatively with statistical significance (P<0.05) at the early phase of the germination stage. Of these, 57 were identified by MALDI-TOF MS. Through classification of physiological functions from Conserved Domain Database analysis, 21, 13, and 6 of the identified proteins were associated with energy metabolism, protein synthesis, and protein folding, respectively. Interestingly, eight proteins involved in detoxification of reactive oxygen species (ROS), including catalase A, thioredoxin reductase, and mitochondrial peroxiredoxin, were also identified. The expression levels of the genes were further confirmed using Northern blot and reverse transcriptase (RT)-PCR analyses. This study represents the first proteomic analysis of the early phase of conidia germination and will contribute to a better understanding of the molecular events involved in the conidia germination process. Copyright (c) 2009 Elsevier Inc. All rights reserved.
VTCdb: a gene co-expression database for the crop species Vitis vinifera (grapevine).

PubMed

Wong, Darren C J; Sweetman, Crystal; Drew, Damian P; Ford, Christopher M

2013-12-16

Gene expression datasets in model plants such as Arabidopsis have contributed to our understanding of gene function and how a single underlying biological process can be governed by a diverse network of genes. The accumulation of publicly available microarray data encompassing a wide range of biological and environmental conditions has enabled the development of additional capabilities including gene co-expression analysis (GCA). GCA is based on the understanding that genes encoding proteins involved in similar and/or related biological processes may exhibit comparable expression patterns over a range of experimental conditions, developmental stages and tissues. We present an open access database for the investigation of gene co-expression networks within the cultivated grapevine, Vitis vinifera. The new gene co-expression database, VTCdb (http://vtcdb.adelaide.edu.au/Home.aspx), offers an online platform for transcriptional regulatory inference in the cultivated grapevine. Using condition-independent and condition-dependent approaches, grapevine co-expression networks were constructed using the latest publicly available microarray datasets from diverse experimental series, utilising the Affymetrix Vitis vinifera GeneChip (16 K) and the NimbleGen Grape Whole-genome microarray chip (29 K), thus making it possible to profile approximately 29,000 genes (95% of the predicted grapevine transcriptome). Applications available with the online platform include the use of gene names, probesets, modules or biological processes to query the co-expression networks, with the option to choose between Affymetrix or Nimblegen datasets and between multiple co-expression measures. Alternatively, the user can browse existing network modules using interactive network visualisation and analysis via CytoscapeWeb. 
To demonstrate the utility of the database, we present examples from three fundamental biological processes (berry development, photosynthesis and flavonoid biosynthesis) whereby the recovered sub-networks reconfirm established plant gene functions and also identify novel associations. Together, we present valuable insights into grapevine transcriptional regulation by developing network models applicable to researchers in their prioritisation of gene candidates, for on-going study of biological processes related to grapevine development, metabolism and stress responses.
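The idea behind gene co-expression analysis, that genes with comparable expression patterns across conditions are linked in a network, can be sketched as follows. This is an illustrative sketch only: the abstract mentions multiple co-expression measures without naming them here, so the Pearson correlation, the 0.9 threshold, the gene names, and the expression values are all assumptions rather than VTCdb's actual method or data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # A flat profile carries no co-expression signal.
    return cov / math.sqrt(var_x * var_y)


def coexpression_edges(profiles, threshold=0.9):
    """Return (gene, gene, r) for every gene pair whose correlation meets the threshold."""
    genes = sorted(profiles)
    edges = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = pearson(profiles[g], profiles[h])
            if r >= threshold:
                edges.append((g, h, r))
    return edges


# Hypothetical expression values for three genes across five conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.0, 4.0, 6.0, 8.0, 10.5],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
}

edges = coexpression_edges(profiles)
for g, h, r in edges:
    print(f"{g} -- {h}  r = {r:.3f}")
```

A condition-dependent network in this sketch would simply restrict each profile to the columns belonging to one experimental series before computing the correlations.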
A novel approach to select differential pathways associated with hypertrophic cardiomyopathy based on gene co‑expression analysis.

PubMed

Chen, Xiao-Min; Feng, Ming-Jun; Shen, Cai-Jie; He, Bin; Du, Xian-Feng; Yu, Yi-Bo; Liu, Jing; Chu, Hui-Min

2017-07-01

The present study was designed to develop a novel method for identifying significant pathways associated with human hypertrophic cardiomyopathy (HCM), based on gene co‑expression analysis. The microarray dataset associated with HCM (E‑GEOD‑36961) was obtained from the European Molecular Biology Laboratory‑European Bioinformatics Institute database. Informative pathways were selected based on the Reactome pathway database and screening treatments. An empirical Bayes method was utilized to construct co‑expression networks for informative pathways, and a weight value was assigned to each pathway. Differential pathways were extracted based on weight threshold, which was calculated using a random model. In order to assess whether the co‑expression method was feasible, it was compared with traditional pathway enrichment analysis of differentially expressed genes, which were identified using the significance analysis of microarrays package. A total of 1,074 informative pathways were screened out for subsequent investigations and their weight values were also obtained. According to the threshold of weight value of 0.01057, 447 differential pathways, including folding of actin by chaperonin containing T‑complex protein 1 (CCT)/T‑complex protein 1 ring complex (TRiC), purine ribonucleoside monophosphate biosynthesis and ubiquinol biosynthesis, were obtained. Compared with traditional pathway enrichment analysis, the number of pathways obtained from the co‑expression approach was increased. The results of the present study demonstrated that this method may be useful to predict marker pathways for HCM. The pathways of folding of actin by CCT/TRiC and purine ribonucleoside monophosphate biosynthesis may provide evidence of the underlying molecular mechanisms of HCM, and offer novel therapeutic directions for HCM.
The chordate proteome history database.

PubMed

Levasseur, Anthony; Paganini, Julien; Dainat, Jacques; Thompson, Julie D; Poch, Olivier; Pontarotti, Pierre; Gouret, Philippe

2012-01-01

The chordate proteome history database (http://ioda.univ-provence.fr) comprises some 20,000 evolutionary analyses of proteins from chordate species. Our main objective was to characterize and study the evolutionary histories of the chordate proteome, and in particular to detect genomic events and enable automatic functional searches. Firstly, phylogenetic analyses based on high-quality multiple sequence alignments and a robust phylogenetic pipeline were performed for the whole protein and for each individual domain. Novel approaches were developed to identify orthologs/paralogs, and predict gene duplication/gain/loss events and the occurrence of new protein architectures (domain gains, losses and shuffling). These important genetic events were localized on the phylogenetic trees and on the genomic sequence. Secondly, the phylogenetic trees were enhanced by the creation of phylogroups, whereby groups of orthologous sequences created using OrthoMCL were corrected based on the phylogenetic trees; gene family size and gene gain/loss in a given lineage could be deduced from the phylogroups. For each ortholog group obtained from the phylogenetic or the phylogroup analysis, functional information and expression data can be retrieved. Database searches can be performed easily using biological objects (protein identifier, keyword or domain), but can also be based on events; e.g., domain exchange events can be retrieved. To our knowledge, this is the first database that links group clustering, phylogeny and automatic functional searches along with the detection of important events occurring during genome evolution, such as the appearance of a new domain architecture.
FastRNABindR: Fast and Accurate Prediction of Protein-RNA Interface Residues.

PubMed

El-Manzalawy, Yasser; Abbas, Mostafa; Malluhi, Qutaibah; Honavar, Vasant

2016-01-01

A wide range of biological processes, including regulation of gene expression, protein synthesis, and replication and assembly of many viruses, are mediated by RNA-protein interactions. However, experimental determination of the structures of protein-RNA complexes is expensive and technically challenging. Hence, a number of computational tools have been developed for predicting protein-RNA interfaces. Some of the state-of-the-art protein-RNA interface predictors rely on position-specific scoring matrix (PSSM)-based encoding of the protein sequences. The computational effort needed to generate PSSMs severely limits the practical utility of protein-RNA interface prediction servers. In this work, we experiment with two approaches, random sampling and sequence similarity reduction, for extracting a representative reference database of protein sequences from more than 50 million protein sequences in UniRef100. Our results suggest that randomly sampled databases produce better PSSM profiles (in terms of the number of hits used to generate the profile and the distance of the generated profile to the corresponding profile generated using the entire UniRef100 data, as well as the accuracy of the machine learning classifier trained using these profiles). Based on our results, we developed FastRNABindR, an improved version of RNABindR for predicting protein-RNA interface residues using PSSM profiles generated using 1% of the UniRef100 sequences sampled uniformly at random. To the best of our knowledge, FastRNABindR is the only protein-RNA interface residue prediction online server that requires generation of PSSM profiles for query sequences and accepts hundreds of protein sequences per submission. Our approach for determining the optimal BLAST database for a protein-RNA interface residue classification task has the potential of substantially speeding up, and hence increasing the practical utility of, other amino acid sequence based predictors of protein-protein and protein-DNA interfaces.
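Drawing a reduced reference database by uniform random sampling, as described above, might look like the following sketch. It assumes FASTA-formatted input and keeps each record independently with the given probability (so the sample size is approximately, not exactly, the requested fraction); the function names, the seed, and the toy records are hypothetical and are not taken from FastRNABindR.

```python
import random


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def sample_records(records, fraction, seed=0):
    """Keep each record independently with probability `fraction` (single pass)."""
    rng = random.Random(seed)  # Fixed seed so the sample is reproducible.
    return [record for record in records if rng.random() < fraction]


# Hypothetical two-record FASTA input.
demo_fasta = [">seq1 hypothetical", "MKLE", "MK", ">seq2 hypothetical", "MLLK"]
records = list(read_fasta(demo_fasta))

# Synthetic stand-in for a large database: 1,000 placeholder records.
database = [(f"entry{i}", "MKLE") for i in range(1000)]
subset = sample_records(database, fraction=0.01)
print(len(subset), "of", len(database), "records kept")
```

Because the sampling is a single streaming pass, the same approach would work on a file too large to hold in memory by iterating over the open file handle instead of a list.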
Identification of learning and memory genes in canine; promoter investigation and determining the selective pressure.

PubMed

Seifi Moroudi, Reihane; Masoudi, Ali Akbar; Vaez Torshizi, Rasoul; Zandi, Mohammad

2014-12-01

One of the important behaviors of dogs is trainability, which is affected by learning and memory genes. Such genes have not yet been identified in dogs. In the current research, these genes were found in animal models by mining biological data and the scientific literature. The proteins encoded by these genes were obtained from the UniProt database for dogs and humans. Because not all homologous proteins perform similar functions, the proteins were compared in terms of protein families, domains, biological processes, molecular functions, and cellular location of metabolic pathways using the InterPro, KEGG, QuickGO and PSORT databases. The results showed that some of these proteins have the same function in the rat or mouse, dog, and human. It is anticipated that the proteins encoded by these genes may contribute to learning and memory in dogs. Then, the expression pattern of the identified genes was investigated in the dog hippocampus using existing information in GEO Profiles. The results showed that the BDNF, TAC1 and CCK genes are expressed in the dog hippocampus; therefore, these genes could be strong candidates associated with learning and memory in dogs. Subsequently, because of the importance of promoter regions in gene function, the promoter regions of these genes were investigated. Promoter analysis indicated that the HNF-4 site of the BDNF gene and the transcription start site of the CCK gene are exposed to methylation. Phylogenetic analysis of the protein sequences showed high similarity for each of these three genes among the studied species. The dN/dS ratios for the BDNF, TAC1 and CCK genes indicate purifying selection during the evolution of these genes.
Transcriptome Analysis and Differential Gene Expression on the Testis of Orange Mud Crab, Scylla olivacea, during Sexual Maturation

PubMed Central

Waiho, Khor; Fazhan, Hanafiah; Shahreza, Md Sheriff; Moh, Julia Hwei Zhong; Noorbaiduri, Shaibani; Wong, Li Lian; Sinnasamy, Saranya

2017-01-01

Adequate genetic information is essential for sustainable crustacean fisheries and aquaculture management. The commercially important orange mud crab, Scylla olivacea, is prevalent in the Southeast Asia region and is highly sought after. Although it is a suitable aquaculture candidate, full domestication of this species is hampered by the lack of knowledge about the sexual maturation process and the molecular mechanisms behind it, especially in males. To date, whole-genome data have yet to be reported for S. olivacea. The transcriptome data published previously on this species focus primarily on females and the role of the central nervous system in reproductive development. De novo transcriptome sequencing of the testes of S. olivacea at immature, maturing and mature stages was performed. A total of approximately 144 million high-quality reads were generated and de novo assembled into 160,569 transcripts with a total length of 142.2 Mb. Approximately 15–23% of the total assembled transcripts were annotated when compared to public protein sequence databases (i.e. UniProt database, Interpro database, Pfam database and Drosophila melanogaster protein database), and categorised with Gene Ontology (GO) terms. A total of 156,181 high-quality single-nucleotide polymorphisms (SNPs) were mined from the transcriptome data of the present study. Transcriptome comparison among the testes of different maturation stages revealed one gene (a beta crystallin-like gene) with the most significant differential expression: up-regulated in the immature stage and down-regulated in the maturing and mature stages. This was further validated by qRT-PCR. In conclusion, a comprehensive transcriptome of the testis of orange mud crabs at different maturation stages was obtained. This report provides an invaluable resource for enhancing our understanding of this species’ genome structure and biology, as expressed and controlled by their gonads. PMID:28135340
Gene Composer: database software for protein construct design, codon engineering, and gene synthesis

PubMed Central

Lorimer, Don; Raymond, Amy; Walchli, John; Mixon, Mark; Barrow, Adrienne; Wallace, Ellen; Grice, Rena; Burgin, Alex; Stewart, Lance

2009-01-01

Background To improve efficiency in high throughput protein structure determination, we have developed a database software package, Gene Composer, which facilitates the information-rich design of protein constructs and their codon engineered synthetic gene sequences. With its modular workflow design and numerous graphical user interfaces, Gene Composer enables researchers to perform all common bio-informatics steps used in modern structure guided protein engineering and synthetic gene engineering. Results An interactive Alignment Viewer allows the researcher to simultaneously visualize sequence conservation in the context of known protein secondary structure, ligand contacts, water contacts, crystal contacts, B-factors, solvent accessible area, residue property type and several other useful property views. The Construct Design Module enables the facile design of novel protein constructs with altered N- and C-termini, internal insertions or deletions, point mutations, and desired affinity tags. The modifications can be combined and permuted into multiple protein constructs, and then virtually cloned in silico into defined expression vectors. The Gene Design Module uses a protein-to-gene algorithm that automates the back-translation of a protein amino acid sequence into a codon engineered nucleic acid gene sequence according to a selected codon usage table with minimal codon usage threshold, defined G:C% content, and desired sequence features achieved through synonymous codon selection that is optimized for the intended expression system. The gene-to-oligo algorithm of the Gene Design Module plans out all of the required overlapping oligonucleotides and mutagenic primers needed to synthesize the desired gene constructs by PCR, and for physically cloning them into selected vectors by the most popular subcloning strategies. 
Conclusion We present a complete description of Gene Composer functionality, and an efficient PCR-based synthetic gene assembly procedure with mis-match specific endonuclease error correction in combination with PIPE cloning. In a sister manuscript we present data on how Gene Composer designed genes and protein constructs can result in improved protein production for structural studies. PMID:19383142
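The protein-to-gene back-translation step described in this abstract can be illustrated with a small sketch. It is not Gene Composer's algorithm: as a simplifying assumption it greedily picks the most frequent codon that meets a minimum usage threshold and then reports the resulting G:C fraction. The codon usage values in the table are made up for illustration and cover only a few residues.

```python
# Hypothetical codon usage fragment: residue -> {codon: relative frequency}.
# The frequencies are illustrative values, not a real organism's table.
CODON_USAGE = {
    "M": {"ATG": 1.00},
    "K": {"AAA": 0.70, "AAG": 0.30},
    "L": {"CTG": 0.50, "TTA": 0.20, "TTG": 0.10, "CTT": 0.10, "CTC": 0.05, "CTA": 0.05},
    "E": {"GAA": 0.70, "GAG": 0.30},
}


def back_translate(protein, usage, min_usage=0.10):
    """Back-translate a protein by choosing, per residue, the most frequent
    codon whose relative usage is at least `min_usage`."""
    codons = []
    for residue in protein:
        if residue not in usage:
            raise ValueError(f"no codon usage entry for residue {residue!r}")
        allowed = {c: f for c, f in usage[residue].items() if f >= min_usage}
        if not allowed:
            raise ValueError(f"no codon for {residue!r} meets the usage threshold")
        codons.append(max(allowed, key=allowed.get))
    return "".join(codons)


def gc_content(dna):
    """Fraction of G and C bases in a DNA sequence."""
    return sum(base in "GC" for base in dna) / len(dna)


gene = back_translate("MKLE", CODON_USAGE)
print(gene, round(gc_content(gene), 3))
```

A fuller design would, as the abstract indicates, also steer synonymous codon choice toward a target G:C content and desired sequence features; here the G:C fraction is only measured after the fact.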
Gene composer: database software for protein construct design, codon engineering, and gene synthesis.

PubMed

Lorimer, Don; Raymond, Amy; Walchli, John; Mixon, Mark; Barrow, Adrienne; Wallace, Ellen; Grice, Rena; Burgin, Alex; Stewart, Lance

2009-04-21

To improve efficiency in high throughput protein structure determination, we have developed a database software package, Gene Composer, which facilitates the information-rich design of protein constructs and their codon engineered synthetic gene sequences. With its modular workflow design and numerous graphical user interfaces, Gene Composer enables researchers to perform all common bio-informatics steps used in modern structure guided protein engineering and synthetic gene engineering. An interactive Alignment Viewer allows the researcher to simultaneously visualize sequence conservation in the context of known protein secondary structure, ligand contacts, water contacts, crystal contacts, B-factors, solvent accessible area, residue property type and several other useful property views. The Construct Design Module enables the facile design of novel protein constructs with altered N- and C-termini, internal insertions or deletions, point mutations, and desired affinity tags. The modifications can be combined and permuted into multiple protein constructs, and then virtually cloned in silico into defined expression vectors. The Gene Design Module uses a protein-to-gene algorithm that automates the back-translation of a protein amino acid sequence into a codon engineered nucleic acid gene sequence according to a selected codon usage table with minimal codon usage threshold, defined G:C% content, and desired sequence features achieved through synonymous codon selection that is optimized for the intended expression system. The gene-to-oligo algorithm of the Gene Design Module plans out all of the required overlapping oligonucleotides and mutagenic primers needed to synthesize the desired gene constructs by PCR, and for physically cloning them into selected vectors by the most popular subcloning strategies. 
We present a complete description of Gene Composer functionality, and an efficient PCR-based synthetic gene assembly procedure with mis-match specific endonuclease error correction in combination with PIPE cloning. In a sister manuscript we present data on how Gene Composer designed genes and protein constructs can result in improved protein production for structural studies.
Liverome: a curated database of liver cancer-related gene signatures with self-contained context information.

PubMed

Lee, Langho; Wang, Kai; Li, Gang; Xie, Zhi; Wang, Yuli; Xu, Jiangchun; Sun, Shaoxian; Pocalyko, David; Bhak, Jong; Kim, Chulhong; Lee, Kee-Ho; Jang, Ye Jin; Yeom, Young Il; Yoo, Hyang-Sook; Hwang, Seungwoo

2011-11-30

Hepatocellular carcinoma (HCC) is the fifth most common cancer worldwide. A number of molecular profiling studies have investigated the changes in gene and protein expression that are associated with various clinicopathological characteristics of HCC and generated a wealth of scattered information, usually in the form of gene signature tables. A database of the published HCC gene signatures would be useful to liver cancer researchers seeking to retrieve existing differential expression information on a candidate gene and to make comparisons between signatures for prioritization of common genes. A challenge in constructing such a database is that a direct import of the signatures as they appear in articles would lead to a loss or ambiguity of the context information that is essential for a correct biological interpretation of a gene's expression change. This challenge arises because the designation of compared sample groups is most often abbreviated, ad hoc, or even missing from published signature tables. Without manual curation, the context information becomes lost, leading to uninformative database contents. Although several databases of gene signatures are available, none of them contains signatures in an informative form or shows comprehensive coverage of liver cancer. Thus we constructed Liverome, a curated database of liver cancer-related gene signatures with self-contained context information. Liverome's data coverage is more than three times larger than that of any other signature database, consisting of 143 signatures taken from 98 HCC studies, mostly microarray and proteome, and involving 6,927 genes. The signatures were post-processed into an informative and uniform representation and annotated with an itemized summary so that all context information is unambiguously self-contained within the database. The signatures were further informatively named and meaningfully organized according to ten functional categories for guided browsing. Its web interface enables straightforward retrieval of known differential expression information on a query gene and comparison of signatures to prioritize common genes. The utility of Liverome-collected data is shown by case studies in which useful biological insights on HCC are produced. The Liverome database provides a comprehensive collection of well-curated HCC gene signatures and straightforward interfaces for gene search and signature comparison. Liverome is available at http://liverome.kobic.re.kr.
[Locus HS.633957 expression in human gastrointestinal tract and tumors].

PubMed

Polev, D E; Krukovskaia, L L; Kozlov, A P

2011-01-01

Human locus Hs.633957 corresponds to its namesake cluster in the UniGene database (http://www.ncbi.nlm.nih.gov/unigene). It is located on chromosome 7 and is 3.7 kb in size. It does not appear to encode a protein, and its function has not been identified. According to bioinformatic evidence, its expression is tumor-specific. PCR assays on cDNA samples from different intact human tissues detected slight expression in liver, heart, embryonic brain, and kidney, as well as in a wide spectrum of tumors. This work characterizes locus Hs.633957 expression in different parts of the human gastrointestinal tract and in tumors.
Protein Drug Targets of Lavandula angustifolia on treatment of Rat Alzheimer's Disease

PubMed Central

Zali, Hakimeh; Zamanian-Azodi, Mona; Rezaei Tavirani, Mostafa; Akbar-zadeh Baghban, Alireza

2015-01-01

Different treatment strategies for Alzheimer's disease (AD) are being studied for treating or slowing the progression of AD. Many pharmaceutically important regulation systems operate through proteins as drug targets. Here, we investigate the drug target proteins in beta-amyloid (Aβ)-injected rat hippocampus treated with Lavandula angustifolia (LA) by proteomics techniques. A previously reported study showed that lavender extract (LE) improves spatial performance in an AD animal model by diminishing Aβ production in the histopathology of the hippocampus, so in this study neuroprotective proteins expressed in Aβ-injected rats treated with LE were scrutinized. Rats were divided into three groups: normal, Aβ-injected, and Aβ-injected treated with LE. Protein expression profiles of hippocampal tissue were determined by two-dimensional electrophoresis (2DE), and dysregulated proteins such as Snca, NF-L, Hspa5, Prdx2, Apoa1, and Atp5a1 were identified by MALDI-TOF/TOF. KEGG pathway and gene ontology (GO) categories were obtained by searching DAVID Bioinformatics Resources. All detected protein spots were used to determine predicted interactions with other proteins in the STRING online database. Different isoforms of an important protein, Snca, which exhibits neuroprotective effects through anti-apoptotic properties, were expressed. NF-L is involved in the maintenance of neuronal caliber. Hspa5, like Prdx2, acts as an anti-apoptotic protein, and Prdx2 is also involved in neurotrophic effects. Apoa1 has anti-inflammatory activity, and Atp5a1 produces ATP from ADP. To sum up, these proteins, as potential drug targets, were expressed in the hippocampus in response to effective components in LA and may have therapeutic properties for the treatment of AD and other neurodegenerative diseases. PMID:25561935
Separation and characterization of needle and xylem maritime pine proteins.

PubMed

Costa, P; Pionneau, C; Bauw, G; Dubos, C; Bahrmann, N; Kremer, A; Frigerio, J M; Plomion, C

1999-01-01

Two-dimensional gel electrophoresis (2-DE) and image analysis are currently used for proteome analysis in maritime pine (Pinus pinaster Ait.). This study presents a database of expressed proteins extracted from needles and xylem, two important tissues for growth and wood formation. Electrophoresis was carried out by isoelectric focusing (IEF) in the first dimension and sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) in the second. Silver staining made it possible to detect an average of 900 and 600 spots on 2-DE gels from needles and xylem, respectively. A total of 28 xylem and 35 needle proteins were characterized by internal peptide microsequencing. Out of these 63 proteins, 57 (90%) could be identified based on amino acid similarity with known proteins, of which 24 (42%) have already been described in conifers. Overall comparison of both tissues indicated that 29% and 36% of the spots were specific to xylem and needles, respectively, while the other spots were of identical molecular weight and isoelectric point. The homology of spot location in 2-DE patterns was further validated by sequence analysis of proteins present in both tissues. A proteomic database of maritime pine is accessible on the internet (http://www.pierroton.inra.fr/genetics/2D/).
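The percentages quoted in this record can be checked with a few lines of arithmetic. The sketch below uses only the counts stated in the abstract; the variable and function names are illustrative and do not come from the paper.

```python
# Counts taken from the abstract above; names are illustrative only.
xylem_sequenced = 28      # xylem proteins characterized by microsequencing
needle_sequenced = 35     # needle proteins characterized by microsequencing
identified = 57           # proteins identified by similarity to known proteins
known_in_conifers = 24    # identified proteins already described in conifers

total_sequenced = xylem_sequenced + needle_sequenced


def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


identified_pct = percent(identified, total_sequenced)
conifer_pct = percent(known_in_conifers, identified)

print(total_sequenced, identified_pct, conifer_pct)
```

Note that the 42% figure is taken relative to the 57 identified proteins, not to all 63 sequenced proteins.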
Proteomic characterization of hempseed (Cannabis sativa L.).

PubMed

Aiello, Gilda; Fasoli, Elisa; Boschin, Giovanna; Lammi, Carmen; Zanoni, Chiara; Citterio, Attilio; Arnoldi, Anna

2016-09-16

This paper presents an investigation on hempseed proteome. The experimental approach, based on combinatorial peptide ligand libraries (CPLLs), SDS-PAGE separation, nLC-ESI-MS/MS identification, and database search, permitted identifying in total 181 expressed proteins. This very large number of identifications was achieved by searching in two databases: Cannabis sativa L. (56 gene products identified) and Arabidopsis thaliana (125 gene products identified). By performing a protein-protein association network analysis using the STRING software, it was possible to build the first interactomic map of all detected proteins, characterized by 137 nodes and 410 interactions. Finally, a Gene Ontology analysis of the identified species permitted to classify their molecular functions: the great majority is involved in the seed metabolic processes (41%), responses to stimulus (8%), and biological process (7%). Hempseed is an underexploited non-legume protein-rich seed. Although its protein is well known for its digestibility, essential amino acid composition, and useful techno-functional properties, a comprehensive proteome characterization is still lacking. The objective of this work was to fill this knowledge gap and provide information useful for a better exploitation of this seed in different food products. Copyright © 2016 Elsevier B.V. All rights reserved.

Effective Medicinal Plant in Cancer Treatment, Part 2: Review Study

PubMed Central

Kooti, Wesam; Servatyari, Karo; Behzadifar, Masoud; Asadi-Samani, Majid; Sadeghi, Fatemeh; Nouri, Bijan; Zare Marzouni, Hadi

2017-01-01

Cancer is the second cause of death after cardiovascular diseases. With due attention to rapid progress in the phytochemical study of plants, they are becoming popular because of their anticancer effects. The aim of this study was to investigate the effective medicinal plants in the treatment of cancer and study their mechanism of action. In order to gather information the keywords “traditional medicine,” “plant compounds,” “medicinal plant,” “medicinal herb,” “toxicity,” “anticancer effect,” “cell line,” and “treatment” were searched in international databases such as ScienceDirect, PubMed, and Scopus and national databases such as Magiran, Sid, and Iranmedex, and a total of 228 articles were collected. In this phase, 49 nonrelevant articles were excluded. Enhancement of P53 protein expression, reducing the expression of proteins P27, P21, NFκB expression and induction of apoptosis, inhibition of the PI3K/Akt pathway, and reduction of the level of acid phosphatase and lipid peroxidation are the most effective mechanisms of herbal plants that can inhibit cell cycle and proliferation. Common treatments such as radiotherapy and chemotherapy can cause some complications. According to results of this study, herbal extracts have antioxidant compounds that can induce apoptosis and inhibit cell proliferation by the investigated mechanisms. PMID:28359161
Histone Code Modulation by Oncogenic PWWP-Domain Protein in Breast Cancers

DTIC Science & Technology

2014-08-01

discs, the Drosophila melanogaster homologue of human retinoblastoma binding protein 2. Genetics 2000; 156: 645-663. [10] Zeng J, Ge Z, Wang L...in breast cancer patients (7-11). Earlier, we used genomic analysis of copy number and gene expression to perform a detailed analysis of the 8p11-12...from the 8p11-12 region (14). Very recently, we searched the Cancer Genome Atlas database that contains 744 breast invasive carcinomas. We found DNA or
Rice DB: an Oryza Information Portal linking annotation, subcellular location, function, expression, regulation, and evolutionary information for rice and Arabidopsis

PubMed Central

Narsai, Reena; Devenish, James; Castleden, Ian; Narsai, Kabir; Xu, Lin; Shou, Huixia; Whelan, James

2013-01-01

Omics research in Oryza sativa (rice) relies on the use of multiple databases to obtain different types of information to define gene function. We present Rice DB, an Oryza information portal that is a functional genomics database, linking gene loci to comprehensive annotations, expression data and the subcellular location of encoded proteins. Rice DB has been designed to integrate the direct comparison of rice with Arabidopsis (Arabidopsis thaliana), based on orthology or ‘expressology’, thus using and combining available information from two pre-eminent plant models. To establish Rice DB, gene identifiers (more than 40 types) and annotations from a variety of sources were compiled, functional information based on large-scale and individual studies was manually collated, hundreds of microarrays were analysed to generate expression annotations, and the occurrences of potential functional regulatory motifs in promoter regions were calculated. A range of computational subcellular localization predictions were also run for all putative proteins encoded in the rice genome, and experimentally confirmed protein localizations have been collated, curated and linked to functional studies in rice. A single search box allows anything from gene identifiers (for rice and/or Arabidopsis), motif sequences, subcellular location, to keyword searches to be entered, with the capability of Boolean searches (such as AND/OR). To demonstrate the utility of Rice DB, several examples are presented including a rice mitochondrial proteome, which draws on a variety of sources for subcellular location data within Rice DB. Comparisons of subcellular location, functional annotations, as well as transcript expression in parallel with Arabidopsis reveals examples of conservation between rice and Arabidopsis, using Rice DB (http://ricedb.plantenergy.uwa.edu.au). PMID:24147765
Rice DB: an Oryza Information Portal linking annotation, subcellular location, function, expression, regulation, and evolutionary information for rice and Arabidopsis.

PubMed

Narsai, Reena; Devenish, James; Castleden, Ian; Narsai, Kabir; Xu, Lin; Shou, Huixia; Whelan, James

2013-12-01

Omics research in Oryza sativa (rice) relies on the use of multiple databases to obtain different types of information to define gene function. We present Rice DB, an Oryza information portal that is a functional genomics database, linking gene loci to comprehensive annotations, expression data and the subcellular location of encoded proteins. Rice DB has been designed to integrate the direct comparison of rice with Arabidopsis (Arabidopsis thaliana), based on orthology or 'expressology', thus using and combining available information from two pre-eminent plant models. To establish Rice DB, gene identifiers (more than 40 types) and annotations from a variety of sources were compiled, functional information based on large-scale and individual studies was manually collated, hundreds of microarrays were analysed to generate expression annotations, and the occurrences of potential functional regulatory motifs in promoter regions were calculated. A range of computational subcellular localization predictions were also run for all putative proteins encoded in the rice genome, and experimentally confirmed protein localizations have been collated, curated and linked to functional studies in rice. A single search box allows anything from gene identifiers (for rice and/or Arabidopsis), motif sequences, subcellular location, to keyword searches to be entered, with the capability of Boolean searches (such as AND/OR). To demonstrate the utility of Rice DB, several examples are presented including a rice mitochondrial proteome, which draws on a variety of sources for subcellular location data within Rice DB. Comparisons of subcellular location, functional annotations, as well as transcript expression in parallel with Arabidopsis reveals examples of conservation between rice and Arabidopsis, using Rice DB (http://ricedb.plantenergy.uwa.edu.au). © 2013 The Authors The Plant Journal © 2013 John Wiley & Sons Ltd.
Cloning and expression of N-glycosylation-related glucosidase from Glaciozyma antarctica

NASA Astrophysics Data System (ADS)

Yajit, Noor Liana Mat; Kamaruddin, Shazilah; Hashim, Noor Haza Fazlin; Bakar, Farah Diba Abu; Murad, Abd. Munir Abd.; Mahadi, Nor Muhammad; Mackeen, Mukram Mohamed

2016-11-01

The need for functional oligosaccharides in various fields is ever growing. The enzymatic approach for synthesis of oligosaccharides is advantageous over traditional chemical synthesis because of the regio- and stereoselectivity that can be achieved without the need for protection chemistry. In this study, the α-glucosidase I protein sequence from Saccharomyces cerevisiae (UniProt database) was compared using Basic Local Alignment Search Tool (BLAST) with the Glaciozyma antarctica genome database. Results showed 33% identity and an E-value of 1 × 10⁻¹²⁵ for α-glucosidase I. The gene was amplified, cloned into the pPICZα C vector and used to transform Pichia pastoris X-33 cells. Soluble expression of α-glucosidase I (~91 kDa) was achieved at 28 °C with 1.0% methanol.
Buffalo cervico-vaginal fluid proteomics with special reference to estrous cycle: heat shock protein (HSP)-70 appears to be an estrus indicator.

PubMed

Muthukumar, Subramanian; Rajkumar, Ramalingam; Karthikeyan, Kandasamy; Liao, Chen-Chung; Singh, Dheer; Akbarsha, Mohammad Abdulkader; Archunan, Govindaraju

2014-05-01

Cervico-vaginal fluid (CVF) plays significant roles in coitus, sperm transport, and implantation. It is believed to be a good noninvasive biomarker for various diagnostic purposes. In this study, a comprehensive proteomic analysis of buffalo CVF was performed during the estrous cycle in order to document the protein expressions, utilizing SDS-PAGE, mass spectrometry, and immunoblot. The main objective was to screen the CVF of buffalo for one or more estrus-specific proteins. A total of 416 proteins were identified in the CVF of both estrus and diestrus phases. Out of these proteins, 68 estrus-specific proteins have been extensively reviewed in the protein database. The major physiological functions of estrus CVF proteins appeared to be stress response, immune response, and metabolic. Eventually, the expression level of heat shock protein-70 in the CVF during the estrus phase, as revealed in SDS-PAGE analysis, was higher than during diestrus. The identity of the protein was confirmed by immunoblot analysis as heat shock protein-70. The findings provide a potential lead for the evaluation of these proteins for estrus detection in buffalo because CVF biomarker detection is a noninvasive technique. The mass spectrometric data of identified proteins have been deposited at the ProteomeXchange with the identifier PXD000620.
De novo analysis of the Nilaparvata lugens (Stål) antenna transcriptome and expression patterns of olfactory genes.

PubMed

Zhou, Shuang-Shuang; Sun, Ze; Ma, Weihua; Chen, Wei; Wang, Man-Qun

2014-03-01

We sequenced the antenna transcriptome of the brown planthopper (BPH), Nilaparvata lugens (Stål), a global rice pest, and performed transcriptome analysis on BPH antenna. We obtained about 40 million 90 bp reads that were assembled into 75,874 unigenes with a mean size of 456 bp. Among the antenna transcripts, 32,856 (43%) showed significant similarity (E-value < 1e-5) to known proteins in the NCBI database. Gene ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were used to classify functions of BPH antenna genes. We identified 10 odorant-binding proteins (OBPs), including 7 previously unidentified, and 11 chemosensory proteins (CSPs), including two new members. The expression profiles of 4 OBPs and 2 CSPs were determined by q-PCR for antenna, abdomen, leg and wing of insects of different age, gender, and mating status including two BPH adult wing-morphology types. NlugCSP10 and 4 OBPs appeared to be antenna-specific because they were highly and differentially expressed in male and female antennae. NlugCSP11 was expressed ubiquitously, with particularly high expression in wings. The transcript levels of several olfactory genes depended on adult wing form, age, gender, and mating status, although no clear expression patterns were determined. Copyright © 2013 Elsevier Inc. All rights reserved.
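The similarity criterion in this record (E-value below 1e-5) amounts to a simple threshold filter over best database hits. The sketch below illustrates that idea on made-up data: the unigene identifiers and E-values are hypothetical and are not taken from the study, and the authors' actual pipeline is not described in the abstract. Only the cutoff and the two totals come from the record.

```python
E_VALUE_CUTOFF = 1e-5  # threshold stated in the abstract

# Hypothetical (unigene_id, best_hit_e_value) pairs, for illustration only.
hits = [
    ("unigene_1", 3e-40),
    ("unigene_2", 0.2),
    ("unigene_3", 1e-5),   # equal to the cutoff, so not strictly below it
    ("unigene_4", 8e-7),
]


def significant(pairs, cutoff=E_VALUE_CUTOFF):
    """Keep the ids whose best-hit E-value is strictly below the cutoff."""
    return [uid for uid, e_value in pairs if e_value < cutoff]


kept = significant(hits)

# Share of unigenes with a significant hit, from the totals in the abstract.
annotated_pct = round(100 * 32856 / 75874)

print(kept, annotated_pct)
```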
Differential display cloning of a novel rat cDNA (RNB6) that shows high expression in the neonatal brain revealed a member of Ena/VASP family.

PubMed

Ohta, S; Mineta, T; Kimoto, M; Tabuchi, K

1997-08-18

We have used the differential display method to identify genes that control the neural cell development in CNS. Screening of the differential display bands that showed higher expression at neonate than at adult age enabled us to identify a novel rat cDNA (RNB6) coding for a protein of 393 amino acid residues. Database search revealed this gene as a rat homologue of the murine EVL, a member of Ena/VASP protein family that is implicated to be involved in the control of cell motility through actin filament assembly by their GP5 motifs. Although the precise characterization of EVL was not reported, our Northern blot and immunoblot analyses demonstrated that RNB6 expression in the brain gradually increases during embryonic development, reaches maximum at postnatal day 1 and decreases thereafter. Studies of tissue distribution revealed the expression of RNB6 not only in the brain but also in the spleen, thymus and testis. Histochemical analyses showed that RNB6 protein is mainly expressed in neurons and may be expressed in neural fibers. Our analyses suggest that RNB6 is critically involved in the development of CNS probably through the control of neural cell motility and/or including neuronal fiber extension.
Evolutionarily conserved ELOVL4 gene expression in the vertebrate retina.

PubMed

Lagali, Pamela S; Liu, Jiafan; Ambasudhan, Rajesh; Kakuk, Laura E; Bernstein, Steven L; Seigel, Gail M; Wong, Paul W; Ayyagari, Radha

2003-07-01

The gene elongation of very long chain fatty acids-4 (ELOVL4) has been shown to underlie phenotypically heterogeneous forms of autosomal dominant macular degeneration. In this study, the extent of evolutionary conservation and the existence and localization of retinal expression of this gene was investigated across a wide variety of species. Southern blot analysis of genomic DNA and bioinformatic analysis using the human ELOVL4 cDNA and protein sequences, respectively, were performed to identify species in which ELOVL4 orthologues and/or homologues are present. Retinal RNA and protein extracts derived from different species were assessed by Northern hybridization and immunoblot techniques to assess evolutionary conservation of gene expression. Immunohistochemical analysis of tissue sections prepared from various mammalian retinas was performed to determine the distribution of ELOVL4 and homologous proteins within specific retinal cell layers. The existence of ELOVL4 sequence orthologues and homologues was confirmed by both Southern blot analysis and in silico searches of protein sequence databases. Phylogenetic analysis places ELOVL4 among a large family of known and putative fatty acid elongase proteins. Northern blot analysis revealed the presence of multiple transcripts corresponding to ELOVL4 homologues expressed in the retina of several different mammalian species. Conserved proteins were also detected among retinal extracts of different mammals and were found to localize predominantly to the photoreceptor cell layer within retinal tissue preparations. The ELOVL4 gene is highly conserved throughout evolution and is expressed in the photoreceptor cells of the retina in a variety of different species, which suggests that it plays a critical role in retinal cell biology.
The salivary secretome of the tsetse fly Glossina pallidipes (Diptera: Glossinidae) infected by salivary gland hypertrophy virus.

PubMed

Kariithi, Henry M; Ince, Ikbal A; Boeren, Sjef; Abd-Alla, Adly M M; Parker, Andrew G; Aksoy, Serap; Vlak, Just M; Oers, Monique M van

2011-11-01

The competence of the tsetse fly Glossina pallidipes (Diptera; Glossinidae) to acquire salivary gland hypertrophy virus (SGHV), to support virus replication and successfully transmit the virus depends on complex interactions between Glossina and SGHV macromolecules. Critical requisites to SGHV transmission are its replication and secretion of mature virions into the fly's salivary gland (SG) lumen. However, secretion of host proteins is of equal importance for successful transmission and requires cataloging of G. pallidipes secretome proteins from hypertrophied and non-hypertrophied SGs. After electrophoretic profiling and in-gel trypsin digestion, saliva proteins were analyzed by nano-LC-MS/MS. MaxQuant/Andromeda search of the MS data against the non-redundant (nr) GenBank database and a G. morsitans morsitans SG EST database yielded a total of 521 hits, 31 of which were SGHV-encoded. On a false discovery rate limit of 1% and detection threshold of at least 2 unique peptides per protein, the analysis resulted in 292 Glossina and 25 SGHV MS-supported proteins. When annotated by the Blast2GO suite, at least one gene ontology (GO) term could be assigned to 89.9% (285/317) of the detected proteins. Five (∼1.8%) Glossina and three (∼12%) SGHV proteins remained without a predicted function after blast searches against the nr database. Sixty-five of the 292 detected Glossina proteins contained an N-terminal signal/secretion peptide sequence. Eight of the SGHV proteins were predicted to be non-structural (NS), and fourteen are known structural (VP) proteins. SGHV alters the protein expression pattern in Glossina. The G. pallidipes SG secretome encompasses a spectrum of proteins that may be required during the SGHV infection cycle. These detected proteins have putative interactions with at least 21 of the 25 SGHV-encoded proteins. Our findings open avenues for developing novel SGHV mitigation strategies to block SGHV infections in tsetse production facilities such as using SGHV-specific antibodies and phage display-selected gut epithelia-binding peptides.
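As a quick consistency check on the counts in this record, the sketch below recomputes the total number of MS-supported proteins and the GO-annotated share from the figures given in the abstract. The variable names are illustrative assumptions.

```python
# Counts taken from the abstract above; names are illustrative only.
glossina_proteins = 292   # MS-supported Glossina proteins
sghv_proteins = 25        # MS-supported SGHV proteins
go_annotated = 285        # proteins with at least one GO term

total_proteins = glossina_proteins + sghv_proteins

# Share of detected proteins with a GO term, to one decimal place.
go_annotated_pct = round(100 * go_annotated / total_proteins, 1)

# Share of SGHV proteins left without a predicted function (3 of 25).
sghv_unassigned_pct = round(100 * 3 / sghv_proteins)

print(total_proteins, go_annotated_pct, sghv_unassigned_pct)
```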
Proteome Analysis of Liver Cells Expressing a Full-Length Hepatitis C Virus (HCV) Replicon and Biopsy Specimens of Posttransplantation Liver from HCV-Infected Patients

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jacobs, Jon M.; Diamond, Deborah L.; Chan, Eric Y.

2005-06-01

The development of a reproducible model system for the study of Hepatitis C virus (HCV) infection has the potential to significantly enhance the study of virus-host interactions and provide future direction for modeling the pathogenesis of HCV. While there are studies describing global gene expression changes associated with HCV infection, changes in the proteome have not been characterized. We report the first large scale proteome analysis of the highly permissive Huh-7.5 cell line containing a full length HCV replicon. We detected > 4,400 proteins in this cell line, including HCV replicon proteins, using multidimensional liquid chromatographic (LC) separations coupled to mass spectrometry (MS). The set of Huh-7.5 proteins confidently identified is, to our knowledge, the most comprehensive yet reported for a human cell line. Consistent with the literature, a comparison of Huh-7.5 cells (+) and (-) the HCV replicon identified expression changes of proteins involved in lipid metabolism. We extended these analyses to liver biopsy material from HCV-infected patients where > 1,500 proteins were detected from 2 μg protein lysate using the Huh-7.5 protein database and the accurate mass and time (AMT) tag strategy. These findings demonstrate the utility of multidimensional proteome analysis of the HCV replicon model system for assisting the determination of proteins/pathways affected by HCV infection. Our ability to extend these analyses to the highly complex proteome of small liver biopsies with limiting protein yields offers the unique opportunity to begin evaluating the clinical significance of protein expression changes associated with HCV infection.
Endocytosis and membrane receptor internalization: implication of F-BAR protein Carom.

PubMed

Xu, Yanjie; Xia, Jixiang; Liu, Suxuan; Stein, Sam; Ramon, Cueto; Xi, Hang; Wang, Luqiao; Xiong, Xinyu; Zhang, Lixiao; He, Dingwen; Yang, William; Zhao, Xianxian; Cheng, Xiaoshu; Yang, Xiaofeng; Wang, Hong

2017-03-01

Endocytosis is a cellular process mostly responsible for membrane receptor internalization. Cell membrane receptors bind to their ligands and form a complex which can be internalized. We previously proposed that F-BAR protein initiates membrane curvature and mediates endocytosis via its binding partners. However, F-BAR protein partners involved in membrane receptor endocytosis and the regulatory mechanism remain unknown. In this study, we established database mining strategies to explore mechanisms underlying receptor-related endocytosis. We identified 34 endocytic membrane receptors and 10 regulating proteins in clathrin-dependent endocytosis (CDE), a major process of membrane receptor internalization. We found that F-BAR protein FCHSD2 (Carom) may facilitate endocytosis via 9 endocytic partners. Carom is highly expressed, along with highly expressed endocytic membrane receptors and partners, in endothelial cells and macrophages. We established 3 models of Carom-receptor complexes and their intracellular trafficking based on protein interaction and subcellular localization. We conclude that Carom may mediate receptor endocytosis and transport endocytic receptors to the cytoplasm for receptor signaling and lysosome/proteasome degradation, or to the nucleus for RNA processing, gene transcription and DNA repair.
A systematic proteomic analysis of NaCl-stressed germinating maize seeds.

PubMed

Meng, Ling-Bo; Chen, Yi-Bo; Lu, Tian-Cong; Wang, Yue-Feng; Qian, Chun-Rong; Yu, Yang; Ge, Xuan-Liang; Li, Xiao-Hui; Wang, Bai-Chen

2014-05-01

Salt (NaCl) is a common physiological stressor of plants. To better understand how germinating seeds respond to salt stress, we examined the changes that occurred in the proteome of maize seeds during NaCl-treated germination. Phenotypically, salt concentrations less than 0.2 M appear to delay germination, while higher concentrations disrupt development completely, leading to seed death. The identities of 96 proteins with expression levels altered by NaCl-incubation were established using 2-DE-MALDI-TOF-MS and 2-DE-MALDI-TOF-MS/MS. Of these 96 proteins, 79 were altered greater than twofold when incubated with a 0.2 M salt solution, while 51 were altered when incubated with a 0.1 M salt solution. According to their functional annotations in the Swiss-Prot protein-sequence databases, these proteins are mainly involved in seed storage, energy metabolism, stress response, and protein metabolism. Notably, the expression of proteins that respond to abscisic acid signals increased in response to salt stress. The results of this study provide important clues as to how NaCl stresses the physiology of germinating maize seeds.
NCBI GEO: archive for functional genomics data sets--10 years on.

PubMed

Barrett, Tanya; Troup, Dennis B; Wilhite, Stephen E; Ledoux, Pierre; Evangelista, Carlos; Kim, Irene F; Tomashevsky, Maxim; Marshall, Kimberly A; Phillippy, Katherine H; Sherman, Patti M; Muertter, Rolf N; Holko, Michelle; Ayanbule, Oluwabukunmi; Yefanov, Andrey; Soboleva, Alexandra

2011-01-01

A decade ago, the Gene Expression Omnibus (GEO) database was established at the National Center for Biotechnology Information (NCBI). The original objective of GEO was to serve as a public repository for high-throughput gene expression data generated mostly by microarray technology. However, the research community quickly applied microarrays to non-gene-expression studies, including examination of genome copy number variation and genome-wide profiling of DNA-binding proteins. Because the GEO database was designed with a flexible structure, it was possible to quickly adapt the repository to store these data types. More recently, as the microarray community switches to next-generation sequencing technologies, GEO has again adapted to host these data sets. Today, GEO stores over 20,000 microarray- and sequence-based functional genomics studies, and continues to handle the majority of direct high-throughput data submissions from the research community. Multiple mechanisms are provided to help users effectively search, browse, download and visualize the data at the level of individual genes or entire studies. This paper describes recent database enhancements, including new search and data representation tools, as well as a brief review of how the community uses GEO data. GEO is freely accessible at http://www.ncbi.nlm.nih.gov/geo/.
A large scale analysis of cDNA in Arabidopsis thaliana: generation of 12,028 non-redundant expressed sequence tags from normalized and size-selected cDNA libraries.

PubMed

Asamizu, E; Nakamura, Y; Sato, S; Tabata, S

2000-06-30

For comprehensive analysis of genes expressed in the model dicotyledonous plant, Arabidopsis thaliana, expressed sequence tags (ESTs) were accumulated. Normalized and size-selected cDNA libraries were constructed from aboveground organs, flower buds, roots, green siliques and liquid-cultured seedlings, respectively, and a total of 14,026 5'-end ESTs and 39,207 3'-end ESTs were obtained. The 3'-end ESTs could be clustered into 12,028 non-redundant groups. Similarity search of the non-redundant ESTs against the public non-redundant protein database indicated that 4816 groups show similarity to genes of known function, 1864 to hypothetical genes, and the remaining 5348 are novel sequences. Gene coverage by the non-redundant ESTs was analyzed using the annotated genomic sequences of approximately 10 Mb on chromosomes 3 and 5. A total of 923 regions were hit by at least one EST, among which only 499 regions were hit by the ESTs deposited in the public database. The result indicates that the EST source generated in this project complements the EST data in the public database and facilitates new gene discovery.
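The EST group counts in this record add up to the stated total, and the region counts give the number of genomic regions covered only by the new ESTs. The sketch below recomputes both from the abstract's figures; the dictionary keys and variable names are illustrative assumptions.

```python
# Counts taken from the abstract above; names are illustrative only.
est_groups = {
    "known_function": 4816,
    "hypothetical": 1864,
    "novel": 5348,
}
regions_hit_by_any_est = 923
regions_hit_by_public_ests = 499

total_groups = sum(est_groups.values())

# Percentage of non-redundant groups in each similarity category.
shares = {
    name: round(100 * count / total_groups, 1)
    for name, count in est_groups.items()
}

# Regions hit by at least one EST but not by publicly deposited ESTs.
regions_new_to_this_project = regions_hit_by_any_est - regions_hit_by_public_ests

print(total_groups, shares, regions_new_to_this_project)
```

The difference of 424 regions is what supports the abstract's claim that the new EST source complements the public data.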
MicroRNA-137 dysregulation predisposes to osteoporotic fracture by impeding ALP activity and expression via suppression of leucine-rich repeat-containing G-protein-coupled receptor 4 expression.

PubMed

Liu, Xiangjun; Xu, Xiaohui

2018-08-01

Osteoporosis is defined as a loss of bone mass and deterioration of its architecture resulting in bone weakness, which becomes prone to fracture. The objective of this study was to investigate the molecular mechanism by which miR-137 can reduce the risk of fracture in patients with osteoporosis. An online miRNA database and a luciferase reporter assay system were used to confirm that leucine-rich repeat-containing G-protein-coupled receptor 4 (LGR4) was the target of miR-137. Real-time PCR and western blot analysis were used to study miR-137 mRNA, the expression of LGR4 mRNA and protein among different groups or cells transfected with a scrambled miRNA control, miR-137 mimic, LGR4 siRNA and miR-137 inhibitor. Expression of miR-137 was upregulated to higher levels in cells isolated from osteoporosis patients with fracture than in those without fracture. The 'seed sequence' was found to be located within the 3' untranslated region (3'-UTR) of LGR4 mRNA by searching an online miRNA database. Luciferase reporter assay was performed to confirm that LGR4 is a direct target gene of miR-137 with a potential binding site in the 3'UTR of LGR4. Luciferase activity of cells transfected with wild-type LGR4 3'UTR was much lower than that of the cells transfected with mutant LGR4 3'UTR. The results of real-time PCR and immunohistochemistry experiments demonstrated that the expression levels of LGR4 mRNA and protein were much higher in osteoporosis patients with fracture than osteoporosis patients without fracture. We found that the expression levels of LGR4 mRNA and protein were clearly upregulated following transfection with miR-137 inhibitor, while noticeably downregulated following transfection with miR-137 mimic when compared with the scramble control. Furthermore, the expression of ALP mRNA and ALP activity in bone tissue were much higher in osteoporosis patients with fracture than those without fracture. 
In conclusion, these data prove that the overexpression of miR-137 was associated with an altered risk of fracture in patients with osteoporosis, and can be used as a biomarker for the prediction of risk of fracture in osteoporosis.
RiceFOX: a database of Arabidopsis mutant lines overexpressing rice full-length cDNA that contains a wide range of trait information to facilitate analysis of gene function.

PubMed

Sakurai, Tetsuya; Kondou, Youichi; Akiyama, Kenji; Kurotani, Atsushi; Higuchi, Mieko; Ichikawa, Takanari; Kuroda, Hirofumi; Kusano, Miyako; Mori, Masaki; Saitou, Tsutomu; Sakakibara, Hitoshi; Sugano, Shoji; Suzuki, Makoto; Takahashi, Hideki; Takahashi, Shinya; Takatsuji, Hiroshi; Yokotani, Naoki; Yoshizumi, Takeshi; Saito, Kazuki; Shinozaki, Kazuo; Oda, Kenji; Hirochika, Hirohiko; Matsui, Minami

2011-02-01

Identification of gene function is important not only for basic research but also for applied science, especially with regard to improvements in crop production. For rapid and efficient elucidation of useful traits, we developed a system named FOX hunting (Full-length cDNA Over-eXpressor gene hunting) using full-length cDNAs (fl-cDNAs). A heterologous expression approach provides a solution for the high-throughput characterization of gene functions in agricultural plant species. Since fl-cDNAs contain all the information of functional mRNAs and proteins, we introduced rice fl-cDNAs into Arabidopsis plants for systematic gain-of-function mutation. We generated >30,000 independent Arabidopsis transgenic lines expressing rice fl-cDNAs (rice FOX Arabidopsis mutant lines). These rice FOX Arabidopsis lines were screened systematically for various criteria such as morphology, photosynthesis, UV resistance, element composition, plant hormone profile, metabolite profile/fingerprinting, bacterial resistance, and heat and salt tolerance. The information obtained from these screenings was compiled into a database named 'RiceFOX'. This database contains around 18,000 records of rice FOX Arabidopsis lines and allows users to search against all the observed results, ranging from morphological to invisible traits. The number of searchable items is approximately 100; moreover, the rice FOX Arabidopsis lines can be searched by rice and Arabidopsis gene/protein identifiers, sequence similarity to the introduced rice fl-cDNA and traits. The RiceFOX database is available at http://ricefox.psc.riken.jp/.
RiceFOX: A Database of Arabidopsis Mutant Lines Overexpressing Rice Full-Length cDNA that Contains a Wide Range of Trait Information to Facilitate Analysis of Gene Function

PubMed Central

Sakurai, Tetsuya; Kondou, Youichi; Akiyama, Kenji; Kurotani, Atsushi; Higuchi, Mieko; Ichikawa, Takanari; Kuroda, Hirofumi; Kusano, Miyako; Mori, Masaki; Saitou, Tsutomu; Sakakibara, Hitoshi; Sugano, Shoji; Suzuki, Makoto; Takahashi, Hideki; Takahashi, Shinya; Takatsuji, Hiroshi; Yokotani, Naoki; Yoshizumi, Takeshi; Saito, Kazuki; Shinozaki, Kazuo; Oda, Kenji; Hirochika, Hirohiko; Matsui, Minami

2011-01-01

Identification of gene function is important not only for basic research but also for applied science, especially with regard to improvements in crop production. For rapid and efficient elucidation of useful traits, we developed a system named FOX hunting (Full-length cDNA Over-eXpressor gene hunting) using full-length cDNAs (fl-cDNAs). A heterologous expression approach provides a solution for the high-throughput characterization of gene functions in agricultural plant species. Since fl-cDNAs contain all the information of functional mRNAs and proteins, we introduced rice fl-cDNAs into Arabidopsis plants for systematic gain-of-function mutation. We generated >30,000 independent Arabidopsis transgenic lines expressing rice fl-cDNAs (rice FOX Arabidopsis mutant lines). These rice FOX Arabidopsis lines were screened systematically for various criteria such as morphology, photosynthesis, UV resistance, element composition, plant hormone profile, metabolite profile/fingerprinting, bacterial resistance, and heat and salt tolerance. The information obtained from these screenings was compiled into a database named ‘RiceFOX’. This database contains around 18,000 records of rice FOX Arabidopsis lines and allows users to search against all the observed results, ranging from morphological to invisible traits. The number of searchable items is approximately 100; moreover, the rice FOX Arabidopsis lines can be searched by rice and Arabidopsis gene/protein identifiers, sequence similarity to the introduced rice fl-cDNA and traits. The RiceFOX database is available at http://ricefox.psc.riken.jp/. PMID:21186176
PodNet, a protein-protein interaction network of the podocyte.

PubMed

Warsow, Gregor; Endlich, Nicole; Schordan, Eric; Schordan, Sandra; Chilukoti, Ravi K; Homuth, Georg; Moeller, Marcus J; Fuellen, Georg; Endlich, Karlhans

2013-07-01

Interactions between proteins crucially determine cellular structure and function. Differential analysis of the interactome may help elucidate molecular mechanisms during disease development; however, this analysis necessitates mapping of expression data on protein-protein interaction networks. These networks do not exist for the podocyte; therefore, we built PodNet, a literature-based mouse podocyte network in Cytoscape format. Using database protein-protein interactions, we expanded PodNet to XPodNet with enhanced connectivity. In order to test the performance of XPodNet in differential interactome analysis, we examined podocyte developmental differentiation and the effect of cell culture. Transcriptomes of podocytes in 10 different states were mapped on XPodNet and analyzed with the Cytoscape plugin ExprEssence, based on the law of mass action. Interactions between slit diaphragm proteins are most significantly upregulated during podocyte development and most significantly downregulated in culture. On the other hand, our analysis revealed that interactions lost during podocyte differentiation are not regained in culture, suggesting a loss rather than a reversal of differentiation for podocytes in culture. Thus, we have developed PodNet as a valuable tool for differential interactome analysis in podocytes, and we have identified established and unexplored regulated interactions in developing and cultured podocytes.
Differential expression of three galaxin-related genes during settlement and metamorphosis in the scleractinian coral Acropora millepora

PubMed Central

Reyes-Bermudez, Alejandro; Lin, Zhiyi; Hayward, David C; Miller, David J; Ball, Eldon E

2009-01-01

Background The coral skeleton consists of CaCO3 deposited upon an organic matrix primarily as aragonite. Currently galaxin, from Galaxea fascicularis, is the only soluble protein component of the organic matrix that has been characterized from a coral. Three genes related to galaxin were identified in the coral Acropora millepora. Results One of the Acropora genes (Amgalaxin) encodes a clear galaxin ortholog, while the others (Amgalaxin-like 1 and Amgalaxin-like 2) encode larger and more divergent proteins. All three proteins are predicted to be extracellular and share common structural features, most notably the presence of repetitive motifs containing dicysteine residues. In situ hybridization reveals distinct, but partially overlapping, spatial expression of the genes in patterns consistent with distinct roles in calcification. Both of the Amgalaxin-like genes are expressed exclusively in the early stages of calcification, while Amgalaxin continues to be expressed in the adult, consistent with the situation in the coral Galaxea. Conclusion Comparisons with molluscs suggest functional convergence in the two groups; lustrin A/pearlin proteins may be the mollusc counterparts of galaxin, whereas the galaxin-like proteins combine characteristics of two distinct proteins involved in mollusc calcification. Database searches indicate that, although sequences with high similarity to the galaxins are restricted to the Scleractinia, more divergent members of this protein family are present in other cnidarians and some other metazoans. We suggest that ancestral galaxins may have been secondarily recruited to roles in calcification in the Triassic, when the Scleractinia first appeared. Understanding the evolution of the broader galaxin family will require wider sampling and expression analysis in a range of cnidarians and other animals. PMID:19638240

Computational tools for exploring sequence databases as a resource for antimicrobial peptides.

PubMed

Porto, W F; Pires, A S; Franco, O L

Data mining has been recognized by many researchers as a hot topic in different areas. In the post-genomic era, the growing number of sequences deposited in databases has been the reason why these databases have become a resource for novel biological information. In recent years, the identification of antimicrobial peptides (AMPs) in databases has gained attention. The identification of unannotated AMPs has shed some light on the distribution and evolution of AMPs and, in some cases, indicated suitable candidates for developing novel antimicrobial agents. The data mining process has been performed mainly by local alignments and/or regular expressions. Nevertheless, for the identification of distant homologous sequences, other techniques such as antimicrobial activity prediction and molecular modelling are required. In this context, this review addresses the tools and techniques, and also their limitations, for mining AMPs from databases. These methods could be helpful not only for the development of novel AMPs, but also for other kinds of proteins, at a higher level of structural genomics. Moreover, solving the problem of unannotated proteins could bring immeasurable benefits to society, especially in the case of AMPs, which could be helpful for developing novel antimicrobial agents and combating resistant bacteria. Copyright © 2017 Elsevier Inc. All rights reserved.
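The regular-expression style of mining mentioned in this abstract can be sketched as follows. This is a minimal illustration only: the cysteine-spacing motif and the two toy sequences are invented for the example and are not a validated AMP signature or real database entries.

```python
import re

# Hypothetical motif, for illustration only: three cysteines with
# loosely constrained spacing. Not a validated AMP signature.
MOTIF = re.compile(r"C.{2,4}C.{3,9}C")


def scan(records):
    """Return (identifier, start, matched_text) for the first hit in each sequence."""
    hits = []
    for name, seq in records.items():
        match = MOTIF.search(seq)
        if match:
            hits.append((name, match.start(), match.group()))
    return hits


# Toy sequences (made up), standing in for unannotated database entries.
records = {
    "seq1": "MKTLACAAGCKKLLMNCQ",
    "seq2": "MSTNPKPQRKTKRNTNRR",
}
hits = scan(records)
print(hits)
```

A pattern like this only finds sequences that keep the motif intact, which is why the abstract notes that distant homologs call for other techniques such as activity prediction or molecular modelling.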
Human Mitochondrial Protein Database

National Institute of Standards and Technology Data Gateway

SRD 131 Human Mitochondrial Protein Database (Web, free access). The Human Mitochondrial Protein Database (HMPDb) provides comprehensive data on mitochondrial and human nuclear encoded proteins involved in mitochondrial biogenesis and function. This database consolidates information from SwissProt, LocusLink, Protein Data Bank (PDB), GenBank, Genome Database (GDB), Online Mendelian Inheritance in Man (OMIM), Human Mitochondrial Genome Database (mtDB), MITOMAP, Neuromuscular Disease Center and Human 2-D PAGE Databases. This database is intended as a tool to aid in studying not only the mitochondrion but also its associated diseases.
Combined Mass Spectrometry Imaging and Top-down Microproteomics Reveals Evidence of a Hidden Proteome in Ovarian Cancer.

PubMed

Delcourt, Vivian; Franck, Julien; Leblanc, Eric; Narducci, Fabrice; Robin, Yves-Marie; Gimeno, Jean-Pascal; Quanico, Jusal; Wisztorski, Maxence; Kobeissy, Firas; Jacques, Jean-François; Roucou, Xavier; Salzet, Michel; Fournier, Isabelle

2017-07-01

Recently, it was demonstrated that proteins can be translated from alternative open reading frames (altORFs), increasing the size of the actual proteome. Top-down mass spectrometry-based proteomics allows the identification of intact proteins containing post-translational modifications (PTMs) as well as truncated forms translated from reference ORFs or altORFs. Top-down tissue microproteomics was applied on benign, tumor and necrotic-fibrotic regions of serous ovarian cancer biopsies, identifying proteins exhibiting region-specific cellular localization and PTMs. The regions of interest (ROIs) were determined by MALDI mass spectrometry imaging and spatial segmentation. Analysis with a customized protein sequence database containing reference and alternative proteins (altprots) identified 15 altprots, including alternative G protein nucleolar 1 (AltGNL1) found in the tumor, and translated from an altORF nested within the GNL1 canonical coding sequence. Co-expression of GNL1 and altGNL1 was validated by transfection in HEK293 and HeLa cells with an expression plasmid containing a GNL1-FLAG (V5) construct. Western blot and immunofluorescence experiments confirmed constitutive co-expression of altGNL1-V5 with GNL1-FLAG. Taken together, our approach provides means to evaluate protein changes in the case of serous ovarian cancer, allowing the detection of potential markers that have never been considered. Copyright © 2017 The Author(s). Published by Elsevier B.V. All rights reserved.
Possible roles of the transcription factor Nrf1 (NFE2L1) in neural homeostasis by regulating the gene expression of deubiquitinating enzymes.

PubMed

Taniguchi, Hiroaki; Okamuro, Shota; Koji, Misaki; Waku, Tsuyoshi; Kubo, Kaori; Hatanaka, Atsushi; Sun, Yimeng; Chowdhury, A M Masudul Azad; Fukamizu, Akiyoshi; Kobayashi, Akira

2017-02-26

The transcription factor Nrf1 (NFE2L1) maintains protein homeostasis (proteostasis) by regulating the gene expression of proteasome subunits in response to proteasome inhibition. The deletion of the Nrf1 gene in neural stem/progenitor cells causes severe neurodegeneration due to the accumulation of ubiquitinated proteins in Purkinje cells and motor neurons (Nrf1 NKO mice). However, the molecular mechanisms governing this neurodegenerative process remain unclear. We demonstrate herein that the loss of Nrf1 leads to the reduced gene expression of the deubiquitinating enzymes (DUBs) but not proteasome subunits in Nrf1 NKO mice between P7 and P18. First, we show that K48-linked polyubiquitinated proteins accumulate in Nrf1-deficient Purkinje cells and cerebral cortex neurons. Nevertheless, loss of Nrf1 does not alter the expression and proteolytic activity of proteasome. A significantly reduced expression of deubiquitinating enzymes was also demonstrated in Nrf1-deficient cerebellar tissue using microarray analysis. The genome database further reveals species-conserved ARE, a Nrf1 recognition element, in the regulatory region of certain DUB genes. Furthermore, we show that Nrf1 can activate Usp9x gene expression related to neurodegeneration. Altogether these findings suggest that neurodegeneration in Nrf1 NKO mice may stem from the dysfunction of the ubiquitin-mediated regulation of neuronal proteins. Copyright © 2017 Elsevier Inc. All rights reserved.
The Protein Information Resource: an integrated public resource of functional annotation of proteins

PubMed Central

Wu, Cathy H.; Huang, Hongzhan; Arminski, Leslie; Castro-Alvear, Jorge; Chen, Yongxing; Hu, Zhang-Zhi; Ledley, Robert S.; Lewis, Kali C.; Mewes, Hans-Werner; Orcutt, Bruce C.; Suzek, Baris E.; Tsugita, Akira; Vinayaka, C. R.; Yeh, Lai-Su L.; Zhang, Jian; Barker, Winona C.

2002-01-01

The Protein Information Resource (PIR) serves as an integrated public resource of functional annotation of protein data to support genomic/proteomic research and scientific discovery. The PIR, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the PIR-International Protein Sequence Database (PSD), the major annotated protein sequence database in the public domain, containing about 250 000 proteins. To improve protein annotation and the coverage of experimentally validated data, a bibliography submission system is developed for scientists to submit, categorize and retrieve literature information. Comprehensive protein information is available from iProClass, which includes family classification at the superfamily, domain and motif levels, structural and functional features of proteins, as well as cross-references to over 40 biological databases. To provide timely and comprehensive protein data with source attribution, we have introduced a non-redundant reference protein database, PIR-NREF. The database consists of about 800 000 proteins collected from PIR-PSD, SWISS-PROT, TrEMBL, GenPept, RefSeq and PDB, with composite protein names and literature data. To promote database interoperability, we provide XML data distribution and open database schema, and adopt common ontologies. The PIR web site (http://pir.georgetown.edu/) features data mining and sequence analysis tools for information retrieval and functional identification of proteins based on both sequence and annotation information. The PIR databases and other files are also available by FTP (ftp://nbrfa.georgetown.edu/pir_databases). PMID:11752247
ContaMiner and ContaBase: a webserver and database for early identification of unwantedly crystallized protein contaminants

PubMed Central

Hungler, Arnaud; Momin, Afaque; Diederichs, Kay; Arold, Stefan T.

2016-01-01

Solving the phase problem in protein X-ray crystallography relies heavily on the identity of the crystallized protein, especially when molecular replacement (MR) methods are used. Yet, it is not uncommon that a contaminant crystallizes instead of the protein of interest. Such contaminants may be proteins from the expression host organism, protein fusion tags or proteins added during the purification steps. Many contaminants co-purify easily, crystallize and give good diffraction data. Identification of contaminant crystals may take time, since the presence of the contaminant is unexpected and its identity unknown. A webserver (ContaMiner) and a contaminant database (ContaBase) have been established, to allow fast MR-based screening of crystallographic data against currently 62 known contaminants. The web-based ContaMiner (available at http://strube.cbrc.kaust.edu.sa/contaminer/) currently produces results in 5 min to 4 h. The program is also available in a github repository and can be installed locally. ContaMiner enables screening of novel crystals at synchrotron beamlines, and it would be valuable as a routine safety check for ‘crystallization and preliminary X-ray analysis’ publications. Thus, in addition to potentially saving X-ray crystallographers much time and effort, ContaMiner might considerably lower the risk of publishing erroneous data. PMID:27980519
A Brief Review of RNA–Protein Interaction Database Resources

PubMed Central

Yi, Ying; Zhao, Yue; Huang, Yan; Wang, Dong

2017-01-01

RNA–Protein interactions play critical roles in various biological processes. By collecting and analyzing the RNA–Protein interactions and binding sites from experiments and predictions, RNA–Protein interaction databases have become an essential resource for the exploration of the transcriptional and post-transcriptional regulatory network. Here, we briefly review several widely used RNA–Protein interaction database resources developed in recent years to provide a guide to these databases. The content and major functions of each database are presented. These brief descriptions help users quickly choose the database containing the information they are interested in. In short, these RNA–Protein interaction database resources are continually updated, but the current state shows the efforts to identify and analyze the large number of RNA–Protein interactions. PMID:29657278
SZDB: A Database for Schizophrenia Genetic Research

PubMed Central

Wu, Yong; Yao, Yong-Gang

2017-01-01

Schizophrenia (SZ) is a debilitating brain disorder with a complex genetic architecture. Genetic studies, especially recent genome-wide association studies (GWAS), have identified multiple variants (loci) conferring risk to SZ. However, how to efficiently extract meaningful biological information from bulk genetic findings of SZ remains a major challenge. There is a pressing need to integrate multiple layers of data from various sources, e.g., genetic findings from GWAS, copy number variations (CNVs), association and linkage studies, gene expression, protein–protein interaction (PPI), co-expression, expression quantitative trait loci (eQTL), and Encyclopedia of DNA Elements (ENCODE) data, to provide a comprehensive resource to facilitate the translation of genetic findings into SZ molecular diagnosis and mechanism study. Here we developed the SZDB database (http://www.szdb.org/), a comprehensive resource for SZ research. SZ genetic data, gene expression data, network-based data, brain eQTL data, and SNP function annotation information were systematically extracted, curated and deposited in SZDB. In-depth analyses and systematic integration were performed to identify top prioritized SZ genes and enriched pathways. We further showed that genes implicated in SZ are highly co-expressed in human brain and that proteins encoded by the prioritized SZ risk genes interact significantly. The user-friendly SZDB provides high-confidence candidate variants and genes for further functional characterization. More importantly, SZDB provides convenient online tools for data searching and browsing, data integration, and customized data analyses. PMID:27451428
The PBII gene of the human salivary proline-rich protein P-B produces another protein, Q504X8, with an opiorphin homolog, QRGPR.

PubMed

Saitoh, Eiichi; Sega, Takuya; Imai, Akane; Isemura, Satoko; Kato, Tetsuo; Ochiai, Akihito; Taniguchi, Masayuki

2018-04-01

The NCBI gene database and human-transcriptome database for alternative splicing were used to determine the expression of mRNAs for P-B (SMR3B) and a variant form of P-B. The translational product from the former mRNA was identified as the protein named P-B, whereas that from the latter has not yet been elucidated. In the present study, we investigated the expression of P-B and its variant form at the protein level. To identify the variant protein of P-B, (1) cationic proteins with a higher isoelectric point in human pooled whole saliva were purified by two-dimensional liquid chromatography; (2) the peptide fragments generated by in-solution digestion of all proteins with trypsin were separated and analyzed by MALDI-TOF-MS; and (3) the presence or absence of P-B in individual saliva was examined by 15% SDS-PAGE. The peptide sequences (I 37 PPPYSCTPNMNNCSR 52 , C 53 HHHHKRHHYPCNYCFCYPK 72 , R 59 HHYPCNYCFCYPK 72 and H 60 HYPCNYCFCYPK 72 ) present in the variant protein of P-B were identified. The peptide sequence (G 6 PYPPGPLAPPQPFGPGFVPPPPPPPYGPGR 36 ) in P-B (or the variant) and sequence (I 37 PPPPPAPYGPGIFPPPPPQP 57 ) in P-B were identified. The sum of the sequences identified indicated a 91.23% sequence identity for P-B and 79.76% for the variant. P-B was present in the saliva of some individuals but absent from that of others. The variant protein is produced by excising a non-canonical intron (CC-AC pair) from the 3'-noncoding sequence of the PBII gene. Both P-B and the variant are subject to proteolysis in the oral cavity. Copyright © 2018 Elsevier Ltd. All rights reserved.
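The two percentages in this abstract can be reproduced as the fraction of residues covered by the identified peptides. The sketch below takes the residue ranges from the abstract; the protein lengths of 57 (P-B) and 84 (variant) are not stated there and are assumed here because they are the lengths consistent with the reported figures.

```python
def coverage(peptides, protein_length):
    """Percent of residues covered by at least one peptide (1-based, inclusive ranges)."""
    covered = set()
    for start, end in peptides:
        covered.update(range(start, end + 1))
    return 100.0 * len(covered) / protein_length


# Residue ranges as listed in the abstract.
# ASSUMPTION: lengths 57 and 84 are inferred, not given in the source.
pb = coverage([(6, 36), (37, 57)], 57)
variant = coverage([(6, 36), (37, 52), (53, 72), (59, 72), (60, 72)], 84)

print(round(pb, 2), round(variant, 2))
```

Overlapping peptides (for example 53-72, 59-72 and 60-72) are counted once because residues are collected in a set.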
Plant Reactome: a resource for plant pathways and comparative analysis.

PubMed

Naithani, Sushma; Preece, Justin; D'Eustachio, Peter; Gupta, Parul; Amarasinghe, Vindhya; Dharmawardhana, Palitha D; Wu, Guanming; Fabregat, Antonio; Elser, Justin L; Weiser, Joel; Keays, Maria; Fuentes, Alfonso Munoz-Pomer; Petryszak, Robert; Stein, Lincoln D; Ware, Doreen; Jaiswal, Pankaj

2017-01-04

Plant Reactome (http://plantreactome.gramene.org/) is a free, open-source, curated plant pathway database portal, provided as part of the Gramene project. The database provides intuitive bioinformatics tools for the visualization, analysis and interpretation of pathway knowledge to support genome annotation, genome analysis, modeling, systems biology, basic research and education. Plant Reactome employs the structural framework of a plant cell to show metabolic, transport, genetic, developmental and signaling pathways. We manually curate molecular details of pathways in these domains for reference species Oryza sativa (rice) supported by published literature and annotation of well-characterized genes. Two hundred twenty-two rice pathways, 1025 reactions associated with 1173 proteins, 907 small molecules and 256 literature references have been curated to date. These reference annotations were used to project pathways for 62 model, crop and evolutionarily significant plant species based on gene homology. Database users can search and browse various components of the database, visualize curated baseline expression of pathway-associated genes provided by the Expression Atlas and upload and analyze their Omics datasets. The database also offers data access via Application Programming Interfaces (APIs) and in various standardized pathway formats, such as SBML and BioPAX. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Meta-All: a system for managing metabolic pathway information.

PubMed

Weise, Stephan; Grosse, Ivo; Klukas, Christian; Koschützki, Dirk; Scholz, Uwe; Schreiber, Falk; Junker, Björn H

2006-10-23

Many attempts are being made to understand biological subjects at a systems level. A major resource for these approaches is biological databases, storing manifold information about DNA, RNA and protein sequences including their functional and structural motifs, molecular markers, mRNA expression levels, metabolite concentrations, protein-protein interactions, phenotypic traits or taxonomic relationships. The use of these databases is often hampered by the fact that they are designed for special application areas and thus lack universality. Databases on metabolic pathways, which provide an increasingly important foundation for many analyses of biochemical processes at a systems level, are no exception to the rule. Data stored in central databases such as KEGG, BRENDA or SABIO-RK is often limited to read-only access. If experimentalists want to store their own data, possibly still under investigation, there are two possibilities. They can either develop their own information system for managing their own data, which is very time-consuming and costly, or they can try to store their data in existing systems, which is often restricted. Hence, an out-of-the-box information system for managing metabolic pathway data is needed. We have designed META-ALL, an information system that allows the management of metabolic pathways, including reaction kinetics, detailed locations, environmental factors and taxonomic information. Data can be stored together with quality tags and in different parallel versions. META-ALL uses Oracle DBMS and Oracle Application Express. We provide the META-ALL information system for download and use. In this paper, we describe the database structure and give information about the tools for submitting and accessing the data. As a first application of META-ALL, we show how the information contained in a detailed kinetic model can be stored and accessed. META-ALL is a system for managing information about metabolic pathways. It facilitates the handling of pathway-related data and is designed to help biochemists and molecular biologists in their daily research. It is available on the Web at http://bic-gh.de/meta-all and can be downloaded free of charge and installed locally.
Meta-All: a system for managing metabolic pathway information

PubMed Central

Weise, Stephan; Grosse, Ivo; Klukas, Christian; Koschützki, Dirk; Scholz, Uwe; Schreiber, Falk; Junker, Björn H

2006-01-01

Background Many attempts are being made to understand biological subjects at a systems level. A major resource for these approaches are biological databases, storing manifold information about DNA, RNA and protein sequences including their functional and structural motifs, molecular markers, mRNA expression levels, metabolite concentrations, protein-protein interactions, phenotypic traits or taxonomic relationships. The use of these databases is often hampered by the fact that they are designed for special application areas and thus lack universality. Databases on metabolic pathways, which provide an increasingly important foundation for many analyses of biochemical processes at a systems level, are no exception from the rule. Data stored in central databases such as KEGG, BRENDA or SABIO-RK is often limited to read-only access. If experimentalists want to store their own data, possibly still under investigation, there are two possibilities. They can either develop their own information system for managing that own data, which is very time-consuming and costly, or they can try to store their data in existing systems, which is often restricted. Hence, an out-of-the-box information system for managing metabolic pathway data is needed. Results We have designed META-ALL, an information system that allows the management of metabolic pathways, including reaction kinetics, detailed locations, environmental factors and taxonomic information. Data can be stored together with quality tags and in different parallel versions. META-ALL uses Oracle DBMS and Oracle Application Express. We provide the META-ALL information system for download and use. In this paper, we describe the database structure and give information about the tools for submitting and accessing the data. As a first application of META-ALL, we show how the information contained in a detailed kinetic model can be stored and accessed. 
Conclusion META-ALL is a system for managing information about metabolic pathways. It facilitates the handling of pathway-related data and is designed to help biochemists and molecular biologists in their daily research. It is available on the Web at http://bic-gh.de/meta-all and can be downloaded free of charge and installed locally. PMID:17059592
Identification of key genes and pathways associated with neuropathic pain in uninjured dorsal root ganglion by using bioinformatic analysis.

PubMed

Chen, Chao-Jin; Liu, De-Zhao; Yao, Wei-Feng; Gu, Yu; Huang, Fei; Hei, Zi-Qing; Li, Xiang

2017-01-01

Neuropathic pain is a complex chronic condition occurring post-nervous system damage. The transcriptional reprogramming of injured dorsal root ganglia (DRGs) drives neuropathic pain. However, few comparative analyses using high-throughput platforms have investigated uninjured DRG in neuropathic pain, and potential interactions among differentially expressed genes (DEGs) and pathways were not taken into consideration. The aim of this study was to identify changes in genes and pathways associated with neuropathic pain in uninjured L4 DRG after L5 spinal nerve ligation (SNL) by using bioinformatic analysis. The microarray profile GSE24982 was downloaded from the Gene Expression Omnibus database to identify DEGs between DRGs in SNL and sham rats. The prioritization for these DEGs was performed using the Toppgene database followed by gene ontology and pathway enrichment analyses. The relationships among DEGs from the protein interactive perspective were analyzed using protein-protein interaction (PPI) network and module analysis. Real-time polymerase chain reaction (PCR) and Western blotting were used to confirm the expression of DEGs in the rodent neuropathic pain model. A total of 206 DEGs that might play a role in neuropathic pain were identified in L4 DRG, of which 75 were upregulated and 131 were downregulated. The upregulated DEGs were enriched in biological processes related to transcription regulation and molecular functions such as DNA binding, cell cycle, and the FoxO signaling pathway. Ctnnb1 protein had the highest connectivity degrees in the PPI network. The in vivo studies also validated that mRNA and protein levels of Ctnnb1 were upregulated in both L4 and L5 DRGs. This study provides insight into the functional gene sets and pathways associated with neuropathic pain in L4 uninjured DRG after L5 SNL, which might promote our understanding of the molecular mechanisms underlying the development of neuropathic pain.
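The notion of the "highest connectivity degree" in a PPI network can be illustrated with a small degree count over an undirected edge list. This is a sketch under stated assumptions: only the name Ctnnb1 comes from the abstract, while the partner names and edges are invented toy data and do not reflect the study's network.

```python
from collections import Counter


def node_degrees(edges):
    """Count distinct interaction partners per node in an undirected network."""
    # Deduplicate (a, b) vs (b, a) and drop self-interactions.
    unique = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for pair in unique:
        for node in pair:
            degree[node] += 1
    return degree


# Toy edge list (hypothetical partners GeneA..GeneC, for illustration only).
edges = [
    ("Ctnnb1", "GeneA"),
    ("Ctnnb1", "GeneB"),
    ("GeneB", "Ctnnb1"),  # duplicate of the previous edge, reversed
    ("Ctnnb1", "GeneC"),
    ("GeneA", "GeneC"),
]
degree = node_degrees(edges)
hub, hub_degree = max(degree.items(), key=lambda item: item[1])
print(hub, hub_degree)
```

Degree is only one of several centrality measures; the abstract does not say which tool computed it, so none is assumed here.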
Proteomic analysis of PSD-93 knockout mice following the induction of ischemic cerebral injury.

PubMed

Rong, Rong; Yang, Hui; Rong, Liangqun; Wei, Xiue; Li, Qingjie; Liu, Xiaomei; Gao, Hong; Xu, Yun; Zhang, Qingxiu

2016-03-01

Postsynaptic density protein-93 (PSD-93) is enriched in the postsynaptic density and is involved in N-methyl-d-aspartate receptor (NMDAR) triggered neurotoxicity through the PSD-93/NMDAR/nNOS signaling pathway. In the present study, we found that PSD-93 deficiency reduced the infarct volume and neurological deficits induced by transient middle cerebral artery occlusion (tMCAO) in mice. To identify novel targets of PSD-93 related neurotoxicity, we applied isobaric tags for relative and absolute quantitation (iTRAQ) labeling and combined this labeling with on-line two-dimensional LC/MS/MS technology to elucidate the changes in protein expression in PSD-93 knockout mice following tMCAO. The proteomic data set consisted of 1892 proteins. Relative to the control group, expression ratios in the ischemic group of >1.5-fold or <0.66-fold were considered differential expression. A total of 104 unique proteins with differential abundance levels were identified, among which 17 proteins were selected for further validation. Gene ontology analysis using the UniProt database revealed that these differentially expressed proteins are involved in diverse functions such as synaptic transmission, neuronal neurotransmitter and ion transport, and modification of organelle membrane components. Moreover, network analysis revealed that the interacting proteins were involved in the transport of synaptic vesicles, the integrity of synaptic membranes and the activation of the ionotropic glutamate receptors NMDAR1 and NMDAR2B. Finally, RT-PCR and Western blot analysis showed that SynGAP, syntaxin-1A, protein kinase C β, and voltage-dependent L-type calcium channels were inhibited by ischemia-reperfusion. Identification of these proteins provides valuable clues to elucidate the mechanisms underlying the actions of PSD-93 in ischemia-reperfusion induced neurotoxicity. Copyright © 2015 Elsevier Inc. All rights reserved.
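The fold-change cutoffs quoted in this abstract (>1.5-fold and <0.66-fold relative to control) amount to a simple threshold rule. The sketch below applies that rule to invented ratios; the protein names and values are placeholders, not data from the study, and any statistical filtering the authors may have used is not modeled.

```python
def classify(ratio, up=1.5, down=0.66):
    """Label an ischemic/control abundance ratio using the abstract's cutoffs."""
    if ratio > up:
        return "up"
    if ratio < down:
        return "down"
    return "unchanged"


# Hypothetical iTRAQ ratios (ischemic / control), for illustration only.
ratios = {"protA": 1.8, "protB": 0.5, "protC": 1.1}
calls = {name: classify(value) for name, value in ratios.items()}
print(calls)
```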
Using co-expression analysis and stress-based screens to uncover Arabidopsis peroxisomal proteins involved in drought response

DOE PAGES

Li, Jiying; Hu, Jianping; Bassham, Diane

2015-09-14

Peroxisomes are essential organelles that house a wide array of metabolic reactions important for plant growth and development. However, our knowledge regarding the role of peroxisomal proteins in various biological processes, including plant stress response, is still incomplete. Recent proteomic studies of plant peroxisomes significantly increased the number of known peroxisomal proteins and greatly facilitated the study of peroxisomes at the systems level. The objectives of this study were to determine whether genes that encode peroxisomal proteins with related functions are co-expressed in Arabidopsis and identify peroxisomal proteins involved in stress response using in silico analysis and mutant screens. Using microarray data from online databases, we performed hierarchical clustering analysis to generate a comprehensive view of transcript level changes for Arabidopsis peroxisomal genes during development and under abiotic and biotic stress conditions. Many genes involved in the same metabolic pathways exhibited co-expression, some genes known to be involved in stress response are regulated by the corresponding stress conditions, and function of some peroxisomal proteins could be predicted based on their coexpression pattern. Since drought caused expression changes to the highest number of genes that encode peroxisomal proteins, we subjected a subset of Arabidopsis peroxisomal mutants to a drought stress assay. Mutants of the LON2 protease and the photorespiratory enzyme hydroxypyruvate reductase 1 (HPR1) showed enhanced susceptibility to drought, suggesting the involvement of peroxisomal quality control and photorespiration in drought resistance. Lastly, our study provided a global view of how genes that encode peroxisomal proteins respond to developmental and environmental cues and began to reveal additional peroxisomal proteins involved in stress response, thus opening up new avenues to investigate the role of peroxisomes in plant adaptation to environmental stresses.
The PEPR GeneChip data warehouse, and implementation of a dynamic time series query tool (SGQT) with graphical interface.

PubMed

Chen, Josephine; Zhao, Po; Massaro, Donald; Clerch, Linda B; Almon, Richard R; DuBois, Debra C; Jusko, William J; Hoffman, Eric P

2004-01-01

Publicly accessible DNA databases (genome browsers) are rapidly accelerating post-genomic research (see http://www.genome.ucsc.edu/), with integrated genomic DNA, gene structure, EST/splicing and cross-species ortholog data. DNA databases have relatively low dimensionality; the genome is a linear code that anchors all associated data. In contrast, RNA expression and protein databases need to be able to handle very high dimensional data, with time, tissue, cell type and genes, as interrelated variables. The high dimensionality of microarray expression profile data, and the lack of a standard experimental platform have complicated the development of web-accessible databases and analytical tools. We have designed and implemented a public resource of expression profile data containing 1024 human, mouse and rat Affymetrix GeneChip expression profiles, generated in the same laboratory, and subject to the same quality and procedural controls (Public Expression Profiling Resource; PEPR). Our Oracle-based PEPR data warehouse includes a novel time series query analysis tool (SGQT), enabling dynamic generation of graphs and spreadsheets showing the action of any transcript of interest over time. In this report, we demonstrate the utility of this tool using a 27 time point, in vivo muscle regeneration series. This data warehouse and associated analysis tools provide access to multidimensional microarray data through web-based interfaces, both for download of all types of raw data for independent analysis, and also for straightforward gene-based queries. Planned implementations of PEPR will include web-based remote entry of projects adhering to quality control and standard operating procedure (QC/SOP) criteria, and automated output of alternative probe set algorithms for each project (see http://microarray.cnmcresearch.org/pgadatatable.asp).
The PEPR GeneChip data warehouse, and implementation of a dynamic time series query tool (SGQT) with graphical interface

PubMed Central

Chen, Josephine; Zhao, Po; Massaro, Donald; Clerch, Linda B.; Almon, Richard R.; DuBois, Debra C.; Jusko, William J.; Hoffman, Eric P.

2004-01-01

Publicly accessible DNA databases (genome browsers) are rapidly accelerating post-genomic research (see http://www.genome.ucsc.edu/), with integrated genomic DNA, gene structure, EST/splicing and cross-species ortholog data. DNA databases have relatively low dimensionality; the genome is a linear code that anchors all associated data. In contrast, RNA expression and protein databases need to be able to handle very high dimensional data, with time, tissue, cell type and genes, as interrelated variables. The high dimensionality of microarray expression profile data, and the lack of a standard experimental platform have complicated the development of web-accessible databases and analytical tools. We have designed and implemented a public resource of expression profile data containing 1024 human, mouse and rat Affymetrix GeneChip expression profiles, generated in the same laboratory, and subject to the same quality and procedural controls (Public Expression Profiling Resource; PEPR). Our Oracle-based PEPR data warehouse includes a novel time series query analysis tool (SGQT), enabling dynamic generation of graphs and spreadsheets showing the action of any transcript of interest over time. In this report, we demonstrate the utility of this tool using a 27 time point, in vivo muscle regeneration series. This data warehouse and associated analysis tools provide access to multidimensional microarray data through web-based interfaces, both for download of all types of raw data for independent analysis, and also for straightforward gene-based queries. Planned implementations of PEPR will include web-based remote entry of projects adhering to quality control and standard operating procedure (QC/SOP) criteria, and automated output of alternative probe set algorithms for each project (see http://microarray.cnmcresearch.org/pgadatatable.asp). PMID:14681485
A database application for pre-processing, storage and comparison of mass spectra derived from patients and controls

PubMed Central

Titulaer, Mark K; Siccama, Ivar; Dekker, Lennard J; van Rijswijk, Angelique LCT; Heeren, Ron MA; Sillevis Smitt, Peter A; Luider, Theo M

2006-01-01

Background: Statistical comparison of peptide profiles in biomarker discovery requires fast, user-friendly software for high throughput data analysis. Important features are flexibility in changing input variables and statistical analysis of peptides that are differentially expressed between patient and control groups. In addition, integration of the mass spectrometry data with the results of other experiments, such as microarray analysis, and information from other databases requires a central storage of the profile matrix, where protein IDs can be added to peptide masses of interest. Results: A new database application is presented, to detect and identify significantly differentially expressed peptides in peptide profiles obtained from body fluids of patient and control groups. The presented modular software is capable of central storage of mass spectra and results in fast analysis. The software architecture consists of 4 pillars, 1) a Graphical User Interface written in Java, 2) a MySQL database, which contains all metadata, such as experiment numbers and sample codes, 3) an FTP (File Transfer Protocol) server to store all raw mass spectrometry files and processed data, and 4) the software package R, which is used for modular statistical calculations, such as the Wilcoxon-Mann-Whitney rank sum test. Statistical analysis by the Wilcoxon-Mann-Whitney test in R demonstrates that peptide profiles of two patient groups, 1) breast cancer patients with leptomeningeal metastases and 2) prostate cancer patients in end stage disease, can be distinguished from those of control groups. Conclusion: The database application is capable of distinguishing patient Matrix Assisted Laser Desorption Ionization (MALDI-TOF) peptide profiles from those of control groups using large datasets. The modular architecture of the application makes it possible to adapt the application to also handle large data sets from MS/MS and Fourier Transform Ion Cyclotron Resonance (FT-ICR) mass spectrometry experiments. It is expected that the higher resolution and mass accuracy of FT-ICR mass spectrometry prevents the clustering of peaks of different peptides and allows the identification of differentially expressed proteins from the peptide profiles. PMID:16953879
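The abstract above names the Wilcoxon-Mann-Whitney rank sum test, run in R, as the statistical core of the application. As a rough illustration of what that test computes, here is a minimal pure-Python sketch. It is not the authors' R code: the function names are hypothetical, and the p-value uses a plain normal approximation without tie or continuity correction, so it is only indicative for small samples.

```python
import math


def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, approximate two-sided p) for two independent samples."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)
    # Normal approximation (assumption: no tie or continuity correction).
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, p


# Hypothetical peak intensities for one peptide mass (not data from the study).
patients = [12.1, 15.3, 14.8, 16.0, 13.9]
controls = [9.8, 10.4, 11.0, 10.9, 12.0]
print(mann_whitney_u(patients, controls))
```

In a profile matrix such a test would be applied once per peptide mass, which is why a correction for multiple testing is usually considered as well.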
A database application for pre-processing, storage and comparison of mass spectra derived from patients and controls.

PubMed

Titulaer, Mark K; Siccama, Ivar; Dekker, Lennard J; van Rijswijk, Angelique L C T; Heeren, Ron M A; Sillevis Smitt, Peter A; Luider, Theo M

2006-09-05

Statistical comparison of peptide profiles in biomarker discovery requires fast, user-friendly software for high throughput data analysis. Important features are flexibility in changing input variables and statistical analysis of peptides that are differentially expressed between patient and control groups. In addition, integration of the mass spectrometry data with the results of other experiments, such as microarray analysis, and information from other databases requires a central storage of the profile matrix, where protein IDs can be added to peptide masses of interest. A new database application is presented, to detect and identify significantly differentially expressed peptides in peptide profiles obtained from body fluids of patient and control groups. The presented modular software is capable of central storage of mass spectra and results in fast analysis. The software architecture consists of 4 pillars, 1) a Graphical User Interface written in Java, 2) a MySQL database, which contains all metadata, such as experiment numbers and sample codes, 3) an FTP (File Transfer Protocol) server to store all raw mass spectrometry files and processed data, and 4) the software package R, which is used for modular statistical calculations, such as the Wilcoxon-Mann-Whitney rank sum test. Statistical analysis by the Wilcoxon-Mann-Whitney test in R demonstrates that peptide profiles of two patient groups, 1) breast cancer patients with leptomeningeal metastases and 2) prostate cancer patients in end stage disease, can be distinguished from those of control groups. The database application is capable of distinguishing patient Matrix Assisted Laser Desorption Ionization (MALDI-TOF) peptide profiles from those of control groups using large datasets. The modular architecture of the application makes it possible to adapt the application to also handle large data sets from MS/MS and Fourier Transform Ion Cyclotron Resonance (FT-ICR) mass spectrometry experiments. It is expected that the higher resolution and mass accuracy of FT-ICR mass spectrometry prevents the clustering of peaks of different peptides and allows the identification of differentially expressed proteins from the peptide profiles.
Improving links between literature and biological data with text mining: a case study with GEO, PDB and MEDLINE.

PubMed

Névéol, Aurélie; Wilbur, W John; Lu, Zhiyong

2012-01-01

High-throughput experiments and bioinformatics techniques are creating an exploding volume of data that are becoming overwhelming to keep track of for biologists and researchers who need to access, analyze and process existing data. Much of the available data are being deposited in specialized databases, such as the Gene Expression Omnibus (GEO) for microarrays or the Protein Data Bank (PDB) for protein structures and coordinates. Data sets are also being described by their authors in publications archived in literature databases such as MEDLINE and PubMed Central. Currently, the curation of links between biological databases and the literature mainly relies on manual labour, which makes it a time-consuming and daunting task. Herein, we analysed the current state of link curation between GEO, PDB and MEDLINE. We found that the link curation is heterogeneous depending on the sources and databases involved, and that overlap between sources is low, <50% for PDB and GEO. Furthermore, we showed that text-mining tools can automatically provide valuable evidence to help curators broaden the scope of articles and database entries that they review. As a result, we made recommendations to improve the coverage of curated links, as well as the consistency of information available from different databases while maintaining high-quality curation. Database URLs: http://www.ncbi.nlm.nih.gov/PubMed, http://www.ncbi.nlm.nih.gov/geo/, http://www.rcsb.org/pdb/

Improving links between literature and biological data with text mining: a case study with GEO, PDB and MEDLINE

PubMed Central

Névéol, Aurélie; Wilbur, W. John; Lu, Zhiyong

2012-01-01

High-throughput experiments and bioinformatics techniques are creating an exploding volume of data that are becoming overwhelming to keep track of for biologists and researchers who need to access, analyze and process existing data. Much of the available data are being deposited in specialized databases, such as the Gene Expression Omnibus (GEO) for microarrays or the Protein Data Bank (PDB) for protein structures and coordinates. Data sets are also being described by their authors in publications archived in literature databases such as MEDLINE and PubMed Central. Currently, the curation of links between biological databases and the literature mainly relies on manual labour, which makes it a time-consuming and daunting task. Herein, we analysed the current state of link curation between GEO, PDB and MEDLINE. We found that the link curation is heterogeneous depending on the sources and databases involved, and that overlap between sources is low, <50% for PDB and GEO. Furthermore, we showed that text-mining tools can automatically provide valuable evidence to help curators broaden the scope of articles and database entries that they review. As a result, we made recommendations to improve the coverage of curated links, as well as the consistency of information available from different databases while maintaining high-quality curation. Database URLs: http://www.ncbi.nlm.nih.gov/PubMed, http://www.ncbi.nlm.nih.gov/geo/, http://www.rcsb.org/pdb/ PMID:22685160
A Potential Role for Drosophila Mucins in Development and Physiology

PubMed Central

Syed, Zulfeqhar A.; Härd, Torleif; Uv, Anne; van Dijk-Härd, Iris F.

2008-01-01

Vital vertebrate organs are protected from the external environment by a barrier that to a large extent consists of mucins. These proteins are characterized by poorly conserved repeated sequences that are rich in prolines and potentially glycosylated threonines and serines (PTS). We have now used the characteristics of the PTS repeat domain to identify Drosophila mucins in a simple bioinformatics approach. Searching the predicted protein database for proteins with at least 4 repeats and a high ST content, more than 30 mucin-like proteins were identified, ranging from 300–23000 amino acids in length. We find that Drosophila mucins are present at all stages of the fly life cycle, and that their transcripts localize to selective organs analogous to sites of vertebrate mucin expression. The results could allow for addressing basic questions about human mucin-related diseases in this model system. Additionally, many of the mucins are expressed in selective tissues during embryogenesis, thus revealing new potential functions for mucins as apical matrix components during organ morphogenesis. PMID:18725942
Global Proteomics Analysis of the Response to Starvation in C. elegans

PubMed Central

Larance, Mark; Pourkarimi, Ehsan; Wang, Bin; Brenes Murillo, Alejandro; Kent, Robert; Lamond, Angus I.; Gartner, Anton

2015-01-01

Periodic starvation of animals induces large shifts in metabolism but may also influence many other cellular systems and can lead to adaption to prolonged starvation conditions. To date, there is limited understanding of how starvation affects gene expression, particularly at the protein level. Here, we have used mass-spectrometry-based quantitative proteomics to identify global changes in the Caenorhabditis elegans proteome due to acute starvation of young adult animals. Measuring changes in the abundance of over 5,000 proteins, we show that acute starvation rapidly alters the levels of hundreds of proteins, many involved in central metabolic pathways, highlighting key regulatory responses. Surprisingly, we also detect changes in the abundance of chromatin-associated proteins, including specific linker histones, histone variants, and histone posttranslational modifications associated with the epigenetic control of gene expression. To maximize community access to these data, they are presented in an online searchable database, the Encyclopedia of Proteome Dynamics (http://www.peptracker.com/epd/). PMID:25963834
Comprehensive Analysis of Gene Expression Profiles of Sepsis-Induced Multiorgan Failure Identified Its Valuable Biomarkers.

PubMed

Wang, Yumei; Yin, Xiaoling; Yang, Fang

2018-02-01

Sepsis is an inflammatory-related disease, and severe sepsis would induce multiorgan dysfunction, which is the most common cause of death of patients in noncoronary intensive care units. Progress in novel therapeutic strategies has proven to be of little impact on the mortality of severe sepsis, and unfortunately, its mechanisms still remain poorly understood. In this study, we analyzed gene expression profiles of severe sepsis with failure of lung, kidney, and liver for the identification of potential biomarkers. We first downloaded the gene expression profiles from the Gene Expression Omnibus and performed preprocessing of raw microarray data sets and identification of differentially expressed genes (DEGs) through the R programming software; then, significantly enriched functions of DEGs in lung, kidney, and liver failure sepsis samples were obtained from the Database for Annotation, Visualization, and Integrated Discovery; finally, a protein-protein interaction network was constructed for DEGs based on the STRING database, and network modules were also obtained through the MCODE cluster method. As a result, lung failure sepsis has the highest number of DEGs of 859, whereas the number of DEGs in kidney and liver failure sepsis samples is 178 and 175, respectively. In addition, 17 overlaps were obtained among the three lists of DEGs. Biological processes related to immune and inflammatory response were found to be significantly enriched in DEGs. Network and module analysis identified four gene clusters in which all or most of the genes were upregulated. The expression changes of Icam1 and Socs3 were further validated through quantitative PCR analysis. This study should shed light on the development of sepsis and provide potential therapeutic targets for sepsis-induced multiorgan failure.
Gene: a gene-centered information resource at NCBI.

PubMed

Brown, Garth R; Hem, Vichet; Katz, Kenneth S; Ovetsky, Michael; Wallin, Craig; Ermolaeva, Olga; Tolstoy, Igor; Tatusova, Tatiana; Pruitt, Kim D; Maglott, Donna R; Murphy, Terence D

2015-01-01

The National Center for Biotechnology Information's (NCBI) Gene database (www.ncbi.nlm.nih.gov/gene) integrates gene-specific information from multiple data sources. NCBI Reference Sequence (RefSeq) genomes for viruses, prokaryotes and eukaryotes are the primary foundation for Gene records in that they form the critical association between sequence and a tracked gene upon which additional functional and descriptive content is anchored. Additional content is integrated based on the genomic location and RefSeq transcript and protein sequence data. The content of a Gene record represents the integration of curation and automated processing from RefSeq, collaborating model organism databases, consortia such as Gene Ontology, and other databases within NCBI. Records in Gene are assigned unique, tracked integers as identifiers. The content (citations, nomenclature, genomic location, gene products and their attributes, phenotypes, sequences, interactions, variation details, maps, expression, homologs, protein domains and external databases) is available via interactive browsing through NCBI's Entrez system, via NCBI's Entrez programming utilities (E-Utilities and Entrez Direct) and for bulk transfer by FTP. Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by (a) US Government employee(s) and is in the public domain in the US.
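The abstract above mentions programmatic access to Gene through NCBI's E-utilities. As a small, hedged illustration, the Python sketch below only builds an ESearch request URL against the `gene` database and makes no network call. The helper name and the example query are hypothetical; the query uses the `[Gene Name]` and `[Organism]` search fields, and NCBI's usage guidelines (rate limits, API keys) apply to any real request.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def gene_esearch_url(term, retmax=20):
    """Build an ESearch URL that would return matching Gene IDs as JSON."""
    params = {"db": "gene", "term": term, "retmode": "json", "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Example query (illustrative): a gene symbol restricted to one organism.
print(gene_esearch_url("BRCA1[Gene Name] AND human[Organism]"))
```

The IDs returned by such a search are the tracked integer identifiers described in the abstract, and they could then be passed to other E-utilities such as ESummary or EFetch.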
gEVE: a genome-based endogenous viral element database provides comprehensive viral protein-coding sequences in mammalian genomes.

PubMed

Nakagawa, So; Takahashi, Mahoko Ueda

2016-01-01

In mammals, approximately 10% of genome sequences correspond to endogenous viral elements (EVEs), which are derived from ancient viral infections of germ cells. Although most EVEs have been inactivated, some open reading frames (ORFs) of EVEs obtained functions in the hosts. However, EVE ORFs usually remain unannotated in the genomes, and no databases are available for EVE ORFs. To investigate the function and evolution of EVEs in mammalian genomes, we developed EVE ORF databases for 20 genomes of 19 mammalian species. A total of 736,771 non-overlapping EVE ORFs were identified and archived in a database named gEVE (http://geve.med.u-tokai.ac.jp). The gEVE database provides nucleotide and amino acid sequences, genomic loci and functional annotations of EVE ORFs for all 20 genomes. In analyzing RNA-seq data with the gEVE database, we successfully identified the expressed EVE genes, suggesting that the gEVE database facilitates studies of the genomic analyses of various mammalian species. Database URL: http://geve.med.u-tokai.ac.jp. © The Author(s) 2016. Published by Oxford University Press.
From 20th century metabolic wall charts to 21st century systems biology: database of mammalian metabolic enzymes

PubMed Central

Corcoran, Callan C.; Grady, Cameron R.; Pisitkun, Trairak; Parulekar, Jaya

2017-01-01

The organization of the mammalian genome into gene subsets corresponding to specific functional classes has provided key tools for systems biology research. Here, we have created a web-accessible resource called the Mammalian Metabolic Enzyme Database (https://hpcwebapps.cit.nih.gov/ESBL/Database/MetabolicEnzymes/MetabolicEnzymeDatabase.html) keyed to the biochemical reactions represented on iconic metabolic pathway wall charts created in the previous century. Overall, we have mapped 1,647 genes to these pathways, representing ~7 percent of the protein-coding genome. To illustrate the use of the database, we apply it to the area of kidney physiology. In so doing, we have created an additional database (Database of Metabolic Enzymes in Kidney Tubule Segments: https://hpcwebapps.cit.nih.gov/ESBL/Database/MetabolicEnzymes/), mapping mRNA abundance measurements (mined from RNA-Seq studies) for all metabolic enzymes to each of 14 renal tubule segments. We carry out bioinformatics analysis of the enzyme expression pattern among renal tubule segments and mine various data sources to identify vasopressin-regulated metabolic enzymes in the renal collecting duct. PMID:27974320
SM-TF: A structural database of small molecule-transcription factor complexes.

PubMed

Xu, Xianjin; Ma, Zhiwei; Sun, Hongmin; Zou, Xiaoqin

2016-06-30

Transcription factors (TFs) are the proteins involved in the transcription process, ensuring the correct expression of specific genes. Numerous diseases arise from the dysfunction of specific TFs. In fact, over 30 TFs have been identified as therapeutic targets of about 9% of the approved drugs. In this study, we created a structural database of small molecule-transcription factor (SM-TF) complexes, available online at http://zoulab.dalton.missouri.edu/SM-TF. The 3D structures of the co-bound small molecule and the corresponding binding sites on TFs are provided in the database, serving as a valuable resource to assist structure-based drug design related to TFs. Currently, the SM-TF database contains 934 entries covering 176 TFs from a variety of species. The database is further classified into several subsets by species and organisms. The entries in the SM-TF database are linked to the UniProt database and other sequence-based TF databases. Furthermore, the druggable TFs from human and the corresponding approved drugs are linked to the DrugBank. © 2016 Wiley Periodicals, Inc.
gEVE: a genome-based endogenous viral element database provides comprehensive viral protein-coding sequences in mammalian genomes

PubMed Central

Nakagawa, So; Takahashi, Mahoko Ueda

2016-01-01

In mammals, approximately 10% of genome sequences correspond to endogenous viral elements (EVEs), which are derived from ancient viral infections of germ cells. Although most EVEs have been inactivated, some open reading frames (ORFs) of EVEs obtained functions in the hosts. However, EVE ORFs usually remain unannotated in the genomes, and no databases are available for EVE ORFs. To investigate the function and evolution of EVEs in mammalian genomes, we developed EVE ORF databases for 20 genomes of 19 mammalian species. A total of 736,771 non-overlapping EVE ORFs were identified and archived in a database named gEVE (http://geve.med.u-tokai.ac.jp). The gEVE database provides nucleotide and amino acid sequences, genomic loci and functional annotations of EVE ORFs for all 20 genomes. In analyzing RNA-seq data with the gEVE database, we successfully identified the expressed EVE genes, suggesting that the gEVE database facilitates studies of the genomic analyses of various mammalian species. Database URL: http://geve.med.u-tokai.ac.jp PMID:27242033
Inflammatory and mitochondrial gene expression data in GPER-deficient cardiomyocytes from male and female mice.

PubMed

Wang, Hao; Sun, Xuming; Chou, Jeff; Lin, Marina; Ferrario, Carlos M; Zapata-Sudo, Gisele; Groban, Leanne

2017-02-01

We previously showed that cardiomyocyte-specific G protein-coupled estrogen receptor (GPER) gene deletion leads to sex-specific adverse effects on cardiac structure and function; alterations which may be due to distinct differences in mitochondrial and inflammatory processes between sexes. Here, we provide the results of Gene Set Enrichment Analysis (GSEA) based on the DNA microarray data from GPER-knockout versus GPER-intact (intact) cardiomyocytes. This article contains complete data on the mitochondrial and inflammatory response-related gene expression changes that were significant in GPER knockout versus intact cardiomyocytes from adult male and female mice. The data are supplemental to our original research article "Cardiomyocyte-specific deletion of the G protein-coupled estrogen receptor (GPER) leads to left ventricular dysfunction and adverse remodeling: a sex-specific gene profiling" (Wang et al., 2016) [1]. Data have been deposited to the Gene Expression Omnibus (GEO) database repository with the dataset identifier GSE86843.
Human epidermis is a novel site of phospholipase B expression.

PubMed

Maury, Eric; Prévost, Marie Claude; Nauze, Michel; Redoulès, Daniel; Tarroux, Roger; Charvéron, Marie; Salles, Jean Pierre; Perret, Bertrand; Chap, Hugues; Gassama-Diagne, Ama

2002-07-12

Phospholipase B (PLB) is an enzyme that displays both phospholipase A(2) and lysophospholipase activities. Analysis of human epidermis homogenates indicated the presence of a 97 kDa PLB protein, as well as a phospholipase A(2) activity, both being enriched in the soluble fraction. Immunolabelling and in situ hybridization experiments showed that this enzyme is expressed in the different layers of epidermis with an accumulation at the dermo-epidermis junction. RT-PCR data indicated that PLB is specifically expressed in natural and reconstructed epidermis. By 3'-RACE-PCR and screening of human genome databases, we obtained a 3600 bp cDNA coding for human PLB highly homologous to already described intestinal brush border PLBs. These data led us to conclude that the soluble PLB corresponds to a proteolytic cleavage of the membrane anchored protein. Altogether, our results provide the first characterization of human PLB which should play an important role in epidermal barrier function.
Understanding sequence similarity and framework analysis between centromere proteins using computational biology.

PubMed

Doss, C George Priya; Chakrabarty, Chiranjib; Debajyoti, C; Debottam, S

2014-11-01

Certain mysteries pointing toward their recruitment pathways, cell cycle regulation mechanisms, spindle checkpoint assembly, and chromosome segregation process are considered the centre of attraction in cancer research. In modern times, with the established databases, ranges of computational platforms have provided a platform to examine almost all the physiological and biochemical evidences in disease-associated phenotypes. Using existing computational methods, we have utilized the amino acid residues to understand the similarity within the evolutionary variance of different associated centromere proteins. This study related to sequence similarity, protein-protein networking, co-expression analysis, and evolutionary trajectory of centromere proteins will speed up the understanding about centromere biology and will create a road map for upcoming researchers who are initiating their work of clinical sequencing using centromere proteins.
Members of the Dof transcription factor family in Triticum aestivum are associated with light-mediated gene regulation.

PubMed

Shaw, Lindsay M; McIntyre, C Lynne; Gresshoff, Peter M; Xue, Gang-Ping

2009-11-01

DNA binding with One Finger (Dof) protein is a plant-specific transcription factor implicated in the regulation of many important plant-specific processes, including photosynthesis and carbohydrate metabolism. This study has identified 31 Dof genes (TaDof) in bread wheat through extensive analysis of current nucleotide databases. Phylogenetic analysis suggests that the TaDof family can be divided into four clades. Expression analysis of the TaDof family across all major organs using quantitative RT-PCR and searches of the wheat genome array database revealed that the majority of TaDof members were predominately expressed in vegetative organs. A large number of TaDof members were down-regulated by drought and/or were responsive to the light and dark cycle. Further expression analysis revealed that light up-regulated TaDof members were highly correlated in expression with a number of genes that are involved in photosynthesis or sucrose transport. These data suggest that the TaDof family may have an important role in light-mediated gene regulation, including involvement in the photosynthetic process.
Bioinformatics analysis of the prognostic value of Tripartite Motif 28 in breast cancer.

PubMed

Hao, Ling; Leng, Jun; Xiao, Ruijing; Kingsley, Tembo; Li, Xinran; Tu, Zhenbo; Yang, Xiangyong; Deng, Xinzhou; Xiong, Meng; Xiong, Jie; Zhang, Qiuping

2017-04-01

Tripartite motif containing 28 (TRIM28) is a transcriptional regulator acting as an essential corepressor for Krüppel-associated box zinc finger domain-containing proteins in multiple tissue and cell types. An increasing number of studies have investigated the function of TRIM28; however, its prognostic value in breast cancer (BC) remains unclear. In the present study, the expression of TRIM28 was identified to be significantly higher in cancerous compared with healthy tissue samples. Furthermore, it was demonstrated that TRIM28 expression was significantly correlated with several clinicopathological characteristics of patients with BC, such as p53 mutation, tumor recurrence and Elston grade of the tumor. In addition, a protein-protein interaction network was created to illustrate the interactions of TRIM28 with other proteins. The prognostic value of TRIM28 in patients with BC was investigated using the Kaplan-Meier Plotter database, which revealed that high expression of TRIM28 is a predictor of poor prognosis in patients with BC. In conclusion, the results of the present study indicate that TRIM28 provides a survival advantage to patients with BC and is a novel prognostic biomarker, in addition to being a therapeutic target for the treatment of BC.
The Plant Structure Ontology, a Unified Vocabulary of Anatomy and Morphology of a Flowering Plant

PubMed Central

Ilic, Katica; Kellogg, Elizabeth A.; Jaiswal, Pankaj; Zapata, Felipe; Stevens, Peter F.; Vincent, Leszek P.; Avraham, Shulamit; Reiser, Leonore; Pujar, Anuradha; Sachs, Martin M.; Whitman, Noah T.; McCouch, Susan R.; Schaeffer, Mary L.; Ware, Doreen H.; Stein, Lincoln D.; Rhee, Seung Y.

2007-01-01

Formal description of plant phenotypes and standardized annotation of gene expression and protein localization data require uniform terminology that accurately describes plant anatomy and morphology. This facilitates cross species comparative studies and quantitative comparison of phenotypes and expression patterns. A major drawback is variable terminology that is used to describe plant anatomy and morphology in publications and genomic databases for different species. The same terms are sometimes applied to different plant structures in different taxonomic groups. Conversely, similar structures are named by their species-specific terms. To address this problem, we created the Plant Structure Ontology (PSO), the first generic ontological representation of anatomy and morphology of a flowering plant. The PSO is intended for a broad plant research community, including bench scientists, curators in genomic databases, and bioinformaticians. The initial releases of the PSO integrated existing ontologies for Arabidopsis (Arabidopsis thaliana), maize (Zea mays), and rice (Oryza sativa); more recent versions of the ontology encompass terms relevant to Fabaceae, Solanaceae, additional cereal crops, and poplar (Populus spp.). Databases such as The Arabidopsis Information Resource, Nottingham Arabidopsis Stock Centre, Gramene, MaizeGDB, and SOL Genomics Network are using the PSO to describe expression patterns of genes and phenotypes of mutants and natural variants and are regularly contributing new annotations to the Plant Ontology database. The PSO is also used in specialized public databases, such as BRENDA, GENEVESTIGATOR, NASCArrays, and others. Over 10,000 gene annotations and phenotype descriptions from participating databases can be queried and retrieved using the Plant Ontology browser. The PSO, as well as contributed gene associations, can be obtained at www.plantontology.org. PMID:17142475
Vascular endothelial cells express isoforms of protein kinase A inhibitor.

PubMed

Lum, Hazel; Hao, Zengping; Gayle, Dave; Kumar, Priyadarsini; Patterson, Carolyn E; Uhler, Michael D

2002-01-01

The expression and function of the endogenous inhibitor of cAMP-dependent protein kinase (PKI) in endothelial cells are unknown. In this study, overexpression of rabbit muscle PKI gene into endothelial cells inhibited the cAMP-mediated increase and exacerbated thrombin-induced decrease in endothelial barrier function. We investigated PKI expression in human pulmonary artery (HPAECs), foreskin microvessel (HMECs), and brain microvessel endothelial cells (HBMECs). RT-PCR using specific primers for human PKI alpha, human PKI gamma, and mouse PKI beta sequences detected PKI alpha and PKI gamma mRNA in all three cell types. Sequencing and BLAST analysis indicated that forward and reverse DNA strands for PKI alpha and PKI gamma were of >96% identity with database sequences. RNase protection assays showed protection of the 542 nucleotides in HBMEC and HPAEC PKI alpha mRNA and 240 nucleotides in HBMEC, HPAEC, and HMEC PKI gamma mRNA. Western blot analysis indicated that PKI gamma protein was detected in all three cell types, whereas PKI alpha was found in HBMECs. In summary, endothelial cells from three different vascular beds express PKI alpha and PKI gamma, which may be physiologically important in endothelial barrier function.
Endocytosis and membrane receptor internalization: implication of F-BAR protein Carom

PubMed Central

Xu, Yanjie; Liu, Suxuan; Xia, Jixiang; Stein, Sam; Ramon, Cueto; Xi, Hang; Wang, Luqiao; Xiong, Xinyu; Zhang, Lixiao; He, Dingwen; Yang, William; Zhao, Xianxian; Cheng, Xiaoshu; Yang, Xiaofeng; Wang, Hong

2016-01-01

Endocytosis is a cellular process mostly responsible for membrane receptor internalization. Cell membrane receptors bind to their ligands and form a complex which can be internalized. We previously proposed that F-BAR proteins initiate membrane curvature and mediate endocytosis via their binding partners. However, the F-BAR protein partners involved in membrane receptor endocytosis and the regulatory mechanism remain unknown. In this study, we established a group of database mining strategies to explore mechanisms underlying receptor-related endocytosis. We identified 34 endocytic membrane receptors and 10 regulating proteins for vesicle formation in clathrin-dependent endocytosis (CDE), a major process of membrane receptor internalization. We found that F-BAR protein FCHSD2 (Carom) may facilitate endocytosis via 9 endocytic partners. Carom is highly expressed, along with highly expressed endocytic membrane receptors and partners, in endothelial cells and macrophages. We established 3 models of Carom-receptor complexes and their intracellular trafficking based on protein-protein interaction and subcellular localization. We conclude that Carom may mediate receptor endocytosis and transport endocytic receptors to the cytoplasm for receptor signaling and lysosome/proteasome degradation, or to the nucleus for RNA processing, gene transcription and DNA repair. PMID:28199211
Prolactin family of the guinea pig, Cavia porcellus.

PubMed

Alam, S M Khorshed; Konno, Toshihiro; Rumi, M A Karim; Dong, Yafeng; Weiner, Carl P; Soares, Michael J

2010-08-01

Prolactin (PRL) is a multifunctional hormone with prominent roles in regulating growth and reproduction. The guinea pig (Cavia porcellus) has been extensively used in endocrine and reproduction research. Thus far, the PRL cDNA and protein have not been isolated from the guinea pig. In the present study, we used information derived from the public guinea pig genome database as a tool for identifying guinea pig PRL and PRL-related proteins. Guinea pig PRL exhibits prominent nucleotide and amino acid sequence differences when compared with PRLs of other eutherian mammals. In contrast, guinea pig GH is highly conserved. Expression of PRL and GH in the guinea pig is prominent in the anterior pituitary, similar to known expression patterns of PRL and GH for other species. Two additional guinea pig cDNAs were identified and termed PRL-related proteins (PRLRP1, PRLRP2). They exhibited a more distant relationship to PRL and their expression was restricted to the placenta. Recombinant guinea pig PRL protein was generated and shown to be biologically active in the PRL-responsive Nb2 lymphoma cell bioassay. In contrast, recombinant guinea pig PRLRP1 protein did not exhibit PRL-like bioactivity. In summary, we have developed a new set of research tools for investigating the biology of the PRL family in an important animal model, the guinea pig.
Label-Free Quantitative Proteomic Analysis of Harmless and Pathogenic Strains of Infectious Microalgae, Prototheca spp.

PubMed Central

Murugaiyan, Jayaseelan; Eravci, Murat; Weise, Christoph; Roesler, Uwe

2016-01-01

Microalgae of the genus Prototheca (P.) spp are associated with rare algal infections of invertebrates termed protothecosis. Among the seven generally accepted species, P. zopfii genotype 2 (GT2) is associated with a severe form of bovine mastitis while P. blaschkeae causes the mild and sub-clinical form of mastitis. The reason behind the infectious nature of P. zopfii GT2, while genotype 1 (GT1) remains non-infectious, is not known. Therefore, in the present study we investigated the protein expression level difference between the genotypes of P. zopfii and P. blaschkeae. Cells were cultured to the mid-exponential phase, harvested, and processed for LC-MS analysis. Peptide data was acquired on an LTQ Orbitrap Velos, raw spectra were quantitatively analyzed with MaxQuant software and matching with the reference database of Chlorella variabilis and Auxenochlorella protothecoides resulted in the identification of 226 proteins. Comparison of an environmental strain with infectious strains resulted in the identification of 51 differentially expressed proteins related to carbohydrate metabolism, energy production and protein translation. The expression level of Hsp70 proteins and their role in the infectious process is worth further investigation. All mass spectrometry data are available via ProteomeXchange with identifier PXD005305. PMID:28036087
Current projects in Pre-analytics: where to go?

PubMed

Sapino, Anna; Annaratone, Laura; Marchiò, Caterina

2015-01-01

The current clinical practice of tissue handling and sample preparation is multifaceted and lacks strict standardisation: this scenario leads to significant variability in the quality of clinical samples. Poor tissue preservation has a detrimental effect, leading to morphological artefacts, hampering the reproducibility of immunocytochemical and molecular diagnostic results (protein expression, DNA gene mutations, RNA gene expression) and affecting research outcomes with irreproducible gene expression and post-transcriptional data. Altogether, this limits the opportunity to share and pool national databases into European common databases. At the European level, standardization of pre-analytical steps is just at the beginning and issues regarding bio-specimen collection and management are still debated. A joint (public-private) project on standardization of tissue handling in pre-analytical procedures has recently been funded in Italy with the aim of proposing novel approaches to the neglected issue of pre-analytical procedures. In this chapter, we will show how investing in pre-analytics may impact both public health problems and practical innovation in solid tumour processing.

Protein Information Resource: a community resource for expert annotation of protein data

PubMed Central

Barker, Winona C.; Garavelli, John S.; Hou, Zhenglin; Huang, Hongzhan; Ledley, Robert S.; McGarvey, Peter B.; Mewes, Hans-Werner; Orcutt, Bruce C.; Pfeiffer, Friedhelm; Tsugita, Akira; Vinayaka, C. R.; Xiao, Chunlin; Yeh, Lai-Su L.; Wu, Cathy

2001-01-01

The Protein Information Resource, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the most comprehensive and expertly annotated protein sequence database in the public domain, the PIR-International Protein Sequence Database. To provide timely and high quality annotation and promote database interoperability, the PIR-International employs rule-based and classification-driven procedures based on controlled vocabulary and standard nomenclature and includes status tags to distinguish experimentally determined from predicted protein features. The database contains about 200 000 non-redundant protein sequences, which are classified into families and superfamilies and their domains and motifs identified. Entries are extensively cross-referenced to other sequence, classification, genome, structure and activity databases. The PIR web site features search engines that use sequence similarity and database annotation to facilitate the analysis and functional identification of proteins. The PIR-International databases and search tools are accessible on the PIR web site at http://pir.georgetown.edu/ and at the MIPS web site at http://www.mips.biochem.mpg.de. The PIR-International Protein Sequence Database and other files are also available by FTP. PMID:11125041
Increased expression of argininosuccinate synthetase protein predicts poor prognosis in human gastric cancer

PubMed Central

SHAN, YAN-SHEN; HSU, HUI-PING; LAI, MING-DERG; YEN, MENG-CHI; LUO, YI-PEY; CHEN, YI-LING

2015-01-01

Aberrant expression of argininosuccinate synthetase (ASS1, also known as ASS) has been found in cancer cells and is involved in the carcinogenesis of gastric cancer. The aim of the present study was to investigate the level of ASS expression in human gastric cancer and to determine the possible correlations between ASS expression and clinicopathological findings. Immunohistochemistry performed on paraffin-embedded tissues showed that ASS was expressed in 11 of 11 specimens from patients with gastric cancer. The protein was localized primarily to the cytoplasm of cancer cells and normal epithelium. In the Oncomine cancer microarray database, expression of the ASS gene was significantly increased in gastric cancer tissues. To investigate the clinicopathological and prognostic roles of ASS expression, we performed western blot analysis of 35 matched specimens of gastric adenocarcinomas and normal tissue obtained from patients treated at the National Cheng Kung University Hospital. The ratio of relative ASS expression (expressed as the ASS/β-actin ratio) in tumor tissues to that in normal tissues was correlated with large tumor size (P=0.007) and with the tumor, node, metastasis (TNM) stage of the American Joint Committee on Cancer staging system (P=0.031). Patients whose cancers showed increased relative expression of ASS were positive for perineural invasion and had poor recurrence-free survival. In summary, ASS expression in gastric cancer was associated with a poor prognosis. Further study of mechanisms to silence the ASS gene or decrease the enzymatic activity of ASS protein has the potential to provide new treatments for patients with gastric cancer. PMID:25333458
Proteome analysis during pod, zygotic and somatic embryo maturation of Theobroma cacao.

PubMed

Niemenak, Nicolas; Kaiser, Edward; Maximova, Siela N; Laremore, Tatiana; Guiltinan, Mark J

2015-05-15

Two dimensional electrophoresis and nano-LC-MS were performed in order to identify alterations in protein abundance that correlate with maturation of cacao zygotic and somatic embryos. The cacao pod proteome was also characterized during development. The recently published cacao genome sequence was used to create a predicted proteolytic fragment database. Several hundred protein spots were resolved on each tissue analysis, of which 72 variable spots were subjected to MS analysis, resulting in 49 identifications. The identified proteins represent an array of functional categories, including seed storage, stress response, photosynthesis and translation factors. The seed storage protein was strongly accumulated in cacao zygotic embryos compared to their somatic counterpart. However, sucrose treatment (60 g L(-1)) allows up-regulation of storage protein in SE. A high similarity in the profiles of acidic proteins was observed in mature zygotic and somatic embryos. Differential expression in both tissues was observed in proteins having high pI. Several proteins were detected exclusively in fruit tissues, including a chitinase and a 14-3-3 protein. We also identified a novel cacao protein related to known mabinlin type sweet storage proteins. Moreover, the specific presence of thaumatin-like protein, another sweet protein, was also detected in fruit tissue. We discuss our observed correlations between protein expression profiles, developmental stage and stress responses. Copyright © 2015 Elsevier GmbH. All rights reserved.
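The "predicted proteolytic fragment database" mentioned in this abstract can be illustrated with a minimal sketch of an in silico digest. The abstract does not name the protease or the software used, so the trypsin rule below (cleave after K or R unless the next residue is P), the function names, and the demo sequence are all assumptions for illustration only; the residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide


def tryptic_digest(sequence):
    """Split a protein sequence after K or R, except when followed by P.

    Assumption: trypsin with no missed cleavages (not stated in the abstract).
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, neutral peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def build_fragment_db(proteins):
    """Map each protein id to its predicted (peptide, mass) pairs."""
    return {
        pid: [(pep, round(peptide_mass(pep), 4)) for pep in tryptic_digest(seq)]
        for pid, seq in proteins.items()
    }


# Hypothetical sequence, not taken from the cacao genome.
DEMO_PROTEINS = {"hypothetical_protein_1": "MAKRPLLKGGR"}
fragment_db = build_fragment_db(DEMO_PROTEINS)
for protein_id, fragments in fragment_db.items():
    for peptide, mass in fragments:
        print(protein_id, peptide, mass)
```

Measured peptide masses from MS would then be matched against such a table within a mass tolerance; real search engines also handle missed cleavages and modifications, which this sketch omits.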
Evaluation of hepatitis B virus replication and proteomic analysis of HepG2.2.15 cell line after cyclosporine A treatment.

PubMed

Xie, Hai-Yang; Xia, Wei-Liang; Zhang, Chun-Chao; Wu, Li-Ming; Ji, Hao-Feng; Cheng, Yu; Zheng, Shu-Sen

2007-07-01

The effect of cyclosporine A (CsA) on hepatitis B virus (HBV) replication was investigated, and proteomics expression differentiation after CsA treatment was studied in order to provide clues to explore the effect of CsA on HBV replication. Methyl thiazolyl tetrazolium (MTT) assay was used to evaluate the cytotoxicity of CsA. The HBV replication level in the HBV genomic DNA transfected HepG2.2.15 cell line was determined by an ELISA analysis of hepatitis B surface antigens (HBsAg) and hepatitis B e antigens (HBeAg) in culture supernatant, while the intracellular HBV DNA replication level was analyzed by slot blot hybridization. Two-dimensional electrophoresis was used to investigate the alteration of protein expression in HepG2.2.15 after CsA treatment in vitro. The differentially-expressed proteins were identified by matrix-assisted laser desorption/ionization-time of flight mass spectrometry combined with an online database search. CsA was able to inhibit the expression of HBsAg, HBeAg, and HBV DNA replication in vitro in a dose-dependent manner. A proteomics analysis indicated that the expression of 17 proteins changed significantly in the CsA treatment group compared to the control group. Eleven of the 17 proteins were identified, including eukaryotic translation initiation factors (eIF) 3k, otubain 1, 14-3-3 protein, eIF2-1 alpha, eIF5A, and the tyrosine 3/tryptophan 5-mono-oxygenase activation protein, which were overexpressed in CsA-treated HepG2.2.15 cells. The ferritin light subunit, erythrocyte cytosolic protein of 51 kDa (ECP-51), stathmin 1/oncoprotein, and adenine phosphoribosyl-transferase were downregulated, and the position of tumor protein, translationally controlled 1, was shifted, suggesting it had undergone posttranslational modifications. Our study identified the inhibitory effect of CsA on HBV replication, and found that a group of proteins may be responsible for this inhibitory effect.
Expression Atlas: gene and protein expression across multiple studies and organisms

PubMed Central

Tang, Y Amy; Bazant, Wojciech; Burke, Melissa; Fuentes, Alfonso Muñoz-Pomer; George, Nancy; Koskinen, Satu; Mohammed, Suhaib; Geniza, Matthew; Preece, Justin; Jarnuczak, Andrew F; Huber, Wolfgang; Stegle, Oliver; Brazma, Alvis; Petryszak, Robert

2018-01-01

Expression Atlas (http://www.ebi.ac.uk/gxa) is an added value database that provides information about gene and protein expression in different species and contexts, such as tissue, developmental stage, disease or cell type. The available public and controlled access data sets from different sources are curated and re-analysed using standardized, open source pipelines and made available for queries, download and visualization. As of August 2017, Expression Atlas holds data from 3,126 studies across 33 different species, including 731 from plants. Data from large-scale RNA sequencing studies including Blueprint, PCAWG, ENCODE, GTEx and HipSci can be visualized next to each other. In Expression Atlas, users can query genes or gene-sets of interest and explore their expression across or within species, tissues, developmental stages in a constitutive or differential context, representing the effects of diseases, conditions or experimental interventions. All processed data matrices are available for direct download in tab-delimited format or as R-data. In addition to the web interface, data sets can now be searched and downloaded through the Expression Atlas R package. Novel features and visualizations include the on-the-fly analysis of gene set overlaps and the option to view gene co-expression in experiments investigating constitutive gene expression across tissues or other conditions. PMID:29165655
Proteome reference maps of Medicago truncatula embryogenic cell cultures generated from single protoplasts.

PubMed

Imin, Nijat; De Jong, Femke; Mathesius, Ulrike; van Noorden, Giel; Saeed, Nasir A; Wang, Xin-Ding; Rose, Ray J; Rolfe, Barry G

2004-07-01

Using a combination of two-dimensional gel electrophoresis (2-DE) protein mapping and mass spectrometry (MS) analysis, we have established proteome reference maps of Medicago truncatula embryogenic tissue culture cells. The cultures were generated from single protoplasts, which provided a relatively homogeneous cell population. We used these to analyze protein expression at the globular stages of somatic embryogenesis, which is the earliest morphogenetic embryonic stage. Over 3000 proteins could reproducibly be resolved over a pI range of 4-11. Three hundred and twelve protein spots were extracted from colloidal Coomassie Blue-stained 2-DE gels and analyzed by matrix-assisted laser desorption/ionization-time of flight MS analysis and tandem MS sequencing. This enabled the identification of 169 protein spots representing 128 unique gene products using a publicly available expressed sequence tag database and the MASCOT search engine. These reference maps will be valuable for the investigation of the molecular events which occur during somatic embryogenesis in M. truncatula. The proteome reference maps and supplementary materials will be available and updated for public access at http://semele.anu.edu.au/.
A novel feature ranking method for prediction of cancer stages using proteomics data

PubMed Central

Saghapour, Ehsan; Sehhati, Mohammadreza

2017-01-01

Proteomic analysis of cancer stages has provided new opportunities for the development of novel, highly sensitive diagnostic tools that help early detection of cancer. This paper introduces a new feature ranking approach called FRMT. FRMT is based on the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) method and selects the most discriminative proteins from proteomics data for cancer staging. In this approach, the outcomes of 10 feature selection techniques were combined by the TOPSIS method to select the final discriminative proteins from seven different proteomic databases of protein expression profiles. In the proposed workflow, feature selection methods and protein expressions have been considered as criteria and alternatives in TOPSIS, respectively. The proposed method was tested on seven different classifier models in a 10-fold cross-validation procedure that was repeated 30 times on the seven cancer datasets. The obtained results demonstrated the higher stability and superior classification performance of the method in comparison with other methods, and showed that it is less sensitive to the applied classifier. Moreover, the final introduced proteins are informative and have the potential for application in real medical practice. PMID:28934234
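The TOPSIS step described in this abstract, with proteins as alternatives and feature selection methods as criteria, can be sketched generically as follows. This is the textbook TOPSIS procedure, not the authors' FRMT implementation; the equal weights, the protein names, and the scores are hypothetical, and every criterion is assumed to be "higher is better".

```python
import math


def topsis(matrix, weights=None, benefit=None):
    """Score alternatives (rows) against criteria (columns) with TOPSIS.

    Returns one closeness value per row in [0, 1]; higher is closer to ideal.
    """
    n_cols = len(matrix[0])
    weights = weights or [1.0 / n_cols] * n_cols   # assumption: equal weights
    benefit = benefit or [True] * n_cols           # assumption: higher is better

    # Step 1: vector-normalise each column and apply its weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
             for j in range(n_cols)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_cols)]
         for row in matrix]

    # Step 2: ideal best and ideal worst value for each criterion.
    columns = list(zip(*v))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]

    # Step 3: relative closeness = d_worst / (d_best + d_worst).
    scores = []
    for row in v:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        total = d_best + d_worst
        scores.append(d_worst / total if total else 0.0)
    return scores


# Hypothetical scores of three proteins under three feature selection methods.
proteins = ["protA", "protB", "protC"]
method_scores = [
    [0.9, 0.8, 0.7],
    [0.2, 0.1, 0.3],
    [0.5, 0.9, 0.4],
]
closeness = topsis(method_scores)
ranking = [name for _, name in sorted(zip(closeness, proteins), reverse=True)]
print(ranking)
```

In a workflow like the one described, the top-ranked proteins would be passed on to the classifiers; how many to keep is a separate choice that this sketch does not address.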
Molecular characterization of DNA repair protein Ku70 from Vitis vinifera and its purification from transgenic tobacco.

PubMed

Tak, Himanshu; Mhatre, Minal

2013-08-01

DNA double-strand break repair in plants occurs preferentially by the non-homologous end joining (NHEJ) pathway. A key protein of the NHEJ pathway is Ku70. We have identified a Ku70 homologue (VvKu70) from the grapevine genome database. In this report we characterize a Ku70 homologue from Vitis vinifera cv. Mango. The VvKu70 expression was found to increase strongly in response to gamma radiation. The transcript level of VvKu70 was found to increase up to 36 h in gamma-irradiated shoots of grapevine. The expression of VvKu70 was found in many organs such as stem, leaves and roots. A GFP-fused VvKu70 protein was found to be nuclear localized, which indicates that VvKu70 is a nuclear localized protein. The VvKu70 identified by in silico approaches is present as a single copy in the V. vinifera cv. Mango genome. The VvKu70-GFP fused protein possesses ATPase activity and fails to bind dsDNA but binds ssDNA.
Isolation of genes differentially expressed during development and ripening of Fragaria chiloensis fruit by suppression subtractive hybridization.

PubMed

Pimentel, Paula; Salvatierra, Ariel; Moya-León, María Alejandra; Herrera, Raúl

2010-09-15

Fragaria chiloensis, the native Chilean strawberry, is noted for its good fruit quality characters. However, it is a highly perishable fruit due to its rapid softening. With the aim to screen for genes differentially expressed during development and ripening of strawberry fruit, the suppression subtractive hybridization (SSH) methodology was employed. Six libraries were generated contrasting transcripts from four different developmental stages. A set of 1807 genes was isolated and characterized. In our EST collection, approximately 90% of partial cDNAs showed significant similarity to proteins with known or unknown function registered in databases. Among them, proteins related to protein fate were identified in a large green fruit library, and proteins related to cellular transport, cell wall-related proteins, and transcription regulators were identified in a ripe fruit library. Thirteen genes were analyzed by qRT-PCR during development and ripening of the Chilean strawberry fruit. The information generated in this study provides new clues to aid the understanding of the ripening process in F. chiloensis fruit. Copyright 2010 Elsevier GmbH. All rights reserved.
Analysis of Sir2E in the cellular slime mold Dictyostelium discoideum: cellular localization, spatial expression and overexpression.

PubMed

Katayama, Takahiro; Yasukawa, Hiro

2008-10-01

It has been reported that Dictyostelium discoideum encodes four silent information regulator 2 (Sir2) proteins (Sir2A-D) showing sequence similarity to human homologues of Sir2 (SIRT1-3). Further screening in a database revealed that D. discoideum encodes an additional Sir2 homologue (Sir2E). The amino acid sequence of Sir2E is not similar to those of SIRTs but is similar to those of proteins encoded by Giardia lamblia, Cryptosporidium hominis and Cryptosporidium parvum. Fluorescence of Sir2E-green fluorescent protein fusion protein was detected in the D. discoideum nucleus, indicating that Sir2E is a nuclear localizing protein. Reverse transcription-polymerase chain reaction and whole-mount in situ hybridization analyses showed that D. discoideum expressed sir2E in amoebae in the growth phase and in prestalk cells in the developmental phase. D. discoideum overexpressing sir2E grew faster than the wild type. These results indicate that Sir2E plays important roles both in the growth phase and developmental phase of D. discoideum.
Comparative analysis of methicillin-sensitive and resistant Staphylococcus aureus exposed to emodin based on proteomic profiling.

PubMed

Ji, Xiaoyu; Liu, Xiaoqiang; Peng, Yuanxia; Zhan, Ruoting; Xu, Hui; Ge, Xijin

2017-12-09

Emodin has strong antibacterial activity, including against methicillin-resistant Staphylococcus aureus (MRSA). However, the mechanism by which emodin induces growth inhibition against MRSA remains unclear. In this study, the isobaric tags for relative and absolute quantitation (iTRAQ) proteomics approach was used to investigate the modes of action of emodin on a MRSA isolate and methicillin-sensitive S. aureus ATCC29213 (MSSA). Proteomic analysis showed that expression levels of 145 and 122 proteins were changed significantly in MRSA and MSSA, respectively, after emodin treatment. Comparative analysis of the functions of differentially expressed proteins between the two strains was performed via the bioinformatics tool Blast2GO and the STRING database. Proteins related to pyruvate pathway imbalance induction, protein synthesis inhibition, and DNA synthesis suppression were found in both methicillin-sensitive and resistant strains. Moreover, interference proteins related to membrane damage mechanism were also observed in MRSA. Our findings indicate that emodin is a potential antibacterial agent targeting MRSA via multiple mechanisms. Copyright © 2017 Elsevier Inc. All rights reserved.
Soybean DREB1/CBF-type transcription factors function in heat and drought as well as cold stress-responsive gene expression.

PubMed

Kidokoro, Satoshi; Watanabe, Keitaro; Ohori, Teppei; Moriwaki, Takashi; Maruyama, Kyonoshin; Mizoi, Junya; Myint Phyu Sin Htwe, Nang; Fujita, Yasunari; Sekita, Sachiko; Shinozaki, Kazuo; Yamaguchi-Shinozaki, Kazuko

2015-02-01

Soybean (Glycine max) is a globally important crop, and its growth and yield are severely reduced by abiotic stresses, such as drought, heat, and cold. The cis-acting element DRE (dehydration-responsive element)/CRT plays an important role in activating gene expression in response to these stresses. The Arabidopsis DREB1/CBF genes that encode DRE-binding proteins function as transcriptional activators in cold stress-responsive gene expression. In this study, we identified 14 DREB1-type transcription factors (GmDREB1s) from a soybean genome database. The expression of most GmDREB1 genes in soybean was strongly induced by a variety of abiotic stresses, such as cold, drought, high salt, and heat. The GmDREB1 proteins activated transcription via DREs in Arabidopsis and soybean protoplasts. Transcriptome analyses using transgenic Arabidopsis plants overexpressing GmDREB1s indicated that many of the downstream genes are cold-inducible and overlap with those of Arabidopsis DREB1A. We then comprehensively analyzed the downstream genes of GmDREB1B;1, which is closely related to DREB1A, using a transient expression system in soybean protoplasts. The expression of numerous genes induced by various abiotic stresses was increased by overexpressing GmDREB1B;1 in soybean, and DREs were the most conserved element in the promoters of these genes. The downstream genes of GmDREB1B;1 included numerous soybean-specific stress-inducible genes that encode an ABA receptor family protein, GmPYL21, and translation-related genes, such as ribosomal proteins. We confirmed that GmDREB1B;1 directly activates GmPYL21 expression and enhances ABRE-mediated gene expression in an ABA-independent manner. These results suggest that GmDREB1 proteins activate the expression of numerous soybean-specific stress-responsive genes under diverse abiotic stress conditions. © 2014 The Authors The Plant Journal © 2014 John Wiley & Sons Ltd.
Combining Evidence of Preferential Gene-Tissue Relationships from Multiple Sources

PubMed Central

Guo, Jing; Hammar, Mårten; Öberg, Lisa; Padmanabhuni, Shanmukha S.; Bjäreland, Marcus; Dalevi, Daniel

2013-01-01

An important challenge in drug discovery and disease prognosis is to predict genes that are preferentially expressed in one or a few tissues, i.e. showing considerably higher expression in those tissues compared to the others. Although several data sources and methods have been published explicitly for this purpose, they often disagree and it is not evident how to retrieve these genes and how to distinguish true biological findings from those that are due to choice-of-method and/or experimental settings. In this work we have developed a computational approach that combines results from multiple methods and datasets with the aim to eliminate method/study-specific biases and to improve the predictability of preferentially expressed human genes. A rule-based score is used to merge and assign support to the results. Five sets of genes with known tissue specificity were used for parameter pruning and cross-validation. In total we identify 3434 tissue-specific genes. We compare the genes of highest scores with the public databases: PaGenBase (microarray), TiGER (EST) and HPA (protein expression data). The results have 85% overlap with PaGenBase, 71% with TiGER and only 28% with HPA. 99% of our predictions have support from at least one of these databases. Our approach also performs better than any of the databases on identifying drug targets and biomarkers with known tissue-specificity. PMID:23950964
Finding the Subcellular Location of Barley, Wheat, Rice and Maize Proteins: The Compendium of Crop Proteins with Annotated Locations (cropPAL).

PubMed

Hooper, Cornelia M; Castleden, Ian R; Aryamanesh, Nader; Jacoby, Richard P; Millar, A Harvey

2016-01-01

Barley, wheat, rice and maize provide the bulk of human nutrition and have extensive industrial use as agricultural products. The genomes of these crops each contain >40,000 genes encoding proteins; however, the major genome databases for these species lack annotation information of protein subcellular location for >80% of these gene products. We address this gap by constructing the compendium of crop protein subcellular locations called crop Proteins with Annotated Locations (cropPAL). Subcellular location is most commonly determined by fluorescent protein tagging of live cells or mass spectrometry detection in subcellular purifications, but can also be predicted from amino acid sequence or protein expression patterns. The cropPAL database collates 556 published studies from >300 research institutes in >30 countries, as well as compiling eight pre-computed subcellular predictions for all Hordeum vulgare, Triticum aestivum, Oryza sativa and Zea mays protein sequences. The data collection including metadata for proteins and published studies can be accessed through a search portal http://crop-PAL.org. The subcellular localization information housed in cropPAL helps to depict plant cells as compartmentalized protein networks that can be investigated for improving crop yield and quality, and developing new biotechnological solutions to agricultural challenges. © The Author 2015. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.
[Herbs for calming liver and suppressing yang in treatment of hyperthyroidism with hyperactive liver yang: herbal effects on lymphocyte protein expression].

PubMed

Li, Xiangping; Yin, Tao; Zhong, Guangwei; Li, Wei; Luo, Yanhong; Xiang, Lingli; Liu, Zhehao

2011-07-01

To observe the herbal effects on hyperthyroidism patients with syndrome of hyperactivity of liver-Yang by the method for calming the liver and suppressing Yang and investigate its effects on lymphocyte protein expression. This approach may lay a foundation for the further investigation of the curative mechanisms of calming the liver and suppressing Yang treatment. A total of 48 hyperthyroidism patients with syndrome of hyperactivity of liver-Yang were randomly divided into a treatment group and a control group. The treatment group was treated by the method for calming the liver and suppressing Yang in accordance with traditional Chinese medicine (TCM) and the control group with thiamazole tablets for three periods of treatment. The therapeutic effects, the score of TCM symptoms, electrocardiogram (P wave), thyroid hormones and ultrasound were observed in both groups before and after the treatment. The side effects in the treatment course were observed in both groups. The level of differential protein expression was analyzed by two-dimensional electrophoresis and matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. The treatment had the effect of lowering the heart rate, reducing the P wave amplitude changes, regulating the level of thyroid hormones and decreasing the volume of thyromegaly. There was no statistically significant difference between the treatment group and control group. However, the treatment group had an obviously better effect on regulating TCM symptoms and decreasing side reactions than the control group (P<0.05). There was no statistically significant difference in total effectiveness between the treatment group and control group. The average numbers of spots in lymphocytes for normal people and for hyperthyroidism patients with syndrome of hyperactivity of liver-Yang before and after treatment were (429 +/- 31), (452 +/- 28) and (437 +/- 36) spots respectively.
Eight down-regulated and 11 up-regulated protein expressions were found in the hyperthyroidism patients with syndrome of hyperactivity of liver-Yang compared with normal people. Before and after treating the migraine patients with syndrome of hyperactivity of liver-Yang, 5 of the 8 down-regulated proteins showed strengthened expression and 8 of the 11 up-regulated proteins showed lowered expression. Ten of the total 8 differential protein spots were successfully identified by MALDI-TOF-MS. According to information provided by the NCBI and MSDB databases, these proteins are involved in metabolism, transport, antioxidation, signal transduction and immunity, etc. In this study, the TCM complex prescription with herbs for calming the liver and suppressing Yang regulated thyroid hormones, improved TCM symptoms, and decreased adverse reactions. It can possibly regulate lymphocyte protein expression.
ExAtlas: An interactive online tool for meta-analysis of gene expression data.

PubMed

Sharov, Alexei A; Schlessinger, David; Ko, Minoru S H

2015-12-01

We have developed ExAtlas, an on-line software tool for meta-analysis and visualization of gene expression data. In contrast to existing software tools, ExAtlas compares multi-component data sets and generates results for all combinations (e.g. all gene expression profiles versus all Gene Ontology annotations). ExAtlas handles both users' own data and data extracted semi-automatically from the public repository (GEO/NCBI database). ExAtlas provides a variety of tools for meta-analyses: (1) standard meta-analysis (fixed effects, random effects, z-score, and Fisher's methods); (2) analyses of global correlations between gene expression data sets; (3) gene set enrichment; (4) gene set overlap; (5) gene association by expression profile; (6) gene specificity; and (7) statistical analysis (ANOVA, pairwise comparison, and PCA). ExAtlas produces graphical outputs, including heatmaps, scatter-plots, bar-charts, and three-dimensional images. Some of the most widely used public data sets (e.g. GNF/BioGPS, Gene Ontology, KEGG, GAD phenotypes, BrainScan, ENCODE ChIP-seq, and protein-protein interaction) are pre-loaded and can be used for functional annotations.
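Two of the standard meta-analysis methods named in this record (Fisher's method and the fixed-effects model) can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not ExAtlas code; the function names and the example inputs are assumptions made for the sketch.

```python
import math

def fisher_combined_p(p_values):
    """Combine independent p-values (each > 0) with Fisher's method."""
    k = len(p_values)
    # Test statistic X = -2 * sum(ln p_i), chi-square with 2k degrees of freedom.
    x = -2.0 * sum(math.log(p) for p in p_values)
    # For an even number of degrees of freedom the chi-square survival
    # function has a closed form: exp(-x/2) * sum_{i<k} (x/2)^i / i!
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total

def fixed_effect(estimates, std_errors):
    """Inverse-variance weighted (fixed-effect) pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))

# Hypothetical inputs: p-values and log-ratios for one gene across three data sets.
combined_p = fisher_combined_p([0.04, 0.10, 0.20])
pooled_effect, pooled_se = fixed_effect([0.8, 1.1, 0.6], [0.3, 0.4, 0.5])
```

Fisher's method only needs p-values, whereas the fixed-effect estimate needs an effect size and a standard error per data set, which is one reason a tool may offer both.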
Inventory of high-abundance mRNAs in skeletal muscle of normal men.

PubMed

Welle, S; Bhatt, K; Thornton, C A

1999-05-01

The serial analysis of gene expression (SAGE) method was used to generate a catalog of 53,875 short (14 base) expressed sequence tags from polyadenylated RNA obtained from vastus lateralis muscle of healthy young men. Over 12,000 unique tags were detected. The frequency of occurrence of each tag reflects the relative abundance of the corresponding mRNA. The mRNA species that were detected 10 or more times, each comprising ≥0.02% of the mRNA population, accounted for 64% of the mRNA mass but <10% of the total number of mRNA species detected. Almost all of the abundant tags matched mRNA or EST sequences cataloged in GenBank. Mitochondrial transcripts accounted for approximately 20% of the polyadenylated RNA. Transcripts encoding proteins of the myofibrils were the most abundant nuclear-encoded mRNAs. Transcripts encoding ribosomal proteins, and those encoding proteins involved in energy metabolism, also were very abundant. The database can be used as a reference for investigations of alterations in gene expression associated with conditions that influence muscle function, such as muscular dystrophies, aging, and exercise.
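The relation between tag counts and relative abundance in this record is simple arithmetic: a tag seen 10 times among 53,875 tags makes up roughly 0.02% of the catalog. A minimal sketch follows, assuming the tags are available as a plain list of strings; the function name and the toy tag list are hypothetical.

```python
from collections import Counter

def tag_abundance(tags):
    """Return the relative abundance (in percent) of each distinct tag."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: 100.0 * n / total for tag, n in counts.items()}

# Threshold quoted in the abstract: 10 occurrences out of 53,875 tags.
threshold_percent = 100.0 * 10 / 53875

# Toy example with made-up tags (real SAGE tags in this study were 14 bases long).
toy = tag_abundance(["AAAC", "AAAC", "GGTC", "TTAG"])
```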
Accounting for host cell protein behavior in anion-exchange chromatography.

PubMed

Swanson, Ryan K; Xu, Ruo; Nettleton, Daniel S; Glatz, Charles E

2016-11-01

Host cell proteins (HCP) are a problematic set of impurities in downstream processing (DSP) as they behave most similarly to the target protein during separation. Approaching DSP with the knowledge of HCP separation behavior would be beneficial for the production of high purity recombinant biologics. Therefore, this work was aimed at characterizing the separation behavior of complex mixtures of HCP during a commonly used method: anion-exchange chromatography (AEX). An additional goal was to evaluate the performance of a statistical methodology, based on the characterization data, as a tool for predicting protein separation behavior. Aqueous two-phase partitioning followed by two-dimensional electrophoresis provided data on the three physicochemical properties most commonly exploited during DSP for each HCP: pI (isoelectric point), molecular weight, and surface hydrophobicity. The protein separation behaviors of two alternative expression host extracts (corn germ and E. coli) were characterized. A multivariate random forest (MVRF) statistical methodology was then applied to the database of characterized proteins creating a tool for predicting the AEX behavior of a mixture of proteins. The accuracy of the MVRF method was determined by calculating a root mean squared error value for each database. This measure never exceeded a value of 0.045 (fraction of protein populating each of the multiple separation fractions) for AEX. © 2016 American Institute of Chemical Engineers Biotechnol. Prog., 32:1453-1463, 2016.
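The accuracy measure mentioned in this record, a root mean squared error over the fraction of protein found in each separation fraction, can be illustrated as follows. The numbers are invented for the example and are not data from the study; only the RMSE formula itself is standard.

```python
import math

def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical fractions of one protein recovered in four AEX elution fractions.
predicted_fractions = [0.10, 0.30, 0.40, 0.20]
observed_fractions = [0.12, 0.28, 0.45, 0.15]
error = rmse(predicted_fractions, observed_fractions)
```

In this made-up case the error is about 0.038, which would fall under the 0.045 bound reported in the abstract.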
ProBiS-database: precalculated binding site similarities and local pairwise alignments of PDB structures.

PubMed

Konc, Janez; Cesnik, Tomo; Konc, Joanna Trykowska; Penca, Matej; Janežič, Dušanka

2012-02-27

ProBiS-Database is a searchable repository of precalculated local structural alignments in proteins detected by the ProBiS algorithm in the Protein Data Bank. Identification of functionally important binding regions of the protein is facilitated by structural similarity scores mapped to the query protein structure. PDB structures that have been aligned with a query protein may be rapidly retrieved from the ProBiS-Database, which is thus able to generate hypotheses concerning the roles of uncharacterized proteins. Presented with uncharacterized protein structure, ProBiS-Database can discern relationships between such a query protein and other better known proteins in the PDB. Fast access and a user-friendly graphical interface promote easy exploration of this database of over 420 million local structural alignments. The ProBiS-Database is updated weekly and is freely available online at http://probis.cmm.ki.si/database.
Integration of deep transcriptome and proteome analyses reveals the components of alkaloid metabolism in opium poppy cell cultures

PubMed Central

2010-01-01

Background Papaver somniferum (opium poppy) is the source for several pharmaceutical benzylisoquinoline alkaloids including morphine, codeine and sanguinarine. In response to treatment with a fungal elicitor, the biosynthesis and accumulation of sanguinarine is induced along with other plant defense responses in opium poppy cell cultures. The transcriptional induction of alkaloid metabolism in cultured cells provides an opportunity to identify components of this process via the integration of deep transcriptome and proteome databases generated using next-generation technologies. Results A cDNA library was prepared for opium poppy cell cultures treated with a fungal elicitor for 10 h. Using 454 GS-FLX Titanium pyrosequencing, 427,369 expressed sequence tags (ESTs) with an average length of 462 bp were generated. Assembly of these sequences yielded 93,723 unigenes, of which 23,753 were assigned Gene Ontology annotations. Transcripts encoding all known sanguinarine biosynthetic enzymes were identified in the EST database, 5 of which were represented among the 50 most abundant transcripts. Liquid chromatography-tandem mass spectrometry (LC-MS/MS) of total protein extracts from cell cultures treated with a fungal elicitor for 50 h facilitated the identification of 1,004 proteins. Proteins were fractionated by one-dimensional SDS-PAGE and digested with trypsin prior to LC-MS/MS analysis. Query of an opium poppy-specific EST database substantially enhanced peptide identification. Eight out of 10 known sanguinarine biosynthetic enzymes and many relevant primary metabolic enzymes were represented in the peptide database. Conclusions The integration of deep transcriptome and proteome analyses provides an effective platform to catalogue the components of secondary metabolism, and to identify genes encoding uncharacterized enzymes. 
The establishment of corresponding transcript and protein databases generated by next-generation technologies in a system with a well-defined metabolite profile facilitates an improved linkage between genes, enzymes, and pathway components. The proteome database represents the most relevant alkaloid-producing enzymes, compared with the much deeper and more complete transcriptome library. The transcript database contained full-length mRNAs encoding most alkaloid biosynthetic enzymes, which is a key requirement for the functional characterization of novel gene candidates. PMID:21083930

Molecular cloning, characterization and differential expression of novel phytocystatin gene during tropospheric ozone stress in maize (Zea mays) leaves.

PubMed

Ahmad, Rafiq; Zuily-Fodil, Yasmine; Passaquet, Chantal; Ali Khan, Sabaz; Repellin, Anne

2015-03-01

In this study, a full-length cDNA encoding a novel phytocystatin gene, designated CC14, was identified in maize leaves. The CC14 gene sequence reported in this study has been deposited in the GenBank database (accession number JF290478). The CC14 gene was cloned into an expression vector pET30 EK/LIC and was then transformed into Escherichia coli strain BL21 (DE3) pLysS to produce a recombinant CC14 protein. The recombinant protein was purified by nickel nitrilotriacetic acid affinity chromatography after induction with 1 mM IPTG. The purified CC14 protein was electrophoresed on SDS-PAGE and a protein 25 kDa in size was observed. Antiprotease activities of the purified recombinant CC14 protein against cysteine proteases and commercially available papain were tested. The results showed that CC14 purified protein suppressed 100% activity of papain and 57-86% plant cysteine protease activity. Moreover, an upregulation of CC14 gene expression was observed after 20 days of ozone stress in maize leaves. Together, these observations concurred to conclude that CC14 gene could potentially be used as a basis for the development of transgenic crops and natural pesticides that resist biotic and abiotic stresses. Copyright © 2014 Académie des sciences. Published by Elsevier SAS. All rights reserved.
Micro-proteomics with iterative data analysis: Proteome analysis in C. elegans at the single worm level.

PubMed

Bensaddek, Dalila; Narayan, Vikram; Nicolas, Armel; Murillo, Alejandro Brenes; Gartner, Anton; Kenyon, Cynthia J; Lamond, Angus I

2016-02-01

Proteomics studies typically analyze proteins at a population level, using extracts prepared from tens of thousands to millions of cells. The resulting measurements correspond to average values across the cell population and can mask considerable variation in protein expression and function between individual cells or organisms. Here, we report the development of micro-proteomics for the analysis of Caenorhabditis elegans, a eukaryote composed of 959 somatic cells and ∼1500 germ cells, measuring the worm proteome at a single organism level to a depth of ∼3000 proteins. This includes detection of proteins across a wide dynamic range of expression levels (>6 orders of magnitude), including many chromatin-associated factors involved in chromosome structure and gene regulation. We apply the micro-proteomics workflow to measure the global proteome response to heat-shock in individual nematodes. This shows variation between individual animals in the magnitude of proteome response following heat-shock, including variable induction of heat-shock proteins. The micro-proteomics pipeline thus facilitates the investigation of stochastic variation in protein expression between individuals within an isogenic population of C. elegans. All data described in this study are available online via the Encyclopedia of Proteome Dynamics (http://www.peptracker.com/epd), an open access, searchable database resource. © 2015 The Authors. PROTEOMICS Published by Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Proteomics of xenografted human breast cancer indicates novel targets related to tamoxifen resistance.

PubMed

Besada, Vladimir; Diaz, Maylin; Becker, Michael; Ramos, Yassel; Castellanos-Serra, Lila; Fichtner, Iduna

2006-02-01

Tamoxifen is the most frequently used drug for hormone therapy of breast cancer patients, even though a high percentage of women are (or become) refractory to this treatment. The proteins involved in tamoxifen resistance of breast tumor cells as well as the mechanisms by which they interact, are still unknown. Some years ago, we established the xenograft breast tumor 3366, sensitive to tamoxifen and the 3366/TAM, resistant to tamoxifen, derived after two years of in vivo passages of the parental 3366 under tamoxifen treatment. Here, we compare the protein expression levels of both xenografts. 2-DE of proteins from total cell extracts showed very high reproducibility among tumors from each group (tamoxifen sensitive and tamoxifen resistant). The heuristic clustering analysis of these gels pooled them correctly in both groups. Twelve proteins were found up-regulated in the tamoxifen-resistant line, while nine were down-regulated. The proteins differentially expressed were identified by MS and sequence database analysis. Biological functions of these proteins are related to cell-cell adhesion and interaction, signal transduction, DNA and protein synthesis machinery, mitochondrial respiratory chain, oxidative stress processes and apoptosis. Three of the identified proteins (ALG-2 interacting protein and two GDP-dissociation inhibitors) could be directly involved in the resistance phenomenon.
Multiparametric Determination of Radiation Risk

NASA Technical Reports Server (NTRS)

Richmond, Robert C.

2003-01-01

Predicting risk of human cancer following exposure to ionizing space radiation is challenging in part because uncertainties of low-dose distribution amongst cells, of unknown potentially synergistic effects of microgravity upon cellular protein-expression, and of processing dose-related damage within cells to produce rare and late-appearing malignant transformation degrade the confidence of cancer risk-estimates. The NASA-specific responsibility to estimate the risks of radiogenic cancer in a limited number of astronauts is not amenable to epidemiologic study, thereby increasing this challenge. Developing adequately sensitive cellular biodosimeters that simultaneously report 1) the quantity of absorbed dose after exposure to ionizing radiation, 2) the quality of radiation delivering that dose, and 3) the risk of developing malignant transformation by the cells absorbing that dose could be useful for resolving these challenges. Use of a multiparametric cellular biodosimeter is suggested using analyses of gene-expression and protein-expression whereby large datasets of cellular response to radiation-induced damage are obtained and analyzed for expression-profiles correlated with established end points and molecular markers predictive for cancer-risk. Analytical techniques of genomics and proteomics may be used to establish dose-dependency of multiple gene- and protein-expressions resulting from radiation-induced cellular damage. Furthermore, gene- and protein-expression from cells in microgravity are known to be altered relative to cells grown on the ground at 1g. Therefore, hypotheses are proposed that 1) macromolecular expression caused by radiation-induced damage in cells in microgravity may be different than on the ground, and 2) different patterns of macromolecular expression in microgravity may alter human radiogenic cancer risk relative to radiation exposure on Earth. 
A new paradigm is accordingly suggested as a national database wherein genomic and proteomic datasets are registered and interrogated in order to provide statistically significant dose-dependent risk estimation of radiogenic cancer in astronauts.
Prediction of novel target genes and pathways involved in bevacizumab-resistant colorectal cancer

PubMed Central

Makondi, Precious Takondwa; Lee, Chia-Hwa; Huang, Chien-Yu; Chu, Chi-Ming; Chang, Yu-Jia

2018-01-01

Bevacizumab combined with cytotoxic chemotherapy is the backbone of metastatic colorectal cancer (mCRC) therapy; however, its treatment efficacy is hampered by therapeutic resistance. Therefore, understanding the mechanisms underlying bevacizumab resistance is crucial to increasing the therapeutic efficacy of bevacizumab. The Gene Expression Omnibus (GEO) database (dataset, GSE86525) was used to identify the key genes and pathways involved in bevacizumab-resistant mCRC. The GEO2R web tool was used to identify differentially expressed genes (DEGs). Functional and pathway enrichment analyses of the DEGs were performed using the Database for Annotation, Visualization, and Integrated Discovery (DAVID). Protein–protein interaction (PPI) networks were established using the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database and visualized using Cytoscape software. A total of 124 DEGs were obtained, 57 of which were upregulated and 67 were downregulated. PPI network analysis showed that seven upregulated genes and nine downregulated genes exhibited high PPI degrees. In the functional enrichment, the DEGs were mainly enriched in negative regulation of phosphate metabolic process and positive regulation of cell cycle process gene ontologies (GOs); the enriched pathways were the phosphoinositide 3-kinase-serine/threonine kinase signaling pathway, bladder cancer, and microRNAs in cancer. Cyclin-dependent kinase inhibitor 1A (CDKN1A), toll-like receptor 4 (TLR4), CD19 molecule (CD19), breast cancer 1, early onset (BRCA1), platelet-derived growth factor subunit A (PDGFA), and matrix metallopeptidase 1 (MMP1) were the DEGs involved in the pathways and the PPIs. The clinical validation of the DEGs in mCRC (TNM clinical stages 3 and 4) revealed that high PDGFA expression levels were associated with poor overall survival, whereas high BRCA1 and MMP1 expression levels were associated with favorable progression-free survival (PFS). 
The identified genes and pathways can be potential targets and predictors of therapeutic resistance and prognosis in bevacizumab-treated patients with mCRC. PMID:29342159
Generation and Analysis of a Large-Scale Expressed Sequence Tag Database from a Full-Length Enriched cDNA Library of Developing Leaves of Gossypium hirsutum L

PubMed Central

Pang, Chaoyou; Fan, Shuli; Song, Meizhen; Yu, Shuxun

2013-01-01

Background Cotton (Gossypium hirsutum L.) is one of the world’s most economically-important crops. However, its entire genome has not been sequenced, and limited resources are available in GenBank for understanding the molecular mechanisms underlying leaf development and senescence. Methodology/Principal Findings In this study, 9,874 high-quality ESTs were generated from a normalized, full-length cDNA library derived from pooled RNA isolated from throughout leaf development during the plant blooming stage. After clustering and assembly of these ESTs, 5,191 unique sequences, representing 1,652 contigs and 3,539 singletons, were obtained. The average unique sequence length was 682 bp. Annotation of these unique sequences revealed that 84.4% showed significant homology to sequences in the NCBI non-redundant protein database, and 57.3% had significant hits to known proteins in the Swiss-Prot database. Comparative analysis indicated that our library added 2,400 ESTs and 991 unique sequences to those known for cotton. The unigenes were functionally characterized by gene ontology annotation. We identified 1,339 and 200 unigenes as potential leaf senescence-related genes and transcription factors, respectively. Moreover, nine genes related to leaf senescence and eleven MYB transcription factors were randomly selected for quantitative real-time PCR (qRT-PCR), which revealed that these genes were regulated differentially during senescence. The qRT-PCR for three GhYLSs revealed that these genes are expressed preferentially in senescent leaves. Conclusions/Significance These EST resources will provide valuable sequence information for gene expression profiling analyses and functional genomics studies to elucidate their roles, as well as for studying the mechanisms of leaf development and senescence in cotton and discovering candidate genes related to important agronomic traits of cotton. 
These data will also facilitate future whole-genome sequence assembly and annotation in G. hirsutum and comparative genomics among Gossypium species. PMID:24146870
Topological and organizational properties of the products of house-keeping and tissue-specific genes in protein-protein interaction networks.

PubMed

Lin, Wen-Hsien; Liu, Wei-Chung; Hwang, Ming-Jing

2009-03-11

Human cells of various tissue types differ greatly in morphology despite having the same set of genetic information. Some genes are expressed in all cell types to perform house-keeping functions, while some are selectively expressed to perform tissue-specific functions. In this study, we wished to elucidate how proteins encoded by human house-keeping genes and tissue-specific genes are organized in human protein-protein interaction networks. We constructed protein-protein interaction networks for different tissue types using two gene expression datasets and one protein-protein interaction database. We then calculated three network indices of topological importance, the degree, closeness, and betweenness centralities, to measure the network position of proteins encoded by house-keeping and tissue-specific genes, and quantified their local connectivity structure. Compared to a random selection of proteins, house-keeping gene-encoded proteins tended to have a greater number of directly interacting neighbors and occupy network positions in several shortest paths of interaction between protein pairs, whereas tissue-specific gene-encoded proteins did not. In addition, house-keeping gene-encoded proteins tended to connect with other house-keeping gene-encoded proteins in all tissue types, whereas tissue-specific gene-encoded proteins also tended to connect with other tissue-specific gene-encoded proteins, but only in approximately half of the tissue types examined. Our analysis showed that house-keeping gene-encoded proteins tend to occupy important network positions, while those encoded by tissue-specific genes do not. The biological implications of our findings were discussed and we proposed a hypothesis regarding how cells organize their protein tools in protein-protein interaction networks. 
Our results led us to speculate that house-keeping gene-encoded proteins might form a core in human protein-protein interaction networks, while clusters of tissue-specific gene-encoded proteins are attached to the core at more peripheral positions of the networks.
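The three network indices named in this record (degree, closeness, and betweenness centrality) can be computed on a small unweighted, undirected graph as sketched below. This is an illustrative standard-library implementation using breadth-first search and Brandes-style accumulation, not the authors' pipeline; the toy network and protein names are hypothetical.

```python
from collections import deque

def centralities(adj):
    """adj maps each node to the set of its neighbours (undirected, unweighted)."""
    nodes = list(adj)
    degree = {v: len(adj[v]) for v in nodes}
    closeness = {}
    betweenness = {v: 0.0 for v in nodes}
    for s in nodes:
        # Breadth-first search from s, counting shortest paths (sigma)
        # and recording predecessors on those paths.
        dist = {s: 0}
        sigma = {v: 0 for v in nodes}
        sigma[s] = 1
        preds = {v: [] for v in nodes}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Closeness: number of reachable nodes divided by the sum of distances.
        total = sum(dist.values())
        closeness[s] = (len(dist) - 1) / total if total > 0 else 0.0
        # Accumulate pair dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                betweenness[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    betweenness = {v: b / 2.0 for v, b in betweenness.items()}
    return degree, closeness, betweenness

# Toy network: a hub protein "HK1" linking three otherwise unconnected proteins.
toy_network = {
    "HK1": {"TS1", "TS2", "TS3"},
    "TS1": {"HK1"},
    "TS2": {"HK1"},
    "TS3": {"HK1"},
}
deg, clo, btw = centralities(toy_network)
```

In the toy network the hub has the highest value on all three indices, which mirrors the kind of central position the abstract attributes to house-keeping gene products.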
Nano-LC-ESI MS/MS analysis of proteins in dried sea dragon Solenognathus hardwickii and bioinformatic analysis of its protein expression profiling.

PubMed

Zhang, Dong-Mei; Feng, Li-Xing; Li, Lu; Liu, Miao; Jiang, Bao-Hong; Yang, Min; Li, Guo-Qiang; Wu, Wan-Ying; Guo, De-An; Liu, Xuan

2016-09-01

The sea dragon Solenognathus hardwickii has long been used as a traditional Chinese medicine for the treatment of various diseases, such as male impotency. To gain a comprehensive insight into the protein components of the sea dragon, shotgun proteomic analysis of its protein expression profiling was conducted in the present study. Proteins were extracted from dried sea dragon using a trichloroacetic acid/acetone precipitation method and then separated by SDS-PAGE. The protein bands were cut from the gel and digested by trypsin to generate a peptide mixture. The peptide fragments were then analyzed using nano liquid chromatography tandem mass spectrometry (nano-LC-ESI MS/MS). A total of 810 proteins and 1,577 peptides were identified in the dried sea dragon. The identified proteins exhibited molecular weight values ranging from 1,900 to 3,516,900 Da and pI values from 3.8 to 12.18. Bioinformatic analysis was conducted using the DAVID Bioinformatics Resources 6.7 Gene Ontology (GO) analysis tool to explore possible functions of the identified proteins. Ascribed functions of the proteins mainly included intracellular non-membrane-bound organelle, non-membrane-bounded organelle, cytoskeleton, structural molecule activity, calcium ion binding, etc. Furthermore, possible signal networks of the identified proteins were predicted using the STRING (Search Tool for the Retrieval of Interacting Genes) database. Ribosomal protein synthesis was found to play an important role in the signal network. The results of this study, to the best of our knowledge, are the first to provide a reference proteome profile for the sea dragon, and should aid in the understanding of the expression and functions of the identified proteins. Copyright © 2016 China Pharmaceutical University. Published by Elsevier B.V. All rights reserved.
Gene Expression Profiling Reveals Functional Specialization along the Intestinal Tract of a Carnivorous Teleostean Fish (Dicentrarchus labrax)

PubMed Central

Calduch-Giner, Josep A.; Sitjà-Bobadilla, Ariadna; Pérez-Sánchez, Jaume

2016-01-01

High-quality sequencing reads from the intestine of European sea bass were assembled, annotated by similarity against protein reference databases and combined with nucleotide sequences from public and private databases. After redundancy filtering, 24,906 non-redundant annotated sequences encoding 15,367 different gene descriptions were obtained. These annotated sequences were used to design a custom, high-density oligo-microarray (8 × 15 K) for the transcriptomic profiling of anterior (AI), middle (MI), and posterior (PI) intestinal segments. Similar molecular signatures were found for AI and MI segments, which were combined in a single group (AI-MI) whereas the PI outstood separately, with more than 1900 differentially expressed genes with a fold-change cutoff of 2. Functional analysis revealed that molecular and cellular functions related to feed digestion and nutrient absorption and transport were over-represented in AI-MI segments. By contrast, the initiation and establishment of immune defense mechanisms became especially relevant in PI, although the microarray expression profiling validated by qPCR indicated that these functional changes are gradual from anterior to posterior intestinal segments. This functional divergence occurred in association with spatial transcriptional changes in nutrient transporters and the mucosal chemosensing system via G protein-coupled receptors. These findings contribute to identify key indicators of gut functions and to compare different fish feeding strategies and immune defense mechanisms acquired along the evolution of teleosts. PMID:27610085
Gene Expression Profiling Reveals Functional Specialization along the Intestinal Tract of a Carnivorous Teleostean Fish (Dicentrarchus labrax).

PubMed

Calduch-Giner, Josep A; Sitjà-Bobadilla, Ariadna; Pérez-Sánchez, Jaume

2016-01-01

High-quality sequencing reads from the intestine of European sea bass were assembled, annotated by similarity against protein reference databases and combined with nucleotide sequences from public and private databases. After redundancy filtering, 24,906 non-redundant annotated sequences encoding 15,367 different gene descriptions were obtained. These annotated sequences were used to design a custom, high-density oligo-microarray (8 × 15 K) for the transcriptomic profiling of anterior (AI), middle (MI), and posterior (PI) intestinal segments. Similar molecular signatures were found for AI and MI segments, which were combined in a single group (AI-MI) whereas the PI outstood separately, with more than 1900 differentially expressed genes with a fold-change cutoff of 2. Functional analysis revealed that molecular and cellular functions related to feed digestion and nutrient absorption and transport were over-represented in AI-MI segments. By contrast, the initiation and establishment of immune defense mechanisms became especially relevant in PI, although the microarray expression profiling validated by qPCR indicated that these functional changes are gradual from anterior to posterior intestinal segments. This functional divergence occurred in association with spatial transcriptional changes in nutrient transporters and the mucosal chemosensing system via G protein-coupled receptors. These findings contribute to identify key indicators of gut functions and to compare different fish feeding strategies and immune defense mechanisms acquired along the evolution of teleosts.
The prognostic role of Leucine-rich repeat-containing G-protein-coupled receptor 5 in gastric cancer: A systematic review with meta-analysis.

PubMed

Huang, Tianchen; Qiu, Xinguang; Xiao, Jianan; Wang, Qingbing; Wang, Yanjun; Zhang, Yong; Bai, Dongxiao

2016-04-01

The prognostic value of Leucine-rich repeat-containing G-protein-coupled receptor 5 (LGR5) in gastric cancer remains controversial. To further investigate this relationship, we performed meta-analyses to systematically review the association between LGR5 expression and various clinical parameters in gastric cancer patients. Eligible studies from PubMed, Embase, Web of Science, CNKI (Chinese National Knowledge Infrastructure), Wanfang (Database of Chinese Ministry of Science & Technology) and CBM (China Biological Medicine) databases were evaluated to investigate the association of LGR5 expression with overall survival (OS) and clinicopathological features of gastric cancer. LGR5 overexpression was significantly associated with poor OS in patients with gastric cancer (HR 1.66, 95% CI 1.02-2.69). LGR5 overexpression was also significantly associated with TNM stage (TIII/TIV vs TI/TII: OR 5.42, 95% CI 1.02-28.72) and lymph node metastasis (positive vs negative: OR 2.30, 95% CI 1.06-5.0). Our meta-analysis indicates that LGR5 may be a predictive factor for invasion and metastasis, and poor prognosis in patients with gastric cancer. Copyright © 2015 Elsevier Masson SAS. All rights reserved.
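The pooled hazard ratio reported in this record (HR 1.66, 95% CI 1.02-2.69) can be converted to a log-scale standard error and an approximate two-sided p-value, which is a common way to check that a confidence interval excluding 1 corresponds to p < 0.05. The sketch below assumes a normal approximation on the log scale and a symmetric 95% interval; the function names are illustrative.

```python
import math

def log_ratio_and_se(ratio, ci_low, ci_high, z=1.96):
    """Recover the log ratio and its standard error from a reported 95% CI."""
    log_ratio = math.log(ratio)
    # On the log scale the 95% CI spans 2 * 1.96 standard errors.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return log_ratio, se

def two_sided_p(log_ratio, se):
    """Two-sided p-value under a normal approximation: 2 * (1 - Phi(|z|))."""
    z = abs(log_ratio / se)
    return math.erfc(z / math.sqrt(2))

# Values quoted in the abstract for overall survival.
log_hr, se_hr = log_ratio_and_se(1.66, 1.02, 2.69)
p_hr = two_sided_p(log_hr, se_hr)
```

With these inputs the p-value comes out just under 0.05, consistent with a lower confidence bound that sits only slightly above 1.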
Choosing an Optimal Database for Protein Identification from Tandem Mass Spectrometry Data.

PubMed

Kumar, Dhirendra; Yadav, Amit Kumar; Dash, Debasis

2017-01-01

Database searching is the preferred method for protein identification from digital spectra of mass to charge ratios (m/z) detected for protein samples through mass spectrometers. The search database is one of the major influencing factors in discovering proteins present in the sample and thus in deriving biological conclusions. In most cases the choice of search database is arbitrary. Here we describe common search databases used in proteomic studies and their impact on final list of identified proteins. We also elaborate upon factors like composition and size of the search database that can influence the protein identification process. In conclusion, we suggest that choice of the database depends on the type of inferences to be derived from proteomics data. However, making additional efforts to build a compact and concise database for a targeted question should generally be rewarding in achieving confident protein identifications.
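The core idea of database searching described in this record, comparing observed peptide masses against masses computed from candidate protein sequences, can be sketched as follows. This is a simplified peptide-mass matching example, not the method of any particular search engine: the cleavage rule (after K or R, not before P), the mass tolerance, and the toy database entries are assumptions, and real tools also handle missed cleavages, modifications, and MS/MS fragment scoring.

```python
import re

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

def tryptic_peptides(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    # Splitting on a zero-width pattern requires Python 3.7 or newer.
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]

def peptide_mass(peptide):
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(MONO[aa] for aa in peptide) + WATER

def count_matches(observed_masses, database, tol=0.01):
    """For each database entry, count observed masses matched by a tryptic peptide."""
    hits = {}
    for name, seq in database.items():
        masses = [peptide_mass(p) for p in tryptic_peptides(seq)]
        hits[name] = sum(
            any(abs(m - obs) <= tol for m in masses) for obs in observed_masses
        )
    return hits

# Hypothetical two-entry database and observed masses taken from the first entry.
toy_db = {"protA": "GASPKVTCR", "protB": "MHFWKYYR"}
observed = [peptide_mass("GASPK"), peptide_mass("VTCR")]
scores = count_matches(observed, toy_db)
```

Every extra sequence in the database adds more candidate masses, so the chance of a spurious match within the tolerance grows with database size, which is one way to see why the abstract favors a compact, targeted database.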
Initial proteome analysis of caffeine-induced proteins in Aspergillus tamarii using two-dimensional fluorescence difference gel electrophoresis.

PubMed

Gutiérrez-Sánchez, Gerardo; Atwood, James; Kolli, V S Kumar; Roussos, Sévastianos; Augur, Christopher

2012-04-01

Caffeine is toxic to most microorganisms. However, some filamentous fungi, such as Aspergillus tamarii, are able to metabolize this alkaloid when fed caffeine as the sole nitrogen source. The aim of the present work was to identify intracellular A. tamarii proteins, regulated by caffeine, using fluorescence difference two-dimensional gel electrophoresis. Specific proteins from two culture media of A. tamarii grown either on ammonium sulfate or caffeine as the sole nitrogen source were analysed by mass spectrometry. Thirteen out of a total of 85 differentially expressed spots were identified after database search. Identified up-regulated proteins include phosphoglycerate kinase, malate dehydrogenase, dyp-type peroxidase family protein, heat shock protein, Cu,Zn superoxide dismutase and xanthine dehydrogenase. Some of the proteins identified in this study are involved in the caffeine degradation pathway as well as in stress response, suggesting that stress proteins could be involved in caffeine metabolism in filamentous fungi.
Complement C4a inhibits the secretion of hepatitis B virus screened by surface-enhanced laser desorption ionization time-flight mass spectrometry-based ProteinChip analysis.

PubMed

Song, Ya-Nan; Zhang, Gui-Biao; Hu, Xue-Qing; Lu, Yi-Yu; Zhao, Yu; Yang, Yang; Yang, Yi-Fu; Zhang, Yong-Yu; Hu, Yi-Yang; Su, Shi-Bing

2015-12-01

Chronic hepatitis B (CHB) is a chronic liver disease caused by persistent hepatitis B virus (HBV) infection. This study aimed to identify factors of host resistance to HBV and investigate their roles. Protein profiles of 58 healthy controls and 121 CHB patients were obtained by SELDI-TOF/MS. The predicted protein was validated by ELISA. Protein expression was evaluated by Western blot in the persistently HBV-expressing cell line HepG2.2.15 and the non-HBV-expressing cell line HepG2. The level of HBV DNA was subsequently detected by quantitative real-time PCR in HepG2.2.15 cells with complement C4a treatment. Significantly altered protein peaks were found through statistical analysis, and the peak at m/z 4300 was predicted by database searching and successfully matched with a fragment of complement C4a. According to ELISA, serum complement C4a was significantly lower in CHB patients than in healthy controls (p < 0.001), and the area under the receiver operating characteristic curve was 0.78. Furthermore, complement C4a showed lower expression in HepG2.2.15 cells, and the secretion of HBV DNA was inhibited by complement C4a. The present study implies an important role for complement C4a in inhibiting HBV DNA secretion in CHB. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
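The area under the ROC curve quoted above has a simple pairwise interpretation: it is the probability that a randomly chosen control has a higher marker value than a randomly chosen patient. The sketch below computes it that way. The function name and the serum values are hypothetical and are not data from the study; the direction (controls higher than patients) is chosen to match the reported finding that C4a was lower in CHB patients.

```python
def roc_auc(controls, patients):
    """Fraction of (control, patient) pairs in which the control value is
    higher; ties count as half a pair."""
    wins = 0.0
    for c in controls:
        for p in patients:
            if c > p:
                wins += 1.0
            elif c == p:
                wins += 0.5
    return wins / (len(controls) * len(patients))


# Hypothetical marker levels in arbitrary units, for illustration only.
example_controls = [5.1, 4.2, 6.0]
example_patients = [3.9, 4.5, 2.8]
example_auc = roc_auc(example_controls, example_patients)
```

A value of 0.5 would mean the marker does not separate the groups at all, and 1.0 would mean perfect separation, so the reported 0.78 indicates moderate discrimination.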
Discovering genes associated with dormancy in the monogonont rotifer Brachionus plicatilis

PubMed Central

Denekamp, Nadav Y; Thorne, Michael AS; Clark, Melody S; Kube, Michael; Reinhardt, Richard; Lubzens, Esther

2009-01-01

Background: Microscopic monogonont rotifers, including the euryhaline species Brachionus plicatilis, are typically found in water bodies where environmental factors restrict population growth to short periods lasting days or months. The survival of the population is ensured via the production of resting eggs that show a remarkable tolerance to unfavorable conditions and remain viable for decades. The aim of this study was to generate Expressed Sequence Tags (ESTs) for molecular characterisation of processes associated with the formation of resting eggs, their survival during dormancy and hatching. Results: Four normalized and four subtractive libraries were constructed to provide a resource for rotifer transcriptomics associated with resting-egg formation, storage and hatching. A total of 47,926 sequences were assembled into 18,000 putative transcripts and analyzed using both Blast and GO annotation. About 28–55% (depending on the library) of the clones produced significant matches against the Swissprot and Trembl databases. Genes known to be associated with desiccation tolerance during dormancy in other organisms were identified in the EST libraries. These included genes associated with antioxidant activity, low molecular weight heat shock proteins and Late Embryonic Abundant (LEA) proteins. Real-time PCR confirmed that LEA transcripts, small heat-shock proteins and some antioxidant genes were upregulated in resting eggs, therefore suggesting that desiccation tolerance is a characteristic feature of resting eggs even though they do not necessarily fully desiccate during dormancy. The role of trehalose in resting-egg formation and survival remains unclear since there was no significant difference between resting-egg producing females and amictic females in the expression of the tps-1 gene. In view of the absence of vitellogenin transcripts, matches to lipoprotein lipase proteins suggest that, similar to the situation in dipterans, these proteins may serve as the yolk proteins in rotifers. Conclusion: The 47,926 ESTs expand significantly the current sequence resource of B. plicatilis. It describes, for the first time, genes putatively associated with resting eggs and will serve as a database for future global expression experiments, particularly for the further identification of dormancy related genes. PMID:19284654
Discovering genes associated with dormancy in the monogonont rotifer Brachionus plicatilis.

PubMed

Denekamp, Nadav Y; Thorne, Michael A S; Clark, Melody S; Kube, Michael; Reinhardt, Richard; Lubzens, Esther

2009-03-13

Microscopic monogonont rotifers, including the euryhaline species Brachionus plicatilis, are typically found in water bodies where environmental factors restrict population growth to short periods lasting days or months. The survival of the population is ensured via the production of resting eggs that show a remarkable tolerance to unfavorable conditions and remain viable for decades. The aim of this study was to generate Expressed Sequence Tags (ESTs) for molecular characterisation of processes associated with the formation of resting eggs, their survival during dormancy and hatching. Four normalized and four subtractive libraries were constructed to provide a resource for rotifer transcriptomics associated with resting-egg formation, storage and hatching. A total of 47,926 sequences were assembled into 18,000 putative transcripts and analyzed using both Blast and GO annotation. About 28-55% (depending on the library) of the clones produced significant matches against the Swissprot and Trembl databases. Genes known to be associated with desiccation tolerance during dormancy in other organisms were identified in the EST libraries. These included genes associated with antioxidant activity, low molecular weight heat shock proteins and Late Embryonic Abundant (LEA) proteins. Real-time PCR confirmed that LEA transcripts, small heat-shock proteins and some antioxidant genes were upregulated in resting eggs, therefore suggesting that desiccation tolerance is a characteristic feature of resting eggs even though they do not necessarily fully desiccate during dormancy. The role of trehalose in resting-egg formation and survival remains unclear since there was no significant difference between resting-egg producing females and amictic females in the expression of the tps-1 gene. 
In view of the absence of vitellogenin transcripts, matches to lipoprotein lipase proteins suggest that, similar to the situation in dipterans, these proteins may serve as the yolk proteins in rotifers. The 47,926 ESTs expand significantly the current sequence resource of B. plicatilis. It describes, for the first time, genes putatively associated with resting eggs and will serve as a database for future global expression experiments, particularly for the further identification of dormancy related genes.
SorghumFDB: sorghum functional genomics database with multidimensional network analysis.

PubMed

Tian, Tian; You, Qi; Zhang, Liwei; Yi, Xin; Yan, Hengyu; Xu, Wenying; Su, Zhen

2016-01-01

Sorghum (Sorghum bicolor [L.] Moench) has excellent agronomic traits and biological properties, such as heat and drought-tolerance. It is a C4 grass and potential bioenergy-producing plant, which makes it an important crop worldwide. With the sorghum genome sequence released, it is essential to establish a sorghum functional genomics data mining platform. We collected genomic data and some functional annotations to construct a sorghum functional genomics database (SorghumFDB). SorghumFDB integrated knowledge of sorghum gene family classifications (transcription regulators/factors, carbohydrate-active enzymes, protein kinases, ubiquitins, cytochrome P450, monolignol biosynthesis related enzymes, R-genes and organelle-genes), detailed gene annotations, miRNA and target gene information, orthologous pairs in the model plants Arabidopsis, rice and maize, gene loci conversions and a genome browser. We further constructed a dynamic network of multidimensional biological relationships, comprised of the co-expression data, protein-protein interactions and miRNA-target pairs. We took effective measures to combine the network, gene set enrichment and motif analyses to determine the key regulators that participate in related metabolic pathways, such as the lignin pathway, which is a major biological process in bioenergy-producing plants. Database URL: http://structuralbiology.cau.edu.cn/sorghum/index.html. © The Author(s) 2016. Published by Oxford University Press.
Essential proteins and possible therapeutic targets of Wolbachia endosymbiont and development of FiloBase-a comprehensive drug target database for Lymphatic filariasis

NASA Astrophysics Data System (ADS)

Sharma, Om Prakash; Kumar, Muthuvel Suresh

2016-01-01

Lymphatic filariasis (Lf) is one of the oldest and most debilitating tropical diseases. It is estimated to infect over 120 million people in at least 80 nations throughout the tropical and subtropical regions, and more than one billion people are at risk of contracting this life-threatening disease. Several studies have suggested emerging limitations of, and resistance to, the available drugs and therapeutic targets for Lf. Therefore, better medicines and drug targets are in demand. We took the initiative to identify the essential proteins of the Wolbachia endosymbiont of Brugia malayi (wBm) that are indispensable for its survival and non-homologous to human host proteins. In the current study, we used a subtractive proteomics approach to screen for possible therapeutic targets in wBm. In addition, the literature was mined extensively for potential drug targets, drugs, epitopes, crystal structures, and expressed sequence tag (EST) sequences for filariasis-causing nematodes. Data obtained from our study are presented in a user-friendly database named FiloBase. We hope that the information stored in this database may be used for further research and drug development against filariasis. URL: http://filobase.bicpu.edu.in.
Large-Scale Concatenation cDNA Sequencing

PubMed Central

Yu, Wei; Andersson, Björn; Worley, Kim C.; Muzny, Donna M.; Ding, Yan; Liu, Wen; Ricafrente, Jennifer Y.; Wentland, Meredith A.; Lennon, Greg; Gibbs, Richard A.

1997-01-01

A total of 100 kb of DNA derived from 69 individual human brain cDNA clones of 0.7–2.0 kb were sequenced by concatenated cDNA sequencing (CCS), whereby multiple individual DNA fragments are sequenced simultaneously in a single shotgun library. The method yielded accurate sequences and a similar efficiency compared with other shotgun libraries constructed from single DNA fragments (>20 kb). Computer analyses were carried out on 65 cDNA clone sequences and their corresponding end sequences to examine both nucleic acid and amino acid sequence similarities in the databases. Thirty-seven clones revealed no DNA database matches, 12 clones generated exact matches (≥98% identity), and 16 clones generated nonexact matches (57%–97% identity) to either known human or other species genes. Of those 28 matched clones, 8 had corresponding end sequences that failed to identify similarities. In a protein similarity search, 27 clone sequences displayed significant matches, whereas only 20 of the end sequences had matches to known protein sequences. Our data indicate that full-length cDNA insert sequences provide significantly more nucleic acid and protein sequence similarity matches than expressed sequence tags (ESTs) for database searching. [All 65 cDNA clone sequences described in this paper have been submitted to the GenBank data library under accession nos. U79240–U79304.] PMID:9110174
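The counts in this abstract can be cross-checked with a short tally. The sketch below only restates the reported numbers; the variable names are illustrative, and the figure of 20 end sequences with DNA matches is an inference from "28 matched clones, 8 of whose end sequences failed", not a number stated in the abstract.

```python
TOTAL_CLONES = 65

# DNA database search outcomes for the full-length insert sequences.
dna_outcomes = {"no_match": 37, "exact_match": 12, "nonexact_match": 16}

# Full-length sequences with any DNA match (exact or non-exact).
dna_matched_full = dna_outcomes["exact_match"] + dna_outcomes["nonexact_match"]

# Inferred: matched clones whose end sequences also found a similarity.
dna_matched_ends = dna_matched_full - 8

# Protein similarity search outcomes as reported.
protein_matched_full = 27
protein_matched_ends = 20


def percent(count):
    """Share of the 65 analysed clones, rounded to one decimal place."""
    return round(100 * count / TOTAL_CLONES, 1)


summary = {
    "dna_full_pct": percent(dna_matched_full),
    "dna_ends_pct": percent(dna_matched_ends),
    "protein_full_pct": percent(protein_matched_full),
    "protein_ends_pct": percent(protein_matched_ends),
}
```

On these figures, full-length inserts found matches for roughly 41 to 43% of clones, against about 31% for end sequences alone, which is the gap the authors point to.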
DenHunt - A Comprehensive Database of the Intricate Network of Dengue-Human Interactions

PubMed Central

Arjunan, Selvam; Sastri, Narayan P.; Chandra, Nagasuma

2016-01-01

Dengue virus (DENV) is a human pathogen and its etiology has been widely established. There are many interactions between DENV and human proteins that have been reported in literature. However, no publicly accessible resource for efficiently retrieving the information is yet available. In this study, we mined all publicly available dengue–human interactions that have been reported in the literature into a database called DenHunt. We retrieved 682 direct interactions of human proteins with dengue viral components, 382 indirect interactions and 4120 differentially expressed human genes in dengue infected cell lines and patients. We have illustrated the importance of DenHunt by mapping the dengue–human interactions on to the host interactome and observed that the virus targets multiple host functional complexes of important cellular processes such as metabolism, immune system and signaling pathways suggesting a potential role of these interactions in viral pathogenesis. We also observed that 7 percent of the dengue virus interacting human proteins are also associated with other infectious and non-infectious diseases. Finally, the understanding that comes from such analyses could be used to design better strategies to counteract the diseases caused by dengue virus. The whole dataset has been catalogued in a searchable database, called DenHunt (http://proline.biochem.iisc.ernet.in/DenHunt/). PMID:27618709

DenHunt - A Comprehensive Database of the Intricate Network of Dengue-Human Interactions.

PubMed

Karyala, Prashanthi; Metri, Rahul; Bathula, Christopher; Yelamanchi, Syam K; Sahoo, Lipika; Arjunan, Selvam; Sastri, Narayan P; Chandra, Nagasuma

2016-09-01

Dengue virus (DENV) is a human pathogen and its etiology has been widely established. There are many interactions between DENV and human proteins that have been reported in literature. However, no publicly accessible resource for efficiently retrieving the information is yet available. In this study, we mined all publicly available dengue-human interactions that have been reported in the literature into a database called DenHunt. We retrieved 682 direct interactions of human proteins with dengue viral components, 382 indirect interactions and 4120 differentially expressed human genes in dengue infected cell lines and patients. We have illustrated the importance of DenHunt by mapping the dengue-human interactions on to the host interactome and observed that the virus targets multiple host functional complexes of important cellular processes such as metabolism, immune system and signaling pathways suggesting a potential role of these interactions in viral pathogenesis. We also observed that 7 percent of the dengue virus interacting human proteins are also associated with other infectious and non-infectious diseases. Finally, the understanding that comes from such analyses could be used to design better strategies to counteract the diseases caused by dengue virus. The whole dataset has been catalogued in a searchable database, called DenHunt (http://proline.biochem.iisc.ernet.in/DenHunt/).
Identification of genes differentially expressed in ectomycorrhizal roots during the Pinus pinaster-Laccaria bicolor interaction.

PubMed

Flores-Monterroso, Aranzazu; Canales, Javier; de la Torre, Fernando; Ávila, Concepción; Cánovas, Francisco M

2013-06-01

Ectomycorrhizal associations are of major ecological importance in temperate and boreal forests. The development of a functional ectomycorrhiza requires many genetic and biochemical changes. In this study, suppressive subtraction hybridization was used to identify differentially expressed genes in the roots of maritime pine (Pinus pinaster Aiton) inoculated with Laccaria bicolor, a mycorrhizal fungus. A total of 200 unigenes were identified as being differentially regulated in maritime pine roots during the development of mycorrhiza. These unigenes were classified into 10 categories according to the function of their homologues in the GenBank database. Approximately 40 % of the differentially expressed transcripts were genes that coded for unknown proteins in the databases or that had no homology to known genes. A group of these differentially expressed genes was selected to validate the results using quantitative real-time PCR. The transcript levels of the representative genes were compared between the non-inoculated and inoculated plants at 1, 5, 15 and 30 days after inoculation. The observed expression patterns indicate (1) changes in the composition of the cell wall, (2) tight regulation of defence genes during the development of mycorrhiza and (3) changes in carbon and nitrogen metabolism. Ammonium excess or deficiency dramatically affected the stability of ectomycorrhiza and altered gene expression in maritime pine roots.
Serum depletion induces changes in protein expression in the trophoblast-derived cell line HTR-8/SVneo.

PubMed

Novoa-Herran, Susana; Umaña-Perez, Adriana; Canals, Francesc; Sanchez-Gomez, Myriam

2016-01-01

How nutrition and growth factor restriction due to serum depletion affect trophoblast function remains poorly understood. We performed a proteomic differential study of the effects of serum depletion on a first trimester human immortalized trophoblast cell line. The viability of HTR-8/SVneo trophoblast cells in culture with 0, 0.5 and 10 % fetal bovine serum (FBS) was assayed via MTT at 24, 48 and 64 h. A comparative proteomic analysis of the cells grown with those FBS levels for 24 h was performed using two-dimensional electrophoresis (2DE), followed by mass spectrometry for protein spot identification, and a database search and bioinformatics analysis of the expressed proteins. Differential spots were identified using the Kolmogorov-Smirnov test (n = 3, significance level 0.10, D > 0.642) and/or ANOVA (n = 3, p < 0.05). The results showed that low serum doses or serum depletion differentially affect cell growth and protein expression. Differential expression was seen in 25 % of the protein spots from cells grown with 0.5 % FBS and in 84 % of those from cells grown with 0 % FBS, using 10 % serum as the physiological control. In 0.5 % FBS, this difference was related to biological processes typically affected by the serum, such as cell cycle, regulation of apoptosis and proliferation. In addition to these changes, in the serum-depleted proteome we observed downregulation of keratin 8, and upregulation of vimentin, the glycolytic enzymes enolase and pyruvate kinase (PKM2) and tumor progression-related inosine-5'-monophosphate dehydrogenase 2 (IMPDH2) enzyme. The proteins regulated by total serum depletion, but not affected by growth in 0.5 % serum, are members of the glycolytic and nucleotide metabolic pathways and the epithelial-to-mesenchymal transition (EMT), suggesting an adaptive switch characteristic of malignant cells. 
This comparative proteomic analysis and the identified proteins are the first evidence of a protein expression response to serum depletion in a trophoblast cell model. Our results show that serum depletion induces specific changes in protein expression concordant with main cell metabolic adaptations and EMT, resembling the progression to a malignant phenotype.
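The Kolmogorov-Smirnov criterion quoted in this abstract (n = 3, D > 0.642) can be made concrete with a small sketch. It assumes a two-sample comparison with three replicate spot intensities per condition, which the abstract does not spell out; the function name and the intensity values in the usage note are hypothetical.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D: the largest absolute
    difference between the two empirical cumulative distribution functions."""
    points = sorted(set(sample_a) | set(sample_b))

    def ecdf(sample, x):
        # Fraction of values in the sample that are <= x.
        return sum(v <= x for v in sample) / len(sample)

    return max(abs(ecdf(sample_a, x) - ecdf(sample_b, x)) for x in points)


def is_differential(sample_a, sample_b, threshold=0.642):
    """Apply the D > 0.642 cut-off quoted in the abstract."""
    return ks_statistic(sample_a, sample_b) > threshold
```

Under this assumption, D can only take the values 0, 1/3, 2/3 or 1 when each group has three replicates, so the 0.642 cut-off amounts to requiring D of at least 2/3. For instance, `is_differential([1.0, 2.0, 6.0], [3.0, 4.0, 5.0])` gives D = 2/3 and passes, while two groups that interleave more closely do not.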
Identification of new stress-induced microRNA and their targets in wheat using computational approach.

PubMed

Pandey, Bharati; Gupta, Om Prakash; Pandey, Dev Mani; Sharma, Indu; Sharma, Pradeep

2013-05-01

MicroRNAs (miRNAs) are a class of short endogenous non-coding small RNA molecules of about 18-22 nucleotides in length. Their main function is to downregulate gene expression through mechanisms such as translational repression, mRNA cleavage and epigenetic modification. Computational predictions using an EST-based approach have significantly raised the number of known miRNAs in wheat. Hence, a combinatorial approach that combines bioinformatics software and Perl scripts was used to identify new miRNAs to add to the growing database of wheat miRNAs. Identification of miRNAs was initiated by mining the EST (Expressed Sequence Tags) database available at the National Center for Biotechnology Information. In this investigation, 4677 mature microRNA sequences belonging to 50 miRNA families from different plant species were used to predict miRNAs in wheat. A total of five new abiotic stress-responsive miRNAs were predicted and named Ta-miR5653, Ta-miR855, Ta-miR819k, Ta-miR3708 and Ta-miR5156. In addition, four previously identified miRNAs, i.e., Ta-miR1122, miR1117, Ta-miR1134 and Ta-miR1133, were predicted in newly identified EST sequences, and 14 potential target genes were subsequently predicted, most of which seem to encode ubiquitin carrier protein, serine/threonine protein kinase, 40S ribosomal protein, F-box/kelch-repeat protein, BTB/POZ domain-containing protein and transcription factors, which are involved in growth, development, metabolism and stress response. Our results have increased the number of miRNAs in wheat, which should be useful for further investigation into the biological functions and evolution of miRNAs in wheat and other plant species.
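The homology step of an EST-based miRNA search can be illustrated with a simple mismatch-tolerant scan. This is a simplified stand-in rather than the authors' pipeline, whose software and parameters the abstract does not name: the function names and the two-mismatch default are assumptions, the sequences in the usage note are made up, and only the forward strand is scanned.

```python
def to_dna(sequence):
    """Upper-case a sequence and write RNA uracil as DNA thymine."""
    return sequence.upper().replace("U", "T")


def find_mirna_hits(mirna, est, max_mismatches=2):
    """Slide the mature miRNA along the EST and report every window whose
    Hamming distance to the miRNA is within max_mismatches.

    Returns a list of (start_position, mismatch_count) tuples."""
    query, target = to_dna(mirna), to_dna(est)
    hits = []
    for start in range(len(target) - len(query) + 1):
        window = target[start:start + len(query)]
        mismatches = sum(a != b for a, b in zip(query, window))
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits
```

As a toy usage example, `find_mirna_hits("UGGA", "AATGGACC", max_mismatches=0)` reports a single exact hit at position 2. A realistic screen would also check the reverse complement and whether the surrounding sequence can fold into a hairpin precursor, which this sketch leaves out.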
Alternative promoter usage generates novel shorter MAPT mRNA transcripts in Alzheimer's disease and progressive supranuclear palsy brains.

PubMed

Huin, Vincent; Buée, Luc; Behal, Hélène; Labreuche, Julien; Sablonnière, Bernard; Dhaenens, Claire-Marie

2017-10-03

Alternative promoter usage is an important mechanism for transcriptome diversity and the regulation of gene expression. Indeed, this alternative usage may influence tissue/subcellular specificity, protein translation and function of the proteins. The existence of an alternative promoter for MAPT gene was considered for a long time to explain differential tissue specificity and differential response to transcription and growth factors between mRNA transcripts. The alternative promoter usage could explain partly the different tau proteins expression patterns observed in tauopathies. Here, we report on our discovery of a functional alternative promoter for MAPT, located upstream of the gene's second exon (exon 1). By analyzing genome databases and brain tissue from control individuals and patients with Alzheimer's disease or progressive supranuclear palsy, we identified novel shorter transcripts derived from this alternative promoter. These transcripts are increased in patients' brain tissue as assessed by 5'RACE-PCR and qPCR. We suggest that these new MAPT isoforms can be translated into normal or amino-terminal-truncated tau proteins. We further suggest that activation of MAPT's alternative promoter under pathological conditions leads to the production of truncated proteins, changes in protein localization and function, and thus neurodegeneration.
The Human Cell Surfaceome of Breast Tumors

PubMed Central

da Cunha, Júlia Pinheiro Chagas; Galante, Pedro Alexandre Favoretto; de Souza, Jorge Estefano Santana; Pieprzyk, Martin; Carraro, Dirce Maria; Old, Lloyd J.; Camargo, Anamaria Aranha; de Souza, Sandro José

2013-01-01

Introduction. Cell surface proteins are ideal targets for cancer therapy and diagnosis. We have identified a set of more than 3700 genes that code for transmembrane proteins believed to be at the human cell surface. Methods. We used a high-throughput qPCR system for the analysis of 573 cell surface protein-coding genes in 12 primary breast tumors, 8 breast cell lines, and 21 normal human tissues including breast. To better understand the role of these genes in breast tumors, we used a series of bioinformatics strategies to integrate different types of datasets, such as KEGG, protein-protein interaction databases, ONCOMINE, and data from the literature. Results. We found that at least 77 genes are overexpressed in primary breast tumors, while at least 2 of them also have a restricted expression pattern in normal tissues. We found common signaling pathways that may be regulated in breast tumors through the overexpression of these cell surface protein-coding genes. Furthermore, a comparison was made between the genes found in this report and other genes associated with features clinically relevant for breast tumorigenesis. Conclusions. The expression profiling generated in this study, together with an integrative bioinformatics analysis, allowed us to identify putative targets for breast tumors. PMID:24195083
MIPS: analysis and annotation of proteins from whole genomes.

PubMed

Mewes, H W; Amid, C; Arnold, R; Frishman, D; Güldener, U; Mannhaupt, G; Münsterkötter, M; Pagel, P; Strack, N; Stümpflen, V; Warfsmann, J; Ruepp, A

2004-01-01

The Munich Information Center for Protein Sequences (MIPS-GSF), Neuherberg, Germany, provides protein sequence-related information based on whole-genome analysis. The main focus of the work is directed toward the systematic organization of sequence-related attributes as gathered by a variety of algorithms, primary information from experimental data together with information compiled from the scientific literature. MIPS maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the database of complete cDNAs (German Human Genome Project, NGFN), the database of mammalian protein-protein interactions (MPPI), the database of FASTA homologies (SIMAP), and the interface for the fast retrieval of protein-associated information (QUIPOS). The Arabidopsis thaliana database, the rice database, the plant EST databases (MATDB, MOsDB, SPUTNIK), as well as the databases for the comprehensive set of genomes (PEDANT genomes) are described elsewhere in the 2003 and 2004 NAR database issues, respectively. All databases described, and the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de).
Proteomic Analysis of Matched Formalin-Fixed, Paraffin-Embedded Specimens in Patients with Advanced Serous Ovarian Carcinoma

PubMed Central

Smith, Ashlee L.; Sun, Mai; Bhargava, Rohit; Stewart, Nicolas A.; Flint, Melanie S.; Bigbee, William L.; Krivak, Thomas C.; Strange, Mary A.; Cooper, Kristine L.; Zorn, Kristin K.

2013-01-01

Objective: The biology of high grade serous ovarian carcinoma (HGSOC) is poorly understood. Little has been reported on intratumoral homogeneity or heterogeneity of primary HGSOC tumors and their metastases. We evaluated the global protein expression profiles of paired primary and metastatic HGSOC from formalin-fixed, paraffin-embedded (FFPE) tissue samples. Methods: After IRB approval, six patients with advanced HGSOC were identified with tumor in both ovaries at initial surgery. Laser capture microdissection (LCM) was used to extract tumor for protein digestion. Peptides were extracted and analyzed by reversed-phase liquid chromatography coupled to a linear ion trap mass spectrometer. Tandem mass spectra were searched against the UniProt human protein database. Differences in protein abundance between samples were assessed and analyzed by Ingenuity Pathway Analysis software. Immunohistochemistry (IHC) for select proteins from the original and an additional validation set of five patients was performed. Results: Unsupervised clustering of the abundance profiles placed the paired specimens adjacent to each other. IHC H-score analysis of the validation set revealed a strong correlation between paired samples for all proteins. For the similarly expressed proteins, the estimated correlation coefficients in two of three experimental samples and all validation samples were statistically significant (p < 0.05). The estimated correlation coefficients in the experimental sample proteins classified as differentially expressed were not statistically significant. Conclusion: A global proteomic screen of primary HGSOC tumors and their metastatic lesions identifies tumoral homogeneity and heterogeneity and provides preliminary insight into these protein profiles and the cellular pathways they constitute. PMID:28250404
IntegromeDB: an integrated system and biological search engine.

PubMed

Baitaluk, Michael; Kozhenkov, Sergey; Dubinina, Yulia; Ponomarenko, Julia

2012-01-19

With the growth of biological data in volume and heterogeneity, web search engines become key tools for researchers. However, general-purpose search engines are not specialized for the search of biological data. Here, we present an approach to developing a biological web search engine based on Semantic Web technologies and demonstrate its implementation for retrieving gene- and protein-centered knowledge. The engine is available at http://www.integromedb.org. The IntegromeDB search engine allows scanning data on gene regulation, gene expression, protein-protein interactions, pathways, metagenomics, mutations, diseases, and other gene- and protein-related data that are automatically retrieved from publicly available databases and web pages using biological ontologies. To perfect the resource design and usability, we welcome and encourage community feedback.
Identification of membrane proteome of Paracoccidioides lutzii and its regulation by zinc

PubMed Central

de Curcio, Juliana Santana; Silva, Marielle Garcia; Silva Bailão, Mirelle Garcia; Báo, Sônia Nair; Casaletti, Luciana; Bailão, Alexandre Mello; de Almeida Soares, Célia Maria

2017-01-01

Aim: During infection development in the host, Paracoccidioides spp. face the deprivation of micronutrients, a mechanism called nutritional immunity. This condition induces the remodeling of proteins present in different metabolic pathways. Therefore, we attempted to identify membrane proteins and their regulation by zinc in Paracoccidioides lutzii. Materials & methods: Membrane-enriched fractions of P. lutzii yeast cells were isolated and purified, and their proteins were identified by 2D LC–MS/MS detection and database search. Results & conclusion: Zinc deprivation suppressed the expression of membrane proteins such as glycoproteins, those involved in cell wall synthesis and those related to oxidative phosphorylation. This is the first study describing membrane proteins and the effect of zinc deficiency on their regulation in a member of the genus Paracoccidioides. PMID:29134119
Identification of Proteins Modulated in the Date Palm Stem Infested with Red Palm Weevil (Rhynchophorus ferrugineus Oliv.) Using Two Dimensional Differential Gel Electrophoresis and Mass Spectrometry

PubMed Central

Rasool, Khawaja Ghulam; Khan, Muhammad Altaf; Aldawood, Abdulrahman Saad; Tufail, Muhammad; Mukhtar, Muhammad; Takeda, Makio

2015-01-01

A state-of-the-art proteomic methodology using Matrix Assisted Laser Desorption/Ionization-Time of Flight (MALDI-TOF) has been employed to characterize peptides modulated in the date palm stem subsequent to infestation with red palm weevil (RPW). Our analyses revealed 32 differentially expressed peptides associated with RPW infestation in date palm stem. To identify RPW infestation associated peptides (I), artificially wounded plants (W) were used as an additional control besides uninfested plants, a conventional control (C). A consistent, unique pattern of differential expression in infested (I) and wounded (W) stem samples compared to control (C) was observed. The upregulated proteins showed relative fold intensity in the order I > W, whereas the downregulated spots followed the trend W > I. This study also reveals that artificial wounding of the date palm stem affects almost the same proteins as infestation; however, the relative intensity is considerably lower than in infested samples for both up- and downregulated spots. All 32 differentially expressed spots were subjected to MALDI-TOF analysis for their identification, and we were able to match 21 proteins in existing databases. The significantly modulated expression pattern of a number of peptides in infested plants suggests the possibility of developing a quick and reliable molecular methodology for detecting date palm plants infested with RPW. PMID:26287180
Identification of Proteins Modulated in the Date Palm Stem Infested with Red Palm Weevil (Rhynchophorus ferrugineus Oliv.) Using Two Dimensional Differential Gel Electrophoresis and Mass Spectrometry.

PubMed

Rasool, Khawaja Ghulam; Khan, Muhammad Altaf; Aldawood, Abdulrahman Saad; Tufail, Muhammad; Mukhtar, Muhammad; Takeda, Makio

2015-08-17

A state-of-the-art proteomic methodology using Matrix Assisted Laser Desorption/Ionization-Time of Flight (MALDI-TOF) has been employed to characterize peptides modulated in the date palm stem subsequent to infestation with red palm weevil (RPW). Our analyses revealed 32 differentially expressed peptides associated with RPW infestation in date palm stem. To identify RPW infestation-associated peptides (I), artificially wounded plants (W) were used as an additional control besides uninfested plants, the conventional control (C). A constant, unique pattern of differential expression in infested (I) and wounded (W) stem samples compared to control (C) was observed. The upregulated proteins showed relative fold intensity in the order I > W, whereas downregulated spots followed the trend W > I. This study also reveals that artificial wounding of date palm stem affects almost the same proteins as infestation; however, the relative intensity is considerably lower than in infested samples for both up- and downregulated spots. All 32 differentially expressed spots were subjected to MALDI-TOF analysis for identification, and we were able to match 21 proteins in existing databases. The significantly modulated expression pattern of a number of peptides in infested plants suggests the possibility of developing a quick and reliable molecular methodology for detecting date palm plants infested with RPW.
Protein interaction networks from literature mining

NASA Astrophysics Data System (ADS)

Ihara, Sigeo

2005-03-01

The ability to accurately predict and understand physiological changes in the biological network system in response to disease or drug therapeutics is of crucial importance in life science. The extensive amount of gene expression data generated from even a single microarray experiment is often difficult to interpret fully in terms of its biological significance. An increasing knowledge of protein interactions stored in the PubMed database, as well as the advancement of natural language processing, however, makes it possible to construct, from gene expression information, the protein interaction networks that are essential for understanding biological meaning. Using the in-house literature mining system we have developed, the protein interaction network for humans was constructed. By analysis based on the graph-theoretical characterization of the total interaction network in the literature, we found that the network is scale-free and that semantic long-ranged interactions (i.e. inhibit, induce) between proteins dominate in the total interaction network, reducing the degree exponent. Interaction networks generated from scientific text in which the interaction event is ambiguously described result in disconnected networks. In contrast, interaction networks based on text in which the interaction events are clearly stated result in strongly connected networks. The results of protein-protein interaction networks obtained in real applications from microarray experiments are discussed: for example, comparisons of the gene expression data indicative of either a good or a poor prognosis for acute lymphoblastic leukemia with MLL rearrangements, using our system, showed newly discovered signaling cross-talk.
Novel signatures of cancer-associated fibroblasts.

PubMed

Bozóky, Benedek; Savchenko, Andrii; Csermely, Péter; Korcsmáros, Tamás; Dúl, Zoltán; Pontén, Fredrik; Székely, László; Klein, George

2013-07-15

Increasing evidence indicates the importance of the tumor microenvironment, in particular cancer-associated fibroblasts, in cancer development and progression. In our study, we developed a novel, visually based method to identify new immunohistochemical signatures of these fibroblasts. The method employed a protein list based on 759 protein products of genes identified by RNA profiling in our previous study, which compared fibroblasts with differential growth-modulating effects on human cancer cells, together with their first neighbors in the human protein interactome. These 2,654 proteins were analyzed in the Human Protein Atlas online database by comparing their immunohistochemical expression patterns in normal versus tumor-associated fibroblasts. Twelve new proteins differentially expressed in cancer-associated fibroblasts were identified (DLG1, BHLHE40, ROCK2, RAB31, AZI2, PKM2, ARHGAP31, ARHGAP26, ITCH, EGLN1, RNF19A and PLOD2); four of them can be connected to the Rho kinase signaling pathway. Further analysis in several additional tumor stromata revealed that the majority showed congruence among the different tumors. Many of them were also positive in normal myofibroblast-like cells. The new signatures can be useful in immunohistochemical analysis of different tumor stromata and may also give us an insight into the pathways activated in them in their true in vivo context. The method itself could be used for other similar analyses to identify proteins expressed in other cell types in tumors and their surrounding microenvironment. Copyright © 2013 UICC.
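The expansion of a seed protein list by its first neighbors in an interactome, as described in the abstract above, can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name, the edge-list input format and the toy protein names are hypothetical and are not taken from the study, which does not describe its implementation.

```python
from collections import defaultdict


def expand_with_first_neighbors(seeds, interactions):
    """Return the seed proteins plus every protein that directly interacts with one.

    `interactions` is assumed to be an iterable of undirected (a, b) pairs.
    """
    adjacency = defaultdict(set)
    for a, b in interactions:
        adjacency[a].add(b)
        adjacency[b].add(a)

    expanded = set(seeds)
    for protein in seeds:
        # .get avoids creating empty entries for seeds with no known partners
        expanded |= adjacency.get(protein, set())
    return expanded


# Hypothetical toy interactome; none of these names refer to real proteins.
toy_interactions = [
    ("SEED_A", "NBR_1"),
    ("NBR_1", "FAR_1"),   # FAR_1 is a second neighbor and should be excluded
    ("SEED_B", "NBR_2"),
    ("OTHER_1", "OTHER_2"),
]
toy_seeds = {"SEED_A", "SEED_B"}
candidates = expand_with_first_neighbors(toy_seeds, toy_interactions)
print(sorted(candidates))
```

Only direct interaction partners are added, so the candidate list grows by one network step and no further.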
A human protein atlas for normal and cancer tissues based on antibody proteomics.

PubMed

Uhlén, Mathias; Björling, Erik; Agaton, Charlotta; Szigyarto, Cristina Al-Khalili; Amini, Bahram; Andersen, Elisabet; Andersson, Ann-Catrin; Angelidou, Pia; Asplund, Anna; Asplund, Caroline; Berglund, Lisa; Bergström, Kristina; Brumer, Harry; Cerjan, Dijana; Ekström, Marica; Elobeid, Adila; Eriksson, Cecilia; Fagerberg, Linn; Falk, Ronny; Fall, Jenny; Forsberg, Mattias; Björklund, Marcus Gry; Gumbel, Kristoffer; Halimi, Asif; Hallin, Inga; Hamsten, Carl; Hansson, Marianne; Hedhammar, My; Hercules, Görel; Kampf, Caroline; Larsson, Karin; Lindskog, Mats; Lodewyckx, Wald; Lund, Jan; Lundeberg, Joakim; Magnusson, Kristina; Malm, Erik; Nilsson, Peter; Odling, Jenny; Oksvold, Per; Olsson, Ingmarie; Oster, Emma; Ottosson, Jenny; Paavilainen, Linda; Persson, Anja; Rimini, Rebecca; Rockberg, Johan; Runeson, Marcus; Sivertsson, Asa; Sköllermo, Anna; Steen, Johanna; Stenvall, Maria; Sterky, Fredrik; Strömberg, Sara; Sundberg, Mårten; Tegel, Hanna; Tourle, Samuel; Wahlund, Eva; Waldén, Annelie; Wan, Jinghong; Wernérus, Henrik; Westberg, Joakim; Wester, Kenneth; Wrethagen, Ulla; Xu, Lan Lan; Hober, Sophia; Pontén, Fredrik

2005-12-01

Antibody-based proteomics provides a powerful approach for the functional study of the human proteome involving the systematic generation of protein-specific affinity reagents. We used this strategy to construct a comprehensive, antibody-based protein atlas for expression and localization profiles in 48 normal human tissues and 20 different cancers. Here we report a new publicly available database containing, in the first version, approximately 400,000 high resolution images corresponding to more than 700 antibodies toward human proteins. Each image has been annotated by a certified pathologist to provide a knowledge base for functional studies and to allow queries about protein profiles in normal and disease tissues. Our results suggest it should be possible to extend this analysis to the majority of all human proteins thus providing a valuable tool for medical and biological research.
Proteome analysis provides insight into the regulation of bioactive metabolites in Hericium erinaceus.

PubMed

Zeng, Xu; Ling, Hong; Yang, Jianwen; Chen, Juan; Guo, Shunxing

2018-05-05

Hericium erinaceus, a famous edible mushroom, is also a well-known traditional medicinal fungus. To date, a large number of bioactive metabolites with antitumor, antibacterial, and immune-boosting effects have been isolated from the free-living mycelium and fruiting body of H. erinaceus. Here we used a proteomic approach to explore proteins involved in the regulation of bioactive metabolites, including terpenoids, polyketides, sterols and others. RESULTS: Using mass spectrometry, a total of 2543 unique proteins were identified using the H. erinaceus genome, of which 2449, 1855, 1533 and 690 proteins were successfully annotated in the Nr, KOG, KEGG and GO databases, respectively. Among them, 722 proteins were differentially expressed (528 up- and 194 down-regulated) in fruiting body compared with mycelium. Most of the differentially expressed proteins were putatively involved in energy metabolism, molecular signaling, and secondary metabolism. Additionally, numerous proteins involved in terpenoid, polyketide, and sterol biosynthesis were identified. Our data revealed that proteins involved in polyketide biosynthesis were up-regulated in the fruiting body, while some proteins in the mevalonate (MEP) pathway of terpenoid biosynthesis were generally up-regulated in mycelium. The present study suggested that the differential regulation of biosynthesis genes could produce various bioactive metabolites with pharmacological effects in H. erinaceus. Copyright © 2017. Published by Elsevier B.V.
NCBI GEO: archive for functional genomics data sets—10 years on

PubMed Central

Barrett, Tanya; Troup, Dennis B.; Wilhite, Stephen E.; Ledoux, Pierre; Evangelista, Carlos; Kim, Irene F.; Tomashevsky, Maxim; Marshall, Kimberly A.; Phillippy, Katherine H.; Sherman, Patti M.; Muertter, Rolf N.; Holko, Michelle; Ayanbule, Oluwabukunmi; Yefanov, Andrey; Soboleva, Alexandra

2011-01-01

A decade ago, the Gene Expression Omnibus (GEO) database was established at the National Center for Biotechnology Information (NCBI). The original objective of GEO was to serve as a public repository for high-throughput gene expression data generated mostly by microarray technology. However, the research community quickly applied microarrays to non-gene-expression studies, including examination of genome copy number variation and genome-wide profiling of DNA-binding proteins. Because the GEO database was designed with a flexible structure, it was possible to quickly adapt the repository to store these data types. More recently, as the microarray community switches to next-generation sequencing technologies, GEO has again adapted to host these data sets. Today, GEO stores over 20 000 microarray- and sequence-based functional genomics studies, and continues to handle the majority of direct high-throughput data submissions from the research community. Multiple mechanisms are provided to help users effectively search, browse, download and visualize the data at the level of individual genes or entire studies. This paper describes recent database enhancements, including new search and data representation tools, as well as a brief review of how the community uses GEO data. GEO is freely accessible at http://www.ncbi.nlm.nih.gov/geo/. PMID:21097893
The Multinational Arabidopsis Steering Subcommittee for Proteomics Assembles the Largest Proteome Database Resource for Plant Systems Biology

DOE Office of Scientific and Technical Information (OSTI.GOV)

Weckwerth, Wolfram; Baginsky, Sacha; Van Wijk, Klass

2009-12-01

In the past 10 years, we have witnessed remarkable advances in the field of plant molecular biology. The rapid development of proteomic technologies and the speed with which these techniques have been applied to the field have altered our perception of how we can analyze proteins in complex systems. At nearly the same time, the availability of the complete genome for the model plant Arabidopsis thaliana was released; this effort provides an unsurpassed resource for the identification of proteins when researchers use MS to analyze plant samples. Recognizing the growth in this area, the Multinational Arabidopsis Steering Committee (MASC) established a subcommittee for A. thaliana proteomics in 2006 with the objective of consolidating databases, technique standards, and experimentally validated candidate genes and functions. Since the establishment of the Multinational Arabidopsis Steering Subcommittee for Proteomics (MASCP), many new approaches and resources have become available. Recently, the subcommittee established a webpage to consolidate this information (www.masc-proteomics.org). It includes links to plant proteomic databases, general information about proteomic techniques, meeting information, a summary of proteomic standards, and other relevant resources. Altogether, this website provides a useful resource for the Arabidopsis proteomics community. In the future, the website will host discussions and investigate the cross-linking of databases. The subcommittee members have extensive experience in Arabidopsis proteomics and collectively have produced some of the most extensive proteomics data sets for this model plant (Table S1 in the Supporting Information has a list of resources). The largest collection of proteomics data from a single study in A. thaliana was assembled into an accessible database (AtProteome; http://fgcz-atproteome.unizh.ch/index.php) and was recently published by the Baginsky lab.1 The database provides links to major Arabidopsis online resources, and raw data have been deposited in PRIDE and PRIDE BioMart. Included in this database is an Arabidopsis proteome map that provides evidence for the expression of approximately 50% of all predicted gene models, including several alternative gene models that are not represented in The Arabidopsis Information Resource (TAIR) protein database. A set of organ-specific biomarkers is provided, as well as organ-specific proteotypic peptides for 4105 proteins that can be used to facilitate targeted quantitative proteomic surveys. In the future, the AtProteome database will be linked to additional existing resources developed by MASCP members, such as PPDB, ProMEX, and SUBA. The most comprehensive study on the Arabidopsis chloroplast proteome, which includes information on chloroplast sorting signals, posttranslational modifications (PTMs), and protein abundances (analyzed by high-accuracy MS [Orbitrap]), was recently published by the van Wijk lab.2 These and previous data are available via the plant proteome database (PPDB; http://ppdb.tc.cornell.edu) for A. thaliana and maize. PPDB provides genome-wide experimental and functional characterization of the A. thaliana and maize proteomes, including PTMs and subcellular localization information, with an emphasis on leaf and plastid proteins. Maize and Arabidopsis proteome entries are directly linked via internal BLAST alignments within PPDB. Direct links for each protein to TAIR, SUBA, ProMEX, and other resources are also provided.
Identification of Genes Whose Expression Profile Is Associated with Non-Progression towards AIDS Using eQTLs

PubMed Central

Le Clerc, Sigrid; van Manen, Daniëlle; Coulonges, Cédric; Ulveling, Damien; Laville, Vincent; Labib, Taoufik; Taing, Lieng; Delaneau, Olivier; Montes, Matthieu; Schuitemaker, Hanneke; Zagury, Jean-François

2015-01-01

Background Many genome-wide association studies have been performed on progression towards the acquired immune deficiency syndrome (AIDS) and they mainly identified associations within the HLA loci. In this study, we demonstrate that the integration of biological information, namely gene expression data, can enhance the sensitivity of genetic studies to unravel new genetic associations relevant to AIDS. Methods We collated the biological information compiled from three databases of expression quantitative trait loci (eQTLs) involved in cells of the immune system. We derived a list of single nucleotide polymorphisms (SNPs) that are functional in that they correlate with differential expression of genes in at least two of the databases. We tested the association of those SNPs with AIDS progression in two cohorts, GRIV and ACS. Tests on permuted phenotypes of the GRIV and ACS cohorts or on randomised sets of equivalent SNPs allowed us to assess the statistical robustness of this method and to estimate the true positive rate. Results Eight genes were identified with high confidence (p = 0.001, rate of true positives 75%). Some of those genes had previously been linked with HIV infection. Notably, ENTPD4 belongs to the same family as CD39, whose expression has already been associated with AIDS progression; while DNAJB12 is part of the HSP90 pathway, which is involved in the control of HIV latency. Our study also drew our attention to lesser-known functions such as mitochondrial ribosomal proteins and a zinc finger protein, ZFP57, which could be central to the effectiveness of HIV infection. Interestingly, for six out of those eight genes, down-regulation is associated with non-progression, which makes them appealing targets to develop drugs against HIV. PMID:26367535
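The permutation testing mentioned in the abstract above (tests on permuted phenotypes to assess statistical robustness) can be illustrated with a generic label-permutation test. This is a minimal sketch under assumptions: the abstract does not state which test statistic was used, so the absolute difference in mean allele count between groups is a stand-in, and the genotype and phenotype vectors are invented toy data.

```python
import random


def permutation_p_value(genotypes, phenotypes, n_permutations=2000, seed=0):
    """Empirical p-value for a SNP/phenotype association by shuffling phenotype labels.

    genotypes:  allele counts per individual (0, 1 or 2)
    phenotypes: 1 for one group (e.g. non-progressors), 0 for the other
    """
    rng = random.Random(seed)

    def statistic(labels):
        group1 = [g for g, y in zip(genotypes, labels) if y == 1]
        group0 = [g for g, y in zip(genotypes, labels) if y == 0]
        return abs(sum(group1) / len(group1) - sum(group0) / len(group0))

    observed = statistic(phenotypes)
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # breaks any genotype/phenotype link
        if statistic(shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero
    return (hits + 1) / (n_permutations + 1)


# Invented toy data: the first eight individuals carry more copies of the allele.
toy_genotypes = [2, 2, 2, 1, 2, 2, 1, 2, 0, 0, 1, 0, 0, 0, 1, 0]
toy_phenotypes = [1] * 8 + [0] * 8
p_value = permutation_p_value(toy_genotypes, toy_phenotypes)
print(p_value)
```

Shuffling the labels keeps group sizes fixed, so the null distribution reflects only the loss of association and not a change in cohort composition.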
Molecular phenotype of zebrafish ovarian follicle by serial analysis of gene expression and proteomic profiling, and comparison with the transcriptomes of other animals

PubMed Central

Knoll-Gellida, Anja; André, Michèle; Gattegno, Tamar; Forgue, Jean; Admon, Arie; Babin, Patrick J

2006-01-01

Background The ability of an oocyte to develop into a viable embryo depends on the accumulation of specific maternal information and molecules, such as RNAs and proteins. A serial analysis of gene expression (SAGE) was carried out in parallel with proteomic analysis on fully-grown ovarian follicles from zebrafish (Danio rerio). The data obtained were compared with ovary/follicle/egg molecular phenotypes of other animals, published or available in public sequence databases. Results Sequencing of 27,486 SAGE tags identified 11,399 different ones, including 3,329 tags with an occurrence superior to one. Fifty-eight genes were expressed at over 0.15% of the total population and represented 17.34% of the mRNA population identified. The three most expressed transcripts were a rhamnose-binding lectin, beta-actin 2, and a transcribed locus similar to the H2B histone family. Comparison with the large-scale expressed sequence tags sequencing approach revealed highly expressed transcripts that were not previously known to be expressed at high levels in fish ovaries, like the short-sized polarized metallothionein 2 transcript. A higher sensitivity for the detection of transcripts with a characterized maternal genetic contribution was also demonstrated compared to large-scale sequencing of cDNA libraries. Ferritin heavy polypeptide 1, heat shock protein 90-beta, lactate dehydrogenase B4, beta-actin isoforms, tubulin beta 2, ATP synthase subunit 9, together with 40 S ribosomal protein S27a, were common highly-expressed transcripts of vertebrate ovary/unfertilized egg. Comparison of transcriptome and proteome data revealed that transcript levels provide little predictive value with respect to the extent of protein abundance. All the proteins identified by proteomic analysis of fully-grown zebrafish follicles had at least one transcript counterpart, with two exceptions: eosinophil chemotactic cytokine and nothepsin. 
Conclusion This study provides a complete sequence data set of maternal mRNA stored in zebrafish germ cells at the end of oogenesis. This catalogue contains highly-expressed transcripts that are part of a vertebrate ovarian expressed gene signature. Comparison of transcriptome and proteome data identified downregulated transcripts or proteins potentially incorporated in the oocyte by endocytosis. The molecular phenotype described provides groundwork for future experimental approaches aimed at identifying functionally important stored maternal transcripts and proteins involved in oogenesis and early stages of embryo development. PMID:16526958
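The abundance threshold quoted in the abstract above (transcripts expressed at over 0.15% of 27,486 sequenced SAGE tags) can be made concrete with a short worked example. This is a sketch under assumptions: the helper name and the toy tag sequences are hypothetical, and the real analysis mapped tags to genes, a step that is not reproduced here.

```python
import math
from collections import Counter


def abundant_tags(tags, min_fraction):
    """Return {tag: fraction} for tags whose share of all tags exceeds min_fraction."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items() if n / total > min_fraction}


# Figures taken from the abstract: 27,486 tags, 0.15% cut-off.
total_tags = 27486
min_count = math.floor(0.0015 * total_tags) + 1  # smallest count strictly above 0.15%
print(min_count)

# Hypothetical toy library with a much higher cut-off so the effect is visible.
toy_tags = ["AAAA"] * 6 + ["CCCC"] * 3 + ["GGGG"]
toy_abundant = abundant_tags(toy_tags, min_fraction=0.25)
print(sorted(toy_abundant))
```

With these numbers a tag has to be observed at least 42 times to exceed the 0.15% level.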

Differential protein-coding gene and long noncoding RNA expression in smoking-related lung squamous cell carcinoma.

PubMed

Li, Shicheng; Sun, Xiao; Miao, Shuncheng; Liu, Jia; Jiao, Wenjie

2017-11-01

Cigarette smoking is one of the greatest preventable risk factors for developing cancer, and most cases of lung squamous cell carcinoma (lung SCC) are associated with smoking. The pathogenesis mechanism of tumor progress is unclear. This study aimed to identify biomarkers in smoking-related lung cancer, including protein-coding gene, long noncoding RNA, and transcription factors. We selected and obtained messenger RNA microarray datasets and clinical data from the Gene Expression Omnibus database to identify gene expression altered by cigarette smoking. Integrated bioinformatic analysis was used to clarify biological functions of the identified genes, including Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway, the construction of a protein-protein interaction network, transcription factor, and statistical analyses. Subsequent quantitative real-time PCR was utilized to verify these bioinformatic analyses. Five hundred and ninety-eight differentially expressed genes and 21 long noncoding RNA were identified in smoking-related lung SCC. GO and KEGG pathway analysis showed that identified genes were enriched in the cancer-related functions and pathways. The protein-protein interaction network revealed seven hub genes identified in lung SCC. Several transcription factors and their binding sites were predicted. The results of real-time quantitative PCR revealed that AURKA and BIRC5 were significantly upregulated and LINC00094 was downregulated in the tumor tissues of smoking patients. Further statistical analysis indicated that dysregulation of AURKA, BIRC5, and LINC00094 indicated poor prognosis in lung SCC. Protein-coding genes AURKA, BIRC5, and LINC00094 could be biomarkers or therapeutic targets for smoking-related lung SCC. © 2017 The Authors. Thoracic Cancer published by China Lung Oncology Group and John Wiley & Sons Australia, Ltd.
Transcriptomic Analysis and the Expression of Disease-Resistant Genes in Oryza meyeriana under Native Condition

PubMed Central

He, Bin; Tao, Xiang; Gu, Yinghong; Wei, Changhe; Cheng, Xiaojie; Xiao, Suqin; Cheng, Zaiquan; Zhang, Yizheng

2015-01-01

Oryza meyeriana (O. meyeriana), with a GG genome type (2n = 24), has accumulated many excellent characteristics with respect to resistance to many diseases such as rice shade and blast, and even immunity to bacterial blight. It is very important to know whether disease-resistance genes exist and are expressed in this wild rice under native conditions. However, limited genomic or transcriptomic data for O. meyeriana are currently available. In this study, we present the first comprehensive characterization of the O. meyeriana transcriptome using RNA-seq and obtained 185,323 contigs with an average length of 1,692 bp and an N50 of 2,391 bp. Differential expression analysis showed that roots had the most tissue-specifically expressed genes, followed by stems and leaves. By similarity search against protein databases, 146,450 contigs had at least one significant alignment to existing gene models. Comparison with the Oryza sativa (japonica-type Nipponbare and indica-type 93–11) genomes revealed that 13% of the O. meyeriana contigs had not been detected in O. sativa. Many disease-resistance genes, such as bacterial blight, blast, rust, fusarium, cyst nematode and downy mildew resistance genes, were mined from the transcriptomic database. Two kinds of rice bacterial blight-resistance genes (Xa1 and Xa26) were differentially or specifically expressed in O. meyeriana. All 4 Xa1 contigs were expressed only in root, while three Xa26 contigs had their highest expression level in leaves, two had their highest expression in stems and one was expressed predominantly in roots. The transcriptomic database of O. meyeriana has been constructed and many disease-resistance genes were found to be expressed under native conditions, which provides a foundation for future discovery of novel genes and a basis for studying the molecular mechanisms associated with disease resistance in O. meyeriana. PMID:26640944
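The abstract above reports both an average contig length and an N50, which are different summaries of an assembly. As a small illustration of the standard N50 definition (the length of the contig at which the cumulative length of contigs, sorted longest first, reaches half of the total assembly length), consider the sketch below. The contig lengths are invented toy values, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50 is undefined for an empty list of lengths")


# Invented toy assembly: total length 1500, so half is 750.
toy_contigs = [100, 200, 300, 400, 500]
toy_n50 = n50(toy_contigs)
toy_mean = sum(toy_contigs) / len(toy_contigs)
print(toy_n50, toy_mean)
```

Because long contigs contribute more sequence, N50 is typically larger than the mean contig length, which is consistent with the 2,391 bp versus 1,692 bp figures quoted in the abstract.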
Update on Genomic Databases and Resources at the National Center for Biotechnology Information.

PubMed

Tatusova, Tatiana

2016-01-01

The National Center for Biotechnology Information (NCBI), as a primary public repository of genomic sequence data, collects and maintains enormous amounts of heterogeneous data. Data for genomes, genes, gene expressions, gene variation, gene families, proteins, and protein domains are integrated with the analytical, search, and retrieval resources through the NCBI website. Entrez, a text-based search and retrieval system, provides a fast and easy way to navigate across diverse biological databases. Comparative genome analysis tools lead to further understanding of evolutionary processes, quickening the pace of discovery. Recent technological innovations have ignited an explosion in genome sequencing that has fundamentally changed our understanding of the biology of living organisms. This huge increase in DNA sequence data presents new challenges for the information management system and the visualization tools. New strategies have been designed to bring order to this genome sequence shockwave and improve the usability of associated data.
Deep mRNA Sequencing of the Tritonia diomedea Brain Transcriptome Provides Access to Gene Homologues for Neuronal Excitability, Synaptic Transmission and Peptidergic Signalling

PubMed Central

Senatore, Adriano; Edirisinghe, Neranjan; Katz, Paul S.

2015-01-01

Background The sea slug Tritonia diomedea (Mollusca, Gastropoda, Nudibranchia), has a simple and highly accessible nervous system, making it useful for studying neuronal and synaptic mechanisms underlying behavior. Although many important contributions have been made using Tritonia, until now, a lack of genetic information has impeded exploration at the molecular level. Results We performed Illumina sequencing of central nervous system mRNAs from Tritonia, generating 133.1 million 100 base pair, paired-end reads. De novo reconstruction of the RNA-Seq data yielded a total of 185,546 contigs, which partitioned into 123,154 non-redundant gene clusters (unigenes). BLAST comparison with RefSeq and Swiss-Prot protein databases, as well as mRNA data from other invertebrates (gastropod molluscs: Aplysia californica, Lymnaea stagnalis and Biomphalaria glabrata; cnidarian: Nematostella vectensis) revealed that up to 76,292 unigenes in the Tritonia transcriptome have putative homologues in other databases, 18,246 of which are below a more stringent E-value cut-off of 1 × 10⁻⁶. In silico prediction of secreted proteins from the Tritonia transcriptome shotgun assembly (TSA) produced a database of 579 unique sequences of secreted proteins, which also exhibited markedly higher expression levels compared to other genes in the TSA. Conclusions Our efforts greatly expand the availability of gene sequences available for Tritonia diomedea. We were able to extract full length protein sequences for most queried genes, including those involved in electrical excitability, synaptic vesicle release and neurotransmission, thus confirming that the transcriptome will serve as a useful tool for probing the molecular correlates of behavior in this species. We also generated a neurosecretome database that will serve as a useful tool for probing peptidergic signalling systems in the Tritonia brain. PMID:25719197
The Salivary Secretome of the Tsetse Fly Glossina pallidipes (Diptera: Glossinidae) Infected by Salivary Gland Hypertrophy Virus

PubMed Central

Kariithi, Henry M.; Ince, Ikbal A.; Boeren, Sjef; Abd-Alla, Adly M. M.; Parker, Andrew G.; Aksoy, Serap; Vlak, Just M.; van Oers, Monique M.

2011-01-01

Background The competence of the tsetse fly Glossina pallidipes (Diptera; Glossinidae) to acquire salivary gland hypertrophy virus (SGHV), to support virus replication and successfully transmit the virus depends on complex interactions between Glossina and SGHV macromolecules. Critical requisites to SGHV transmission are its replication and secretion of mature virions into the fly's salivary gland (SG) lumen. However, secretion of host proteins is of equal importance for successful transmission and requires cataloging of G. pallidipes secretome proteins from hypertrophied and non-hypertrophied SGs. Methodology/Principal Findings After electrophoretic profiling and in-gel trypsin digestion, saliva proteins were analyzed by nano-LC-MS/MS. MaxQuant/Andromeda search of the MS data against the non-redundant (nr) GenBank database and a G. morsitans morsitans SG EST database yielded a total of 521 hits, 31 of which were SGHV-encoded. With a false discovery rate limit of 1% and a detection threshold of at least 2 unique peptides per protein, the analysis resulted in 292 Glossina and 25 SGHV MS-supported proteins. When annotated by the Blast2GO suite, at least one gene ontology (GO) term could be assigned to 89.9% (285/317) of the detected proteins. Five (∼1.8%) Glossina and three (∼12%) SGHV proteins remained without a predicted function after blast searches against the nr database. Sixty-five of the 292 detected Glossina proteins contained an N-terminal signal/secretion peptide sequence. Eight of the SGHV proteins were predicted to be non-structural (NS), and fourteen are known structural (VP) proteins. Conclusions/Significance SGHV alters the protein expression pattern in Glossina. The G. pallidipes SG secretome encompasses a spectrum of proteins that may be required during the SGHV infection cycle. These detected proteins have putative interactions with at least 21 of the 25 SGHV-encoded proteins. Our findings open avenues for developing novel SGHV mitigation strategies to block SGHV infections in tsetse production facilities, such as using SGHV-specific antibodies and phage display-selected gut epithelia-binding peptides. PMID:22132244
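The acceptance criteria quoted in the abstract above (a 1% false discovery rate limit and at least 2 unique peptides per protein) and the reported annotation rate can be illustrated with a short sketch. The record type, its field names and the example hits are hypothetical; in particular, a per-protein `q_value` field is an assumption made for illustration and is not a description of how MaxQuant reports or applies its FDR control.

```python
from dataclasses import dataclass


@dataclass
class ProteinHit:
    accession: str        # hypothetical identifier
    q_value: float        # assumed per-protein FDR estimate
    unique_peptides: int  # number of unique peptides supporting the protein


def filter_hits(hits, max_q=0.01, min_unique=2):
    """Keep hits within the FDR limit that have enough unique peptides."""
    return [h for h in hits if h.q_value <= max_q and h.unique_peptides >= min_unique]


# Invented example hits; none correspond to proteins from the study.
toy_hits = [
    ProteinHit("hypothetical_1", 0.001, 5),  # passes both criteria
    ProteinHit("hypothetical_2", 0.020, 4),  # fails the FDR limit
    ProteinHit("hypothetical_3", 0.005, 1),  # too few unique peptides
    ProteinHit("hypothetical_4", 0.010, 2),  # exactly at both thresholds, kept
]
kept = [h.accession for h in filter_hits(toy_hits)]

# Arithmetic check using the counts reported in the abstract.
detected = 292 + 25                  # Glossina + SGHV proteins
annotated_share = f"{285 / detected:.1%}"
print(kept, detected, annotated_share)
```

The final line reproduces the 285/317 proportion given in the abstract as a percentage.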
Proteasome Inhibition Contributed to the Cytotoxicity of Arenobufagin after Its Binding with Na, K-ATPase in Human Cervical Carcinoma HeLa Cells

PubMed Central

Yue, Qingxi; Zhen, Hong; Huang, Ming; Zheng, Xi; Feng, Lixing; Jiang, Baohong; Yang, Min; Wu, Wanying; Liu, Xuan; Guo, Dean

2016-01-01

Although the possibility of developing cardiac steroids/cardiac glycosides as novel cancer therapeutic agents has been recognized, the mechanism of their anticancer activity is still not clear enough. Toad venom extract containing bufadienolides, which belong to cardiac steroids, has actually long been used as traditional Chinese medicine in clinic for cancer therapy in China. The cytotoxicity of arenobufagin, a bufadienolide isolated from toad venom, on human cervical carcinoma HeLa cells was checked. And, the protein expression profile of control HeLa cells and HeLa cells treated with arenobufagin for 48 h was analyzed using two-dimensional electrophoresis, respectively. Differently expressed proteins in HeLa cells treated with arenobufagin were identified and the pathways related to these proteins were mapped from KEGG database. Computational molecular docking was performed to verify the binding of arenobufagin and Na, K-ATPase. The effects of arenobufagin on Na, K-ATPase activity and proteasome activity of HeLa cells were checked. The protein-protein interaction network between Na, K-ATPase and proteasome was constructed and the expression of possible intermediate proteins ataxin-1 and translationally-controlled tumor protein in HeLa cells treated with arenobufagin was then checked. Arenobufagin induced apoptosis and G2/M cell cycle arrest in HeLa cells. The cytotoxic effect of arenobufagin was associated with 25 differently expressed proteins including proteasome-related proteins, calcium ion binding-related proteins, oxidative stress-related proteins, metabolism-related enzymes and others. The results of computational molecular docking revealed that arenobufagin was bound in the cavity formed by the transmembrane alpha subunits of Na, K-ATPase, which blocked the pathway of extracellular Na+/K+ cation exchange and inhibited the function of ion exchange. 
Arenobufagin inhibited the activity of Na, K-ATPase and proteasome, decreased the expression of Na, K-ATPase α1 and α3 subunits and increased the expression of WEE1 in HeLa cells. Antibodies against Na, K-ATPase α1 and α3 subunits, alone or combined with arenobufagin, also inhibited the activity of proteasome. Furthermore, flow cytometry analysis showed that the expression of the possible intermediate proteins ataxin-1 and translationally-controlled tumor protein was increased in HeLa cells treated with arenobufagin. These results indicated that arenobufagin might directly bind with Na, K-ATPase α1 and α3 subunits and that the inhibitory effect of arenobufagin on proteasomal activity of HeLa cells might be related to its binding with Na, K-ATPase. PMID:27428326
Proteasome Inhibition Contributed to the Cytotoxicity of Arenobufagin after Its Binding with Na, K-ATPase in Human Cervical Carcinoma HeLa Cells.

PubMed

Yue, Qingxi; Zhen, Hong; Huang, Ming; Zheng, Xi; Feng, Lixing; Jiang, Baohong; Yang, Min; Wu, Wanying; Liu, Xuan; Guo, Dean

2016-01-01

Although the possibility of developing cardiac steroids/cardiac glycosides as novel cancer therapeutic agents has been recognized, the mechanism of their anticancer activity is still not clear enough. Toad venom extract containing bufadienolides, which belong to cardiac steroids, has actually long been used as traditional Chinese medicine in clinic for cancer therapy in China. The cytotoxicity of arenobufagin, a bufadienolide isolated from toad venom, on human cervical carcinoma HeLa cells was checked. And, the protein expression profile of control HeLa cells and HeLa cells treated with arenobufagin for 48 h was analyzed using two-dimensional electrophoresis, respectively. Differently expressed proteins in HeLa cells treated with arenobufagin were identified and the pathways related to these proteins were mapped from KEGG database. Computational molecular docking was performed to verify the binding of arenobufagin and Na, K-ATPase. The effects of arenobufagin on Na, K-ATPase activity and proteasome activity of HeLa cells were checked. The protein-protein interaction network between Na, K-ATPase and proteasome was constructed and the expression of possible intermediate proteins ataxin-1 and translationally-controlled tumor protein in HeLa cells treated with arenobufagin was then checked. Arenobufagin induced apoptosis and G2/M cell cycle arrest in HeLa cells. The cytotoxic effect of arenobufagin was associated with 25 differently expressed proteins including proteasome-related proteins, calcium ion binding-related proteins, oxidative stress-related proteins, metabolism-related enzymes and others. The results of computational molecular docking revealed that arenobufagin was bound in the cavity formed by the transmembrane alpha subunits of Na, K-ATPase, which blocked the pathway of extracellular Na+/K+ cation exchange and inhibited the function of ion exchange. 
Arenobufagin inhibited the activity of Na, K-ATPase and proteasome, decreased the expression of Na, K-ATPase α1 and α3 subunits and increased the expression of WEE1 in HeLa cells. Antibodies against Na, K-ATPase α1 and α3 subunits, alone or combined with arenobufagin, also inhibited the activity of proteasome. Furthermore, flow cytometry analysis showed that the expression of the possible intermediate proteins ataxin-1 and translationally-controlled tumor protein was increased in HeLa cells treated with arenobufagin. These results indicated that arenobufagin might directly bind with Na, K-ATPase α1 and α3 subunits and that the inhibitory effect of arenobufagin on proteasomal activity of HeLa cells might be related to its binding with Na, K-ATPase.
Role of miR-452-5p in the tumorigenesis of prostate cancer: A study based on the Cancer Genome Atlas (TCGA), Gene Expression Omnibus (GEO), and bioinformatics analysis.

PubMed

Gao, Li; Zhang, Li-Jie; Li, Sheng-Hua; Wei, Li-Li; Luo, Bin; He, Rong-Quan; Xia, Shuang

2018-03-06

MiR-452-5p has been reported to be down-regulated in prostate cancer, affecting the development of this type of cancer. However, the molecular mechanism of miR-452-5p in prostate cancer remains unclear. Therefore, we investigated the network of target genes of miR-452-5p in prostate cancer using bioinformatics analyses. We first analyzed the expression profiles and prognostic value of miR-452-5p in prostate cancer tissues from a public database. Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and PANTHER pathway analyses and a disease ontology (DO) analysis were performed to find the molecular functions of the target genes from GSE datasets and miRWalk. Finally, we validated hub genes from the protein-protein interaction (PPI) networks of the target genes in the Human Protein Atlas (HPA) database and Gene Expression Profiling Interactive Analysis (GEPIA). The optimal target genes were narrowed down by taking the intersection of up-regulated genes from GEPIA, down-regulated genes from GSE datasets, and predicted genes in miRWalk. Based on mining of GEO and ArrayExpress microarray chips and miRNA-Seq data in the TCGA database, which includes 1007 prostate cancer samples and 387 non-cancer samples, miR-452-5p is shown to be down-regulated in prostate cancer. GO, KEGG, and PANTHER pathway analyses suggested that the target genes might participate in important biological processes, such as transforming growth factor beta signaling and the positive regulation of brown fat cell differentiation and mesenchymal cell differentiation, as well as the Ras signaling pathway and pathways regulating the pluripotency of stem cells and arrhythmogenic right ventricular cardiomyopathy (ARVC). Nine genes (GABBR, PNISR, NTSR1, DOCK1, EREG, SFRP1, PTGS2, LEF1, and BMP2) were defined as hub genes in the PPI network. Three genes (FAM174B, SLC30A4, and SLIT1) were jointly shared by GEPIA, the GSE datasets, and miRWalk. 
Down-regulated miR-452-5p might play an essential role in the tumorigenesis of prostate cancer. Copyright © 2018. Published by Elsevier GmbH.
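The narrowing step in this abstract (genes common to GEPIA, the GSE datasets, and miRWalk) amounts to a three-way set intersection. A minimal Python sketch, assuming each source has already been reduced to a collection of gene symbols; the function name and the GENE_A/GENE_B/GENE_C placeholders are hypothetical and are not data from the study:

```python
def shared_targets(gepia_up, gse_down, mirwalk_predicted):
    """Return the gene symbols present in all three collections, sorted."""
    return sorted(set(gepia_up) & set(gse_down) & set(mirwalk_predicted))


# Illustrative inputs only: the three shared genes come from the abstract,
# the GENE_* entries are made-up placeholders.
gepia_up = {"FAM174B", "SLC30A4", "SLIT1", "GENE_A"}
gse_down = {"FAM174B", "SLC30A4", "SLIT1", "GENE_B"}
mirwalk_predicted = {"FAM174B", "SLC30A4", "SLIT1", "GENE_A", "GENE_C"}

print(shared_targets(gepia_up, gse_down, mirwalk_predicted))
```

Sorting the result only makes the output reproducible; it has no biological meaning.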
DNA sequence analysis of a 10 624 bp fragment of the left arm of chromosome XV from Saccharomyces cerevisiae reveals a RNA binding protein, a mitochondrial protein, two ribosomal proteins and two new open reading frames.

PubMed

Lafuente, M J; Gamo, F J; Gancedo, C

1996-09-01

We have determined the sequence of a 10624 bp DNA segment located in the left arm of chromosome XV of Saccharomyces cerevisiae. The sequence contains eight open reading frames (ORFs) longer than 100 amino acids. Two of them do not present significant homology with sequences found in the databases. The product of ORF o0553 is identical to the protein encoded by the gene SMF1. Internal to it there is another ORF, o0555 that is apparently expressed. The proteins encoded by ORFs o0559 and o0565 are identical to ribosomal proteins S19.e and L18 respectively. ORF o0550 encodes a protein with an RNA binding signature including RNP motifs and stretches rich in asparagine, glutamine and arginine.
MIPS: a database for genomes and protein sequences.

PubMed Central

Mewes, H W; Heumann, K; Kaps, A; Mayer, K; Pfeiffer, F; Stocker, S; Frishman, D

1999-01-01

The Munich Information Center for Protein Sequences (MIPS-GSF), Martinsried near Munich, Germany, develops and maintains genome oriented databases. It is commonplace that the amount of sequence data available increases rapidly, but not the capacity of qualified manual annotation at the sequence databases. Therefore, our strategy aims to cope with the data stream by the comprehensive application of analysis tools to sequences of complete genomes, the systematic classification of protein sequences and the active support of sequence analysis and functional genomics projects. This report describes the systematic and up-to-date analysis of genomes (PEDANT), a comprehensive database of the yeast genome (MYGD), a database reflecting the progress in sequencing the Arabidopsis thaliana genome (MATD), the database of assembled, annotated human EST clusters (MEST), and the collection of protein sequence data within the framework of the PIR-International Protein Sequence Database (described elsewhere in this volume). MIPS provides access through its WWW server (http://www.mips.biochem.mpg.de) to a spectrum of generic databases, including the above mentioned as well as a database of protein families (PROTFAM), the MITOP database, and the all-against-all FASTA database. PMID:9847138
Microarray analysis reveals key genes and pathways in Tetralogy of Fallot

PubMed Central

He, Yue-E; Qiu, Hui-Xian; Jiang, Jian-Bing; Wu, Rong-Zhou; Xiang, Ru-Lian; Zhang, Yuan-Hai

2017-01-01

The aim of the present study was to identify key genes that may be involved in the pathogenesis of Tetralogy of Fallot (TOF) using bioinformatics methods. The GSE26125 microarray dataset, which includes cardiovascular tissue samples derived from 16 children with TOF and five healthy age-matched control infants, was downloaded from the Gene Expression Omnibus database. Differential expression analysis was performed between TOF and control samples to identify differentially expressed genes (DEGs) using Student's t-test and the R/limma package, with a log2 fold-change of >2 and a false discovery rate of <0.01 set as thresholds. The biological functions of DEGs were analyzed using the ToppGene database. The ReactomeFIViz application was used to construct functional interaction (FI) networks, and the genes in each module were subjected to pathway enrichment analysis. The iRegulon plugin was used to identify transcription factors predicted to regulate the DEGs in the FI network, and the gene-transcription factor pairs were then visualized using Cytoscape software. A total of 878 DEGs were identified, including 848 upregulated genes and 30 downregulated genes. The gene FI network contained seven function modules, all of which were composed of upregulated genes. Genes in Module 1 were enriched in the following three neurological disorder-associated signaling pathways: Parkinson's disease, Alzheimer's disease and Huntington's disease. Genes in Modules 0, 3 and 5 were predominantly enriched in pathways associated with ribosomes and protein translation. The X-box binding protein 1 transcription factor was demonstrated to be involved in the regulation of genes encoding the subunits of cytoplasmic and mitochondrial ribosomes, as well as genes involved in neurodegenerative disorders. Therefore, dysfunction of genes involved in signaling pathways associated with neurodegenerative disorders, ribosome function and protein translation may contribute to the pathogenesis of TOF. 
PMID:28713939
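The DEG thresholds quoted in the abstract above (log2 fold-change >2, false discovery rate <0.01) can be illustrated with a short Python sketch. It is not the authors' R/limma pipeline: it assumes per-gene log2 fold-changes and raw p-values are already available, applies the Benjamini-Hochberg adjustment as the FDR method, and treats the fold-change cutoff as an absolute value. All three are assumptions, and the gene names and numbers are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that monotonicity is enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(records, lfc_cutoff=2.0, fdr_cutoff=0.01):
    """Keep genes with |log2FC| > lfc_cutoff and adjusted p < fdr_cutoff.

    records: list of (gene, log2_fold_change, raw_p_value) tuples.
    Returns a list of (gene, "up" or "down") tuples in input order.
    """
    fdr = benjamini_hochberg([p for _, _, p in records])
    selected = []
    for (gene, lfc, _), q in zip(records, fdr):
        if abs(lfc) > lfc_cutoff and q < fdr_cutoff:
            selected.append((gene, "up" if lfc > 0 else "down"))
    return selected


# Hypothetical example values, not taken from GSE26125.
records = [("G1", 3.1, 0.0001), ("G2", -2.6, 0.0004),
           ("G3", 1.2, 0.0002), ("G4", 2.4, 0.2)]
print(select_degs(records))
```

Here G3 is dropped by the fold-change cutoff and G4 by the FDR cutoff, which shows that the two thresholds act independently.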
Identification of potential target genes and related regulatory transcription factors in spontaneous hairline fracture induced by hypervitaminosis A.

PubMed

Peng, Chuangang; Yang, Qi; Wei, Bo; Liu, Yong; Li, Yuxiang; Gu, Dawei; Yin, Guochao; Wang, Bo; Xu, Dehui; Zhang, Xuebing; Kong, Daliang

2017-07-01

The aim was to research the molecular changes of bone cells induced by excessive doses of vitamin A, and to analyze the molecular mechanism underlying spontaneous fracture. The gene expression profile of GSE29859, including 4 cortical bone marrow samples with excessive doses of vitamin A and 4 control cortical bone marrow samples, was obtained from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) between cortical bone marrow samples and control samples were screened out and pathway enrichment analysis was undertaken. Based on the MSigDB database, the potential regulatory transcription factors (TFs) were identified. A total of 373 DEGs including 342 up- and 31 down-regulated genes were identified. These DEGs were significantly enriched in pathways of protein processing in endoplasmic reticulum, ubiquitin mediated proteolysis and glycerophospholipid metabolism. Finally, the most significant regulatory TFs were obtained, including E2F Transcription Factor 1 (E2F1), GA Binding Protein Transcription Factor (GABP), Nuclear Factor, Erythroid 2-Like 2 (NRF2) and ELK1, Member of ETS Oncogene Family (ELK1). Key TFs including E2F1, GABP, NRF2 and ELK1 and their target genes such as Ube2d3, Uba1, Phb2 and Tomm22 may play potential key roles in spontaneous fracture induced by hypervitaminosis A. The pathways of protein processing in endoplasmic reticulum, ubiquitin mediated proteolysis and glycerophospholipid metabolism may be key mechanisms involved in spontaneous fracture induced by hypervitaminosis A. Our findings will provide new insights for target selection in clinical application to prevent spontaneous fracture induced by hypervitaminosis A. Copyright © 2017 Elsevier Ltd. All rights reserved.
The Halophile protein database.

PubMed

Sharma, Naveen; Farooqi, Mohammad Samir; Chaturvedi, Krishna Kumar; Lal, Shashi Bhushan; Grover, Monendra; Rai, Anil; Pandey, Pankaj

2014-01-01

Halophilic archaea/bacteria adapt to different salt concentrations, namely extreme, moderate and low. These types of adaptation may occur as a result of modification of protein structure and other changes in different cell organelles. Thus proteins may play an important role in the adaptation of halophilic archaea/bacteria to saline conditions. The Halophile protein database (HProtDB) is a systematic attempt to document the biochemical and biophysical properties of proteins from halophilic archaea/bacteria which may be involved in adaptation of these organisms to saline conditions. In this database, various physicochemical properties such as molecular weight, theoretical pI, amino acid composition, atomic composition, estimated half-life, instability index, aliphatic index and grand average of hydropathicity (GRAVY) have been listed. These physicochemical properties play an important role in identifying the protein structure, bonding pattern and function of the specific proteins. This database is a comprehensive, manually curated, non-redundant catalogue of proteins. The database currently contains 59 897 protein properties extracted from 21 different strains of halophilic archaea/bacteria. Database URL: http://webapp.cabgrid.res.in/protein/ © The Author(s) 2014. Published by Oxford University Press.
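Two of the properties listed in this abstract, GRAVY and the aliphatic index, follow simple published formulas and can be sketched in Python. The sketch assumes the standard Kyte-Doolittle hydropathy scale for GRAVY and Ikai's formula for the aliphatic index; whether HProtDB computes them exactly this way is not stated in the abstract, and the function names and test peptides are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean Kyte-Doolittle value per residue."""
    seq = sequence.upper()
    # Raises KeyError on non-standard residues rather than guessing a value.
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(sequence):
    """Aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)),
    where X is the mole percent of each residue."""
    seq = sequence.upper()
    n = len(seq)

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / n

    return (mole_percent("A") + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Hypothetical toy peptide, not a record from the database.
print(gravy("AVIL"), aliphatic_index("AVIL"))
```

A negative GRAVY value indicates a predominantly hydrophilic sequence, which is the kind of signal such a catalogue makes searchable.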
GeNNet: an integrated platform for unifying scientific workflows and graph databases for transcriptome data analysis

PubMed Central

Costa, Raquel L.; Gadelha, Luiz; Ribeiro-Alves, Marcelo; Porto, Fábio

2017-01-01

There are many steps in analyzing transcriptome data, from the acquisition of raw data to the selection of a subset of representative genes that explain a scientific hypothesis. The data produced can be represented as networks of interactions among genes and these may additionally be integrated with other biological databases, such as Protein-Protein Interactions, transcription factors and gene annotation. However, the results of these analyses remain fragmented, imposing difficulties, either for posterior inspection of results, or for meta-analysis by the incorporation of new related data. Integrating databases and tools into scientific workflows, orchestrating their execution, and managing the resulting data and its respective metadata are challenging tasks. Additionally, a great amount of effort is equally required to run in-silico experiments to structure and compose the information as needed for analysis. Different programs may need to be applied and different files are produced during the experiment cycle. In this context, the availability of a platform supporting experiment execution is paramount. We present GeNNet, an integrated transcriptome analysis platform that unifies scientific workflows with graph databases for selecting relevant genes according to the evaluated biological systems. It includes GeNNet-Wf, a scientific workflow that pre-loads biological data, pre-processes raw microarray data and conducts a series of analyses including normalization, differential expression inference, clusterization and gene set enrichment analysis. A user-friendly web interface, GeNNet-Web, allows for setting parameters, executing, and visualizing the results of GeNNet-Wf executions. To demonstrate the features of GeNNet, we performed case studies with data retrieved from GEO, particularly using a single-factor experiment in different analysis scenarios. As a result, we obtained differentially expressed genes for which biological functions were analyzed. 
The results are integrated into GeNNet-DB, a database about genes, clusters, experiments and their properties and relationships. The resulting graph database is explored with queries that demonstrate the expressiveness of this data model for reasoning about gene interaction networks. GeNNet is the first platform to integrate the analytical process of transcriptome data with graph databases. It provides a comprehensive set of tools that would otherwise be challenging for non-expert users to install and use. Developers can add new functionality to components of GeNNet. The derived data allows for testing previous hypotheses about an experiment and exploring new ones through the interactive graph database environment. It enables the analysis of different data on humans, rhesus, mice and rat coming from Affymetrix platforms. GeNNet is available as an open source platform at https://github.com/raquele/GeNNet and can be retrieved as a software container with the command docker pull quelopes/gennet. PMID:28695067
GeNNet: an integrated platform for unifying scientific workflows and graph databases for transcriptome data analysis.

PubMed

Costa, Raquel L; Gadelha, Luiz; Ribeiro-Alves, Marcelo; Porto, Fábio

2017-01-01

There are many steps in analyzing transcriptome data, from the acquisition of raw data to the selection of a subset of representative genes that explain a scientific hypothesis. The data produced can be represented as networks of interactions among genes and these may additionally be integrated with other biological databases, such as Protein-Protein Interactions, transcription factors and gene annotation. However, the results of these analyses remain fragmented, imposing difficulties, either for posterior inspection of results, or for meta-analysis by the incorporation of new related data. Integrating databases and tools into scientific workflows, orchestrating their execution, and managing the resulting data and its respective metadata are challenging tasks. Additionally, a great amount of effort is equally required to run in-silico experiments to structure and compose the information as needed for analysis. Different programs may need to be applied and different files are produced during the experiment cycle. In this context, the availability of a platform supporting experiment execution is paramount. We present GeNNet, an integrated transcriptome analysis platform that unifies scientific workflows with graph databases for selecting relevant genes according to the evaluated biological systems. It includes GeNNet-Wf, a scientific workflow that pre-loads biological data, pre-processes raw microarray data and conducts a series of analyses including normalization, differential expression inference, clusterization and gene set enrichment analysis. A user-friendly web interface, GeNNet-Web, allows for setting parameters, executing, and visualizing the results of GeNNet-Wf executions. To demonstrate the features of GeNNet, we performed case studies with data retrieved from GEO, particularly using a single-factor experiment in different analysis scenarios. As a result, we obtained differentially expressed genes for which biological functions were analyzed. 
The results are integrated into GeNNet-DB, a database about genes, clusters, experiments and their properties and relationships. The resulting graph database is explored with queries that demonstrate the expressiveness of this data model for reasoning about gene interaction networks. GeNNet is the first platform to integrate the analytical process of transcriptome data with graph databases. It provides a comprehensive set of tools that would otherwise be challenging for non-expert users to install and use. Developers can add new functionality to components of GeNNet. The derived data allows for testing previous hypotheses about an experiment and exploring new ones through the interactive graph database environment. It enables the analysis of different data on humans, rhesus, mice and rat coming from Affymetrix platforms. GeNNet is available as an open source platform at https://github.com/raquele/GeNNet and can be retrieved as a software container with the command docker pull quelopes/gennet.
dbDSM: a manually curated database for deleterious synonymous mutations.

PubMed

Wen, Pengbo; Xiao, Peng; Xia, Junfeng

2016-06-15

Synonymous mutations (SMs), which change the sequence of a gene without directly altering the amino acid sequence of the encoded protein, were long thought to have no functional consequences. They are often assumed to be neutral in models of mutation and selection and were completely ignored in many studies. However, accumulating experimental evidence has demonstrated that these mutations exert their impact on gene functions via splicing accuracy, mRNA stability, translation fidelity, protein folding and expression, and some of these mutations are implicated in human diseases. To the best of our knowledge, there is still no database specifically focusing on disease-related SMs. We have developed a new database called dbDSM (database of Deleterious Synonymous Mutation), a continually updated database that collects, curates and manages available human disease-related SM data obtained from published literature. In the current release, dbDSM collects 1936 SM-disease association entries, including 1289 SMs and 443 human diseases from ClinVar, GRASP, GWAS Catalog, GWASdb, PolymiRTS database, PubMed database and Web of Knowledge. Additionally, we provide users with a link to download all the data in dbDSM and a link to submit novel data to the database. We hope dbDSM will be a useful resource for investigating the roles of SMs in human disease. dbDSM is freely available online at http://bioinfo.ahu.edu.cn:8080/dbDSM/index.jsp with all major browsers supported. Contact: jfxia@ahu.edu.cn. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
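The definition of a synonymous mutation used in this abstract (a nucleotide change that leaves the encoded amino acid unchanged) can be checked mechanically against the standard genetic code. A minimal Python sketch follows; it assumes the standard nuclear code and single-codon inputs, and the function name is hypothetical rather than part of dbDSM.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_synonymous(ref_codon, alt_codon):
    """True if the codons differ but encode the same amino acid (or stop)."""
    ref = ref_codon.upper().replace("U", "T")
    alt = alt_codon.upper().replace("U", "T")
    return ref != alt and CODON_TABLE[ref] == CODON_TABLE[alt]


print(is_synonymous("GAA", "GAG"))  # both codons encode glutamate
print(is_synonymous("GAA", "GAT"))  # glutamate to aspartate
```

A check like this only classifies the coding change; it says nothing about the splicing, mRNA stability, or translation effects that make some synonymous mutations deleterious.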
CEBS object model for systems biology data, SysBio-OM.

PubMed

Xirasagar, Sandhya; Gustafson, Scott; Merrick, B Alex; Tomer, Kenneth B; Stasiewicz, Stanley; Chan, Denny D; Yost, Kenneth J; Yates, John R; Sumner, Susan; Xiao, Nianqing; Waters, Michael D

2004-09-01

To promote a systems biology approach to understanding the biological effects of environmental stressors, the Chemical Effects in Biological Systems (CEBS) knowledge base is being developed to house data from multiple complex data streams in a systems-friendly manner that will accommodate extensive querying from users. Unified data representation via a single object model will greatly aid in integrating data storage and management, and facilitate reuse of software to analyze and display data resulting from diverse differential expression or differential profile technologies. Data streams include, but are not limited to, gene expression analysis (transcriptomics), protein expression and protein-protein interaction analysis (proteomics) and changes in low molecular weight metabolite levels (metabolomics). To enable the integration of microarray gene expression, proteomics and metabolomics data in the CEBS system, we designed an object model, Systems Biology Object Model (SysBio-OM). The model is comprehensive and leverages other open source efforts, namely the MicroArray Gene Expression Object Model (MAGE-OM) and the Proteomics Experiment Data Repository (PEDRo) object model. SysBio-OM is designed by extending MAGE-OM to represent protein expression data elements (including those from PEDRo), protein-protein interaction and metabolomics data. SysBio-OM promotes the standardization of data representation and data quality by facilitating the capture of the minimum annotation required for an experiment. Such standardization refines the accuracy of data mining and interpretation. The open source SysBio-OM model, which can be implemented on varied computing platforms, is presented here. A Unified Modeling Language (UML) depiction of the entire SysBio-OM is available at http://cebs.niehs.nih.gov/SysBioOM/. 
The Rational Rose object model package is distributed under an open source license that permits unrestricted academic and commercial use and is available at http://cebs.niehs.nih.gov/cebsdownloads. The database and interface are being built to implement the model and will be available for public use at http://cebs.niehs.nih.gov.
Database citation in full text biomedical articles.

PubMed

Kafkas, Şenay; Kim, Jee-Hyub; McEntyre, Johanna R

2013-01-01

Molecular biology and literature databases represent essential infrastructure for life science research. Effective integration of these data resources requires that there are structured cross-references at the level of individual articles and biological records. Here, we describe the current patterns of how database entries are cited in research articles, based on analysis of the full text Open Access articles available from Europe PMC. Focusing on citation of entries in the European Nucleotide Archive (ENA), UniProt and Protein Data Bank, Europe (PDBe), we demonstrate that text mining doubles the number of structured annotations of database record citations supplied in journal articles by publishers. Many thousands of new literature-database relationships are found by text mining, since these relationships are also not present in the set of articles cited by database records. We recommend that structured annotation of database records in articles is extended to other databases, such as ArrayExpress and Pfam, entries from which are also cited widely in the literature. The very high precision and high-throughput of this text-mining pipeline makes this activity possible both accurately and at low cost, which will allow the development of new integrated data services.
Database Citation in Full Text Biomedical Articles

PubMed Central

Kafkas, Şenay; Kim, Jee-Hyub; McEntyre, Johanna R.

2013-01-01

Molecular biology and literature databases represent essential infrastructure for life science research. Effective integration of these data resources requires that there are structured cross-references at the level of individual articles and biological records. Here, we describe the current patterns of how database entries are cited in research articles, based on analysis of the full text Open Access articles available from Europe PMC. Focusing on citation of entries in the European Nucleotide Archive (ENA), UniProt and Protein Data Bank, Europe (PDBe), we demonstrate that text mining doubles the number of structured annotations of database record citations supplied in journal articles by publishers. Many thousands of new literature-database relationships are found by text mining, since these relationships are also not present in the set of articles cited by database records. We recommend that structured annotation of database records in articles is extended to other databases, such as ArrayExpress and Pfam, entries from which are also cited widely in the literature. The very high precision and high-throughput of this text-mining pipeline makes this activity possible both accurately and at low cost, which will allow the development of new integrated data services. PMID:23734176
From 20th century metabolic wall charts to 21st century systems biology: database of mammalian metabolic enzymes.

PubMed

Corcoran, Callan C; Grady, Cameron R; Pisitkun, Trairak; Parulekar, Jaya; Knepper, Mark A

2017-03-01

The organization of the mammalian genome into gene subsets corresponding to specific functional classes has provided key tools for systems biology research. Here, we have created a web-accessible resource called the Mammalian Metabolic Enzyme Database (https://hpcwebapps.cit.nih.gov/ESBL/Database/MetabolicEnzymes/MetabolicEnzymeDatabase.html) keyed to the biochemical reactions represented on iconic metabolic pathway wall charts created in the previous century. Overall, we have mapped 1,647 genes to these pathways, representing ~7 percent of the protein-coding genome. To illustrate the use of the database, we apply it to the area of kidney physiology. In so doing, we have created an additional database (Database of Metabolic Enzymes in Kidney Tubule Segments: https://hpcwebapps.cit.nih.gov/ESBL/Database/MetabolicEnzymes/), mapping mRNA abundance measurements (mined from RNA-Seq studies) for all metabolic enzymes to each of 14 renal tubule segments. We carry out bioinformatics analysis of the enzyme expression pattern among renal tubule segments and mine various data sources to identify vasopressin-regulated metabolic enzymes in the renal collecting duct. Copyright © 2017 the American Physiological Society.

Cloning and in-silico analysis of beta-1,3-xylanase from psychrophilic yeast, Glaciozyma antarctica PI12

NASA Astrophysics Data System (ADS)

Nor, Nooraisyah Mohamad; Bakar, Farah Diba Abu; Mahadi, Nor Muhammad; Murad, Abdul Munir Abdul

2015-09-01

A beta-1,3-xylanase (EC 3.2.1.32) gene from the psychrophilic yeast Glaciozyma antarctica has been identified via genome data mining. The enzyme was grouped into the GH26 family based on the Carbohydrate-Active enZYmes (CAZy) database. The molecular weight of this protein was predicted to be 42 kDa and it is expected to be soluble for expression. The presence of a signal peptide suggested that this enzyme may be released extracellularly into the marine environment of the host's habitat. This supports the theory that such enzymatic activity is required for degradation of nutrients of polysaccharide origin into simpler carbohydrates outside the cell before they can be taken up into the cell. The sequence of this protein showed very little conservation (< 30%) with other beta-1,3-xylanases from available databases. Based on the phylogenetic analysis, this protein also showed a distant relationship to other xylanases of eukaryotic origin. The protein may have undergone major substitutions in its gene sequence in order to adapt to the cold climate. This is the first report of a beta-1,3-xylanase gene isolated from a psychrophilic yeast.
Construction and Deciphering of Human Phosphorylation-Mediated Signaling Transduction Networks.

PubMed

Zhang, Menghuan; Li, Hong; He, Ying; Sun, Han; Xia, Li; Wang, Lishun; Sun, Bo; Ma, Liangxiao; Zhang, Guoqing; Li, Jing; Li, Yixue; Xie, Lu

2015-07-02

Protein phosphorylation is the most abundant reversible covalent modification. Human protein kinases participate in almost all biological pathways, and approximately half of the kinases are associated with disease. PhoSigNet was designed to store and display human phosphorylation-mediated signal transduction networks, with additional information related to cancer. It contains 11 976 experimentally validated directed edges and 216 871 phosphorylation sites. Moreover, 3491 differentially expressed proteins in human cancer from dbDEPC, 18 907 human cancer variation sites from CanProVar, and 388 hyperphosphorylation sites from PhosphoSitePlus were collected as annotation information. Compared with other phosphorylation-related databases, PhoSigNet not only takes the kinase-substrate regulatory relationship pairs into account, but also extends regulatory relationships up- and downstream (e.g., from ligand to receptor, from G protein to kinase, and from transcription factor to targets). Furthermore, PhoSigNet allows the user to investigate the impact of phosphorylation modifications on cancer. By using one set of in-house time series phosphoproteomics data, the reconstruction of a conditional and dynamic phosphorylation-mediated signaling network was exemplified. We expect PhoSigNet to be a useful database and analysis platform benefiting both proteomics and cancer studies.
[Identification and prognostic value of differentially expressed proteins of patients with platinum resistance epithelial ovarian cancer in serum].

PubMed

Wu, W J; Wang, Q; Zhang, W; Li, L

2016-07-25

To identify differentially expressed serum proteins associated with platinum resistance in patients with platinum-resistant epithelial ovarian cancer (EOC) and to investigate their clinical value. A total of 106 patients with ovarian tumors treated at the Affiliated Tumor Hospital of Guangxi Medical University from August 1998 to September 2013 were enrolled in this study, including 52 platinum-sensitive (PTS) cases, 44 platinum-resistant (PTR) cases and 10 cases of benign ovarian cyst (BOC). Thirty-three normal women who underwent physical examination in our hospital in 2008 were chosen as the control group (NC). Serum samples from the 4 groups were collected and preserved. (1) Differential expression levels of serum proteins in 10 cases from each group (PTS & PTR vs NC, PTS & PTR vs BOC, PTS vs PTR) were identified with an isobaric tags for relative and absolute quantitation (iTRAQ)-based quantitative proteomic approach and then subjected to bioinformatics analysis. (2) Proteins that play an important role in multidrug resistance were validated by western blot (WB) and ELISA in 44 PTR patients, 52 PTS patients and 33 NC women. (3) Pearson correlation analysis was used to examine the relationship between the proteins and the clinicopathological parameters of PTR individuals. The Kaplan-Meier method was used to explore serum biomarkers associated with clinical prognosis data. Receiver operating characteristic (ROC) curves were used to determine the diagnostic value of the markers. (1) Based on the bioinformatics analysis, 56, 39 and 62 proteins were identified in the PTS & PTR vs NC, PTS & PTR vs BOC and PTS vs PTR comparisons, respectively. A search of the Haplotter database showed that C6 and CNTN1 have a positive selection effect among Asians and BCHE among Europeans. 
CRP, FN1, S100A9, TF, ALB, VWF, APOC2, APOE, CD44, F2, GPX3 and ACTB proteins were further verified to be related to platinum resistance by intersection analysis in the COREMINE database and TCGA. (2) The expression levels of SERPINA1 protein in the serum of the PTR, PTS and NC groups were 41.7±9.2, 32.8±6.6 and 14.2±3.6, respectively, by WB assay, and (816±246), (686±205) and (756±244) μg/μl, respectively, by ELISA; the expression levels of ORM1 protein in PTR, PTS and NC serum were 37.9±7.0, 27.0±22.5 and 21.7±2.6, respectively, by WB assay, and (221±35), (174±23) and (157±18) μg/μl, respectively, by ELISA; the expression levels of FN1 protein in PTR, PTS and NC serum were 30.3±11.4, 18.2±5.2 and 23.7±3.9, respectively, by WB assay, and (71±13), (62±13) and (69±13) ng/μl, respectively, by ELISA; the expression levels of GPX3 protein in PTR, PTS and NC serum were 1.2±0.3, 2.2±0.3 and 1.6±0.3, respectively, by WB assay. The expression of each protein measured by western blot and ELISA showed the same trend as that measured by iTRAQ. (3) Pearson correlation analysis showed that the expression of SERPINA1, FN1 and ORM1 was positively correlated with recurrence and death of PTR patients (P<0.01, P<0.05) but negatively correlated with progression-free survival of PTR patients (P<0.05). Kaplan-Meier analysis indicated that clinical stage, initial treatment outcome, and the expression levels of SERPINA1, FN1 and ORM1 were significantly related to progression-free survival (P<0.05), and that initial treatment outcome was related to overall survival (P=0.027). The overall predictive accuracy of each protein was reflected by the area under the ROC curve (AUC); FN1, ORM1 and SERPINA1 had ROC areas of 0.679, 0.910 and 0.666, respectively. 
The diagnostic value of ORM1 protein for platinum resistance in ovarian cancer patients was significantly higher than that of FN1 and SERPINA1 protein (P=0.000). CONCLUSIONS: The differential expression levels of FN1, SERPINA1 and ORM1 between PTS and PTR play an essential role in measuring subtle changes in response to platinum-based chemotherapy and may be involved in the biological processes of platinum resistance. ORM1 has higher diagnostic efficiency for platinum resistance in ovarian cancer patients. It may be a promising candidate biomarker for screening and diagnosis of ovarian cancer patients with platinum resistance.
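The area under the ROC curve reported in the abstract above can be read as the probability that a randomly chosen resistant patient has a higher marker level than a randomly chosen sensitive patient. The following is a minimal sketch of that rank-based calculation; the function name and the marker values are hypothetical illustrations, not data or code from the study.

```python
def roc_auc(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical marker levels for illustration only (not values from the study).
resistant = [5.1, 4.8, 6.0, 3.9]
sensitive = [3.2, 4.0, 3.9, 2.7]
auc = roc_auc(resistant, sensitive)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to a marker with no discriminating power and 1.0 to perfect separation, which is why a value near 0.91 is read as much stronger than values near 0.67.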
MIPS: analysis and annotation of proteins from whole genomes

PubMed Central

Mewes, H. W.; Amid, C.; Arnold, R.; Frishman, D.; Güldener, U.; Mannhaupt, G.; Münsterkötter, M.; Pagel, P.; Strack, N.; Stümpflen, V.; Warfsmann, J.; Ruepp, A.

2004-01-01

The Munich Information Center for Protein Sequences (MIPS-GSF), Neuherberg, Germany, provides protein sequence-related information based on whole-genome analysis. The main focus of the work is directed toward the systematic organization of sequence-related attributes as gathered by a variety of algorithms, primary information from experimental data together with information compiled from the scientific literature. MIPS maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the database of complete cDNAs (German Human Genome Project, NGFN), the database of mammalian protein–protein interactions (MPPI), the database of FASTA homologies (SIMAP), and the interface for the fast retrieval of protein-associated information (QUIPOS). The Arabidopsis thaliana database, the rice database, the plant EST databases (MATDB, MOsDB, SPUTNIK), as well as the databases for the comprehensive set of genomes (PEDANT genomes) are described elsewhere in the 2003 and 2004 NAR database issues, respectively. All databases described, and the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de). PMID:14681354
Bioinformatic analysis of the effects and mechanisms of decitabine and cytarabine on acute myeloid leukemia

PubMed Central

Zhou, Shiyong; Liu, Pengfei; Zhang, Huilai

2017-01-01

Acute myeloid leukemia (AML) is a frequently occurring malignant disease of the blood and may result from a variety of genetic disorders. The present study aimed to identify the underlying mechanisms associated with the therapeutic effects of decitabine and cytarabine on AML, using microarray analysis. The microarray datasets GSE40442 and GSE40870 were downloaded from the Gene Expression Omnibus database. Differentially expressed genes (DEGs) and differentially methylated sites were identified in AML cells treated with decitabine compared with those treated with cytarabine via the Linear Models for Microarray Data package, following data pre-processing. Gene Ontology (GO) analysis of DEGs was performed using the Database for Annotation, Visualization and Integrated Discovery. Genes corresponding to the differentially methylated sites were obtained using the annotation package of the methylation microarray platform. The overlapping genes were identified, which exhibited the opposite variation trend between gene expression and DNA methylation. Important transcription factor (TF)-gene pairs were screened out, and a regulatory network was subsequently constructed. A total of 190 DEGs and 540 differentially methylated sites were identified in AML cells treated with decitabine compared with those treated with cytarabine. A total of 36 GO terms of DEGs were enriched, including nucleosomes, protein-DNA complexes and nucleosome assembly. The 540 differentially methylated sites were located on 240 genes, including the acid-repeat containing protein (ACRC) gene that was additionally differentially expressed. In addition, 60 pairs of TFs and overlapping methylated sites and 140 pairs of TFs and DEGs were screened out. The regulatory network included 68 nodes and 140 TF-gene pairs. 
The present study identified various genes including ACRC and proliferating cell nuclear antigen, in addition to various TFs, including TATA-box binding protein associated factor 1 and CCCTC-binding factor, which may be potential therapeutic targets of AML. PMID:28498449
Integrative genomic analyses of the histamine H1 receptor and its role in cancer prediction.

PubMed

Wang, Minghai; Wei, Xiaolong; Shi, Lianghui; Chen, Bin; Zhao, Guohai; Yang, Haiwei

2014-04-01

The human histamine receptor H1 (HRH1) gene is located on chromosome 3p25 and encodes a 487 amino acid G protein-coupled receptor (GPCR) with a long third intracellular loop (IL3). The HRH1 predominantly couples to Gαq/11 proteins, leading to the activation of phospholipase C (PLC) and subsequent release of the second messengers inositol trisphosphate (IP3) and diacylglycerol (DAG) followed by the activation of PKC and the release of [Ca2+]i. In the present study, we identified HRH1 genes from 14 vertebrate genomes and found that HRH1 exists in all types of vertebrates including fish, amphibians, birds and mammals. We identified 88 SNPs including 4 available alleles disrupting an existing exonic splicing enhancer and 84 SNPs causing missense mutations, which may impact the effect of histamine on the HRH1 protein. We found that the human HRH1 gene was expressed in many tissues or organs, and predominant expression of HRH1 was shown in the bone marrow, whole blood, lymph node, thymus, brain, cerebellum, retina, spinal cord, heart, smooth muscle, skeletal muscle, small intestine, colon, adipocytes, kidney, liver, lung, pancreas, thyroid, salivary gland, skin, ovary, uterus, placenta, prostate and testis. When searched in the PrognoScan database, human HRH1 was also found to be expressed in bladder cancer, blood cancer, brain cancer, breast cancer, colorectal cancer, eye cancer, head and neck cancer, lung cancer, ovarian cancer, skin cancer and soft tissue cancer tissues. The relationship between the expression of HRH1 and prognosis was found to vary in different types of cancers, even in the same cancer from different databases. This implies that the function of HRH1 in these tumors may be multidimensional. GR, STAT5A and c-Myb regulatory transcription factor binding sites were identified in the HRH1 gene upstream (promoter) region, which may be involved in the effect of HRH1 in tumors.
Nuclear phosphoproteome analysis of 3T3-L1 preadipocyte differentiation reveals system-wide phosphorylation of transcriptional regulators.

PubMed

Rabiee, Atefeh; Schwämmle, Veit; Sidoli, Simone; Dai, Jie; Rogowska-Wrzesinska, Adelina; Mandrup, Susanne; Jensen, Ole N

2017-03-01

Adipocytes (fat cells) are important endocrine and metabolic cells critical for systemic insulin sensitivity. Both adipose excess and insufficiency are associated with adverse metabolic function. Adipogenesis is the process whereby preadipocyte precursor cells differentiate into lipid-laden mature adipocytes. This process is driven by a network of transcriptional regulators (TRs). We hypothesized that protein PTMs, in particular phosphorylation, play a major role in activating and propagating signals within TR networks upon induction of adipogenesis by extracellular stimulus. We applied MS-based quantitative proteomics and phosphoproteomics to monitor the alteration of nuclear proteins during the early stages (4 h) of preadipocyte differentiation. We identified a total of 4072 proteins including 2434 phosphorylated proteins, a majority of which were assigned as regulators of gene expression. Our results demonstrate that adipogenic stimuli increase the nuclear abundance and/or the phosphorylation levels of proteins involved in gene expression, cell organization, and oxidation-reduction pathways. Furthermore, proteins acting as negative modulators involved in negative regulation of gene expression, insulin stimulated glucose uptake, and cytoskeletal organization showed a decrease in their nuclear abundance and/or phosphorylation levels during the first 4 h of adipogenesis. Among 288 identified TRs, 49 were regulated within 4 h of adipogenic stimulation including several known and many novel potential adipogenic regulators. We created a kinase-substrate database for 3T3-L1 preadipocytes by investigating the relationship between protein kinases and protein phosphorylation sites identified in our dataset. A majority of the putative protein kinases belong to the cyclin-dependent kinase family and the mitogen-activated protein kinase family including P38 and c-Jun N-terminal kinases, suggesting that these kinases act as orchestrators of early adipogenesis. 
© 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
MOPED enables discoveries through consistently processed proteomics data

PubMed Central

Higdon, Roger; Stewart, Elizabeth; Stanberry, Larissa; Haynes, Winston; Choiniere, John; Montague, Elizabeth; Anderson, Nathaniel; Yandl, Gregory; Janko, Imre; Broomall, William; Fishilevich, Simon; Lancet, Doron; Kolker, Natali; Kolker, Eugene

2014-01-01

The Model Organism Protein Expression Database (MOPED, http://moped.proteinspire.org) is an expanding proteomics resource to enable biological and biomedical discoveries. MOPED aggregates simple, standardized and consistently processed summaries of protein expression and metadata from proteomics (mass spectrometry) experiments from human and model organisms (mouse, worm and yeast). The latest version of MOPED adds new estimates of protein abundance and concentration, as well as relative (differential) expression data. MOPED provides a new updated query interface that allows users to explore information by organism, tissue, localization, condition, experiment, or keyword. MOPED supports the Human Proteome Project’s efforts to generate chromosome- and disease-specific proteomes by providing links from proteins to chromosome and disease information, as well as many complementary resources. MOPED supports a new omics metadata checklist in order to harmonize data integration, analysis and use. MOPED’s development is driven by the user community, which spans 90 countries, guiding future development that will transform MOPED into a multi-omics resource. MOPED encourages users to submit data in a simple format. They can use the metadata checklist to generate a data publication for this submission. As a result, MOPED will provide even greater insights into complex biological processes and systems and enable deeper and more comprehensive biological and biomedical discoveries. PMID:24350770
A proteomic study of spike development inhibition in bread wheat.

PubMed

Zheng, Yong-Sheng; Guo, Jun-Xian; Zhang, Jin-Peng; Gao, Ai-Nong; Yang, Xin-Ming; Li, Xiu-Quan; Liu, Wei-Hua; Li, Li-Hui

2013-09-01

Spike development in wheat is a complicated developmental process that determines wheat propagation and survival. We report herein a proteomic study of the mechanisms underlying spike development inhibition in the bread wheat mutant strain 5660M. A total of 121 differentially expressed proteins, which were involved in cold stress response, protein folding and assembly, cell-cycle regulation, scavenging of ROS, and the autonomous pathway, were identified using MS/MS and database searching. We found that cold responsive proteins were highly expressed in the mutant in contrast to those expressed in the wild-type line. Particularly, the autonomous pathway protein FVE, which modulates flowering, was dramatically downregulated and closely related to the spike development inhibition phenotype of 5660M. A quantitative RT-PCR study demonstrated that the transcription of FVE and six other genes in the autonomous pathway and of downstream flowering regulators was markedly downregulated. The results indicate that spike development of 5660M cannot complete the floral transition. FVE might play an important role in spike development in wheat. Our results provide a theoretical basis for studying floral development and transition in the reproductive growth period, and for further analysis of wheat yield formation. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Human protoporphyrinogen oxidase: expression, purification, and characterization of the cloned enzyme.

PubMed Central

Dailey, T. A.; Dailey, H. A.

1996-01-01

Protoporphyrinogen oxidase (E.C.1.3.3.4) catalyzes the oxygen-dependent oxidation of protoporphyrinogen IX to protoporphyrin IX. The enzyme from human placenta has been cloned, sequenced, expressed in Escherichia coli, purified to homogeneity, and characterized. Northern blot analysis of eight different human tissues shows evidence for only a single transcript in all tissue types, and the size of this transcript is approximately 1.8 kb. The human cDNA has been inserted into an expression vector for E. coli and the protein is produced at high levels in these cells. The protein is found in both membrane and cytoplasmic fractions. The enzyme was purified to homogeneity in the presence of detergents using a metal chelate affinity column. The purified protein is a homodimer composed of subunits of molecular weight of 51,000. The enzyme contains one noncovalently bound FAD per dimer, has a monomer extinction coefficient of 48,000 at 270 nm and contains no detectable redox active metals. The apparent K(m) and Kcat for protoporphyrinogen IX are 1.7 microM and 10.5 min-1, respectively. The enzyme does not use coproporphyrinogen III as a substrate and is inhibited by micromolar concentrations of the herbicide acifluorfen. Protein database searches reveal significant homology between protoporphyrinogen oxidase and monoamine oxidase. PMID:8771201
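To give a sense of scale for the kinetic constants quoted above, the sketch below plugs the reported apparent Km (1.7 microM) and kcat (10.5 min-1) into the standard Michaelis-Menten rate law. It assumes simple Michaelis-Menten behavior, which the abstract does not state explicitly; the function names are illustrative only.

```python
KM_UM = 1.7          # apparent Km from the abstract, in micromolar
KCAT_PER_MIN = 10.5  # apparent kcat from the abstract, in min^-1


def turnover_per_min(substrate_um, km_um=KM_UM, kcat=KCAT_PER_MIN):
    """Michaelis-Menten rate per enzyme molecule: v/[E] = kcat * S / (Km + S)."""
    return kcat * substrate_um / (km_um + substrate_um)


def catalytic_efficiency_per_molar_per_second(km_um=KM_UM, kcat=KCAT_PER_MIN):
    """kcat/Km converted from (min^-1, micromolar) to M^-1 s^-1."""
    return (kcat / 60.0) / (km_um * 1e-6)


# At a substrate concentration equal to Km the enzyme runs at half of kcat.
half_max = turnover_per_min(KM_UM)
efficiency = catalytic_efficiency_per_molar_per_second()
print(f"turnover at [S] = Km: {half_max:.2f} min^-1")
print(f"kcat/Km: {efficiency:.3g} M^-1 s^-1")
```

Under this assumption the catalytic efficiency works out to roughly 1 x 10^5 M^-1 s^-1.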
Database resources of the National Center for Biotechnology Information

PubMed Central

Wheeler, David L.; Barrett, Tanya; Benson, Dennis A.; Bryant, Stephen H.; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M.; DiCuccio, Michael; Edgar, Ron; Federhen, Scott; Geer, Lewis Y.; Helmberg, Wolfgang; Kapustin, Yuri; Kenton, David L.; Khovayko, Oleg; Lipman, David J.; Madden, Thomas L.; Maglott, Donna R.; Ostell, James; Pruitt, Kim D.; Schuler, Gregory D.; Schriml, Lynn M.; Sequeira, Edwin; Sherry, Stephen T.; Sirotkin, Karl; Souvorov, Alexandre; Starchenko, Grigory; Suzek, Tugba O.; Tatusov, Roman; Tatusova, Tatiana A.; Wagner, Lukas; Yaschenko, Eugene

2006-01-01

In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups, Retroviral Genotyping Tools, HIV-1, Human Protein Interaction Database, SAGEmap, Gene Expression Omnibus, Entrez Probe, GENSAT, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of the resources can be accessed through the NCBI home page at: . PMID:16381840
Systematic Mapping and Functional Analysis of a Family of Human Epididymal Secretory Sperm-Located Proteins*

PubMed Central

Li, JianYuan; Liu, FuJun; Wang, HaiYan; Liu, Xin; Liu, Juan; Li, Ning; Wan, FengChun; Wang, WenTing; Zhang, ChengLin; Jin, ShaoHua; Liu, Jie; Zhu, Peng; Liu, YunXiang

2010-01-01

The mammalian spermatozoon has many cellular compartments, such as head and tail, permitting it to interact with the female reproductive tract and fertilize the egg. It acquires this fertilizing potential during transit through the epididymis, which secretes proteins that coat different sperm domains. Optimal levels of these proteins provide the spermatozoon with its ability to move to, bind to, fuse with, and penetrate the egg; otherwise male infertility results. As few human epididymal proteins have been characterized, this work was performed to generate a database of human epididymal sperm-located proteins involved in maturation. Two-dimensional gel electrophoresis of epididymal tissue and luminal fluid proteins, followed by identification using MALDI-TOF/MS or MALDI-TOF/TOF, revealed over a thousand spots in gels comprising 745 abundant nonstructural proteins, 408 in luminal fluids, of which 207 were present on spermatozoa. Antibodies raised to 619 recombinant or synthetic peptides, used in Western blots, histological sections, and washed sperm preparations to confirm antibody quality and protein expression, indicated their regional location in the epididymal epithelium and highly specific locations on washed functional spermatozoa. Sperm function tests suggested the role of some proteins in motility and protection against oxidative attack. A large database of these proteins, characterized by size, pI, chromosomal location, and function, was given a unified terminology reflecting their sperm domain location. These novel, secreted human epididymal proteins are potential targets for a posttesticular contraceptive acting to provide rapid, reversible, functional sterility in men and they are also biomarkers that could be used in noninvasive assessments of male fertility. PMID:20736409
Genomics of Mature and Immature Olfactory Sensory Neurons

PubMed Central

Nickell, Melissa D.; Breheny, Patrick; Stromberg, Arnold J.; McClintock, Timothy S.

2014-01-01

The continuous replacement of neurons in the olfactory epithelium provides an advantageous model for investigating neuronal differentiation and maturation. By calculating the relative enrichment of every mRNA detected in samples of mature mouse olfactory sensory neurons (OSNs), immature OSNs, and the residual population of neighboring cell types, and then comparing these ratios against the known expression patterns of >300 genes, enrichment criteria that accurately predicted the OSN expression patterns of nearly all genes were determined. We identified 847 immature OSN-specific and 691 mature OSN-specific genes. The control of gene expression by chromatin modification and transcription factors, and neurite growth, protein transport, RNA processing, cholesterol biosynthesis, and apoptosis via death domain receptors, were overrepresented biological processes in immature OSNs. Ion transport (ion channels), presynaptic functions, and cilia-specific processes were overrepresented in mature OSNs. Processes overrepresented among the genes expressed by all OSNs were protein and ion transport, ER overload response, protein catabolism, and the electron transport chain. To more accurately represent gradations in mRNA abundance and identify all genes expressed in each cell type, classification methods were used to produce probabilities of expression in each cell type for every gene. These probabilities, which identified 9,300 genes expressed in OSNs, were 96% accurate at identifying genes expressed in OSNs and 86% accurate at discriminating genes specific to mature and immature OSNs. This OSN gene database not only predicts the genes responsible for the major biological processes active in OSNs, but also identifies thousands of never before studied genes that support OSN phenotypes. PMID:22252456
In-depth analysis of the thylakoid membrane proteome of Arabidopsis thaliana chloroplasts: new proteins, new functions, and a plastid proteome database.

PubMed

Friso, Giulia; Giacomelli, Lisa; Ytterberg, A Jimmy; Peltier, Jean-Benoit; Rudella, Andrea; Sun, Qi; Wijk, Klaas J van

2004-02-01

An extensive analysis of the Arabidopsis thaliana peripheral and integral thylakoid membrane proteome was performed by sequential extractions with salt, detergent, and organic solvents, followed by multidimensional protein separation steps (reverse-phase HPLC and one- and two-dimensional electrophoresis gels), different enzymatic and nonenzymatic protein cleavage techniques, mass spectrometry, and bioinformatics. Altogether, 154 proteins were identified, of which 76 (49%) were alpha-helical integral membrane proteins. Twenty-seven new proteins without known function but with predicted chloroplast transit peptides were identified, of which 17 (63%) are integral membrane proteins. These new proteins, likely important in thylakoid biogenesis, include two rubredoxins, a potential metallochaperone, and a new DnaJ-like protein. The data were integrated with our analysis of the lumenal-enriched proteome. We identified 83 out of 100 known proteins of the thylakoid localized photosynthetic apparatus, including several new paralogues and some 20 proteins involved in protein insertion, assembly, folding, or proteolysis. An additional 16 proteins are involved in translation, demonstrating that the thylakoid membrane surface is an important site for protein synthesis. The high coverage of the photosynthetic apparatus and the identification of known hydrophobic proteins with low expression levels, such as cpSecE, Ohp1, and Ohp2, indicate an excellent dynamic resolution of the analysis. The sequential extraction process proved very helpful to validate transmembrane prediction. Our data also were cross-correlated to chloroplast subproteome analyses by other laboratories. All data are deposited in a new curated plastid proteome database (PPDB) with multiple search functions (http://cbsusrv01.tc.cornell.edu/users/ppdb/). This PPDB will serve as an expandable resource for the plant community.
Co-LncRNA: investigating the lncRNA combinatorial effects in GO annotations and KEGG pathways based on human RNA-Seq data

PubMed Central

Zhao, Zheng; Bai, Jing; Wu, Aiwei; Wang, Yuan; Zhang, Jinwen; Wang, Zishan; Li, Yongsheng; Xu, Juan; Li, Xia

2015-01-01

Long non-coding RNAs (lncRNAs) are emerging as key regulators of diverse biological processes and diseases. However, the combinatorial effects of these molecules in a specific biological function are poorly understood. Identifying co-expressed protein-coding genes of lncRNAs would provide ample insight into lncRNA functions. To facilitate such an effort, we have developed Co-LncRNA, which is a web-based computational tool that allows users to identify GO annotations and KEGG pathways that may be affected by co-expressed protein-coding genes of a single or multiple lncRNAs. LncRNA co-expressed protein-coding genes were first identified in publicly available human RNA-Seq datasets, including 241 datasets across 6560 total individuals representing 28 tissue types/cell lines. Then, the lncRNA combinatorial effects in a given GO annotations or KEGG pathways are taken into account by the simultaneous analysis of multiple lncRNAs in user-selected individual or multiple datasets, which is realized by enrichment analysis. In addition, this software provides a graphical overview of pathways that are modulated by lncRNAs, as well as a specific tool to display the relevant networks between lncRNAs and their co-expressed protein-coding genes. Co-LncRNA also supports users in uploading their own lncRNA and protein-coding gene expression profiles to investigate the lncRNA combinatorial effects. It will be continuously updated with more human RNA-Seq datasets on an annual basis. Taken together, Co-LncRNA provides a web-based application for investigating lncRNA combinatorial effects, which could shed light on their biological roles and could be a valuable resource for this community. Database URL: http://www.bio-bigdata.com/Co-LncRNA/ PMID:26363020
Genome-Wide Discovery of Long Non-Coding RNAs in Rainbow Trout.

PubMed

Al-Tobasei, Rafet; Paneru, Bam; Salem, Mohamed

2016-01-01

The ENCODE project revealed that ~70% of the human genome is transcribed. While only 1-2% of the RNAs encode proteins, the rest are non-coding RNAs. Long non-coding RNAs (lncRNAs) form a diverse class of non-coding RNAs that are longer than 200 nt. Emerging evidence indicates that lncRNAs play critical roles in various cellular processes including regulation of gene expression. LncRNAs show low levels of gene expression and sequence conservation, which make their computational identification in genomes difficult. In this study, more than two billion Illumina sequence reads were mapped to the genome reference using the TopHat and Cufflinks software. Transcripts shorter than 200 nt, with an ORF longer than 83-100 amino acids, or with significant homologies to the NCBI nr-protein database were removed. In addition, a computational pipeline was used to filter the remaining transcripts based on a protein-coding-score test. Depending on the filtering stringency conditions, between 31,195 and 54,503 lncRNAs were identified, with only 421 matching known lncRNAs in other species. A digital gene expression atlas revealed 2,935 tissue-specific and 3,269 ubiquitously-expressed lncRNAs. This study annotates lncRNAs in the rainbow trout genome and provides a valuable resource for functional genomics research in salmonids.
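The length and ORF filters described in this abstract can be illustrated with a short sketch. This is not the authors' pipeline: it scans only the three forward-strand reading frames, uses a single assumed ORF cutoff of 100 amino acids (the upper end of the 83-100 range quoted above), and omits the homology search and the protein-coding-score test. All function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_aa(seq):
    """Longest ATG-to-stop ORF, in amino acids, over the 3 forward frames."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None  # index of the current in-frame ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, (i - start) // 3)  # codons before the stop
                start = None
    return best


def is_candidate_lncrna(seq, min_len=200, max_orf_aa=100):
    """Keep transcripts that are long enough and lack a long ORF.

    min_len follows the 200 nt definition in the abstract; max_orf_aa is an
    assumed cutoff chosen for illustration.
    """
    return len(seq) >= min_len and longest_orf_aa(seq) <= max_orf_aa


coding_like = "ATG" + "GCT" * 120 + "TAA"  # synthetic 121-codon ORF
print(is_candidate_lncrna(coding_like))    # long ORF, so it is filtered out
print(is_candidate_lncrna("A" * 250))      # long, no ORF, so it is kept
```

A real pipeline would also scan the reverse complement and apply the remaining filters before calling a transcript non-coding.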
Proteomic profiling of developing cotton fibers from wild and domesticated Gossypium barbadense.

PubMed

Hu, Guanjing; Koh, Jin; Yoo, Mi-Jeong; Grupp, Kara; Chen, Sixue; Wendel, Jonathan F

2013-10-01

Pima cotton (Gossypium barbadense) is widely cultivated because of its long, strong seed trichomes ('fibers') used for premium textiles. These agronomically advanced fibers were derived following domestication and thousands of years of human-mediated crop improvement. To gain an insight into fiber development and evolution, we conducted comparative proteomic and transcriptomic profiling of developing fiber from an elite cultivar and a wild accession. Analyses using isobaric tag for relative and absolute quantification (iTRAQ) LC-MS/MS technology identified 1317 proteins in fiber. Of these, 205 were differentially expressed across developmental stages, and 190 showed differential expression between wild and cultivated forms, 14.4% of the proteome sampled. Human selection may have shifted the timing of developmental modules, such that some occur earlier in domesticated than in wild cotton. A novel approach was used to detect possible biased expression of homoeologous copies of proteins. Results indicate a significant partitioning of duplicate gene expression at the protein level, but an approximately equal degree of bias for each of the two constituent genomes of allopolyploid cotton. Our results demonstrate the power of complementary transcriptomic and proteomic approaches for the study of the domestication process. They also provide a rich database for mining for functional analyses of cotton improvement or evolution. © 2013 The Authors. New Phytologist © 2013 New Phytologist Trust.
GEM2Net: from gene expression modeling to -omics networks, a new CATdb module to investigate Arabidopsis thaliana genes involved in stress response.

PubMed

Zaag, Rim; Tamby, Jean Philippe; Guichard, Cécile; Tariq, Zakia; Rigaill, Guillem; Delannoy, Etienne; Renou, Jean-Pierre; Balzergue, Sandrine; Mary-Huard, Tristan; Aubourg, Sébastien; Martin-Magniette, Marie-Laure; Brunaud, Véronique

2015-01-01

CATdb (http://urgv.evry.inra.fr/CATdb) is a database providing public access to a large collection of transcriptomic data, mainly for Arabidopsis but also for other plants. This resource has the rare advantage of containing several thousand microarray experiments obtained with the same technical protocol and analyzed by the same statistical pipelines. In this paper, we present GEM2Net, a new module of CATdb that takes advantage of this homogeneous dataset to mine co-expression units and decipher Arabidopsis gene functions. GEM2Net explores 387 stress conditions organized into 18 biotic and abiotic stress categories. For each one, a model-based clustering is applied on expression differences to identify clusters of co-expressed genes. To characterize functions associated with these clusters, various resources are analyzed and integrated: Gene Ontology, subcellular localization of proteins, Hormone Families, Transcription Factor Families and a refined stress-related gene list associated with publications. Exploiting protein-protein interactions and transcription factor-target interactions makes it possible to display gene networks. GEM2Net presents the analysis of the 18 stress categories, in which 17,264 genes are involved and organized within 681 co-expression clusters. The meta-data analyses were stored and organized to compose a dynamic Web resource. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Feature genes in metastatic breast cancer identified by MetaDE and SVM classifier methods.

PubMed

Tuo, Youlin; An, Ning; Zhang, Ming

2018-03-01

The aim of the present study was to investigate the feature genes in metastatic breast cancer samples. A total of 5 expression profiles of metastatic breast cancer samples were downloaded from the Gene Expression Omnibus database, which were then analyzed using the MetaQC and MetaDE packages in R language. The feature genes between metastasis and non‑metastasis samples were screened under the threshold of P<0.05. Based on the protein‑protein interactions (PPIs) in the Biological General Repository for Interaction Datasets, Human Protein Reference Database and Biomolecular Interaction Network Database, the PPI network of the feature genes was constructed. The feature genes identified by topological characteristics were then used for support vector machine (SVM) classifier training and verification. The accuracy of the SVM classifier was then evaluated using another independent dataset from The Cancer Genome Atlas database. Finally, function and pathway enrichment analyses for genes in the SVM classifier were performed. A total of 541 feature genes were identified between metastatic and non‑metastatic samples. The top 10 genes with the highest betweenness centrality values in the PPI network of feature genes were Nuclear RNA Export Factor 1, cyclin‑dependent kinase 2 (CDK2), myelocytomatosis proto‑oncogene protein (MYC), Cullin 5, SHC Adaptor Protein 1, Clathrin heavy chain, Nucleolin, WD repeat domain 1, proteasome 26S subunit non‑ATPase 2 and telomeric repeat binding factor 2. The cyclin‑dependent kinase inhibitor 1A (CDKN1A), E2F transcription factor 1 (E2F1), and MYC interacted with CDK2. The SVM classifier constructed by the top 30 feature genes was able to distinguish metastatic samples from non‑metastatic samples [correct rate, specificity, positive predictive value and negative predictive value >0.89; sensitivity >0.84; area under the receiver operating characteristic curve (AUROC) >0.96]. 
The verification of the SVM classifier in an independent dataset (35 metastatic samples and 143 non‑metastatic samples) revealed an accuracy of 94.38% and AUROC of 0.958. Cell cycle associated functions and pathways were the most significant terms of the 30 feature genes. An SVM classifier was constructed to assess the possibility of breast cancer metastasis, which presented high accuracy in several independent datasets. CDK2, CDKN1A, E2F1 and MYC were indicated as the potential feature genes in metastatic breast cancer.
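The evaluation measures quoted in this abstract (correct rate, sensitivity, specificity, positive and negative predictive value) all follow from a confusion matrix. The following is a minimal Python sketch under stated assumptions: the abstract reports only the class sizes (35 metastatic, 143 non-metastatic) and the overall accuracy, so the split of errors between the two classes used here is hypothetical, chosen only so that 168 of 178 samples are classified correctly (168/178 ≈ 94.38%). The function name `classification_metrics` is illustrative and not taken from the study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Derive standard classifier metrics from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,      # "correct rate" in the abstract
        "sensitivity": tp / (tp + fn),      # true positive rate
        "specificity": tn / (tn + fp),      # true negative rate
        "ppv": tp / (tp + fp),              # positive predictive value
        "npv": tn / (tn + fn),              # negative predictive value
    }


# Hypothetical error split: 35 metastatic samples (30 found, 5 missed)
# and 143 non-metastatic samples (138 correct, 5 false alarms).
metrics = classification_metrics(tp=30, fn=5, tn=138, fp=5)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With strongly imbalanced classes such as these, accuracy alone can hide poor sensitivity, which is one reason the abstract reports several measures side by side.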
Separomics applied to the proteomics and peptidomics of low-abundance proteins: Choice of methods and challenges - A review.

PubMed

Baracat-Pereira, Maria Cristina; de Oliveira Barbosa, Meire; Magalhães, Marcos Jorge; Carrijo, Lanna Clicia; Games, Patrícia Dias; Almeida, Hebréia Oliveira; Sena Netto, José Fabiano; Pereira, Matheus Rodrigues; de Barros, Everaldo Gonçalves

2012-06-01

The enrichment and isolation of proteins are considered limiting steps in proteomic studies. Identification of proteins whose expression is transient, those that are of low-abundance, and of natural peptides not described in databases, is still a great challenge. Plant extracts are in general complex, and contaminants interfere with the identification of proteins involved in important physiological processes, such as plant defense against pathogens. This review discusses the challenges and strategies of separomics applied to the identification of low-abundance proteins and peptides in plants, especially in plants challenged by pathogens. Separomics is described as a group of methodological strategies for the separation of protein molecules for proteomics. Several tools have been used to remove highly abundant proteins from samples and also non-protein contaminants. The use of chromatographic techniques, the partition of the proteome into subproteomes, and an effort to isolate proteins in their native form have allowed the isolation and identification of rare proteins involved in different processes.

Separomics applied to the proteomics and peptidomics of low-abundance proteins: Choice of methods and challenges – A review

PubMed Central

Baracat-Pereira, Maria Cristina; de Oliveira Barbosa, Meire; Magalhães, Marcos Jorge; Carrijo, Lanna Clicia; Games, Patrícia Dias; Almeida, Hebréia Oliveira; Sena Netto, José Fabiano; Pereira, Matheus Rodrigues; de Barros, Everaldo Gonçalves

2012-01-01

The enrichment and isolation of proteins are considered limiting steps in proteomic studies. Identification of proteins whose expression is transient, those that are of low-abundance, and of natural peptides not described in databases, is still a great challenge. Plant extracts are in general complex, and contaminants interfere with the identification of proteins involved in important physiological processes, such as plant defense against pathogens. This review discusses the challenges and strategies of separomics applied to the identification of low-abundance proteins and peptides in plants, especially in plants challenged by pathogens. Separomics is described as a group of methodological strategies for the separation of protein molecules for proteomics. Several tools have been used to remove highly abundant proteins from samples and also non-protein contaminants. The use of chromatographic techniques, the partition of the proteome into subproteomes, and an effort to isolate proteins in their native form have allowed the isolation and identification of rare proteins involved in different processes. PMID:22802713
The Nuclear Protein Database (NPD): sub-nuclear localisation and functional annotation of the nuclear proteome

PubMed Central

Dellaire, G.; Farrall, R.; Bickmore, W.A.

2003-01-01

The Nuclear Protein Database (NPD) is a curated database that contains information on more than 1300 vertebrate proteins that are thought, or are known, to localise to the cell nucleus. Each entry is annotated with information on predicted protein size and isoelectric point, as well as any repeats, motifs or domains within the protein sequence. In addition, information on the sub-nuclear localisation of each protein is provided and the biological and molecular functions are described using Gene Ontology (GO) terms. The database is searchable by keyword, protein name, sub-nuclear compartment and protein domain/motif. Links to other databases are provided (e.g. Entrez, SWISS-PROT, OMIM, PubMed, PubMed Central). Thus, NPD provides a gateway through which the nuclear proteome may be explored. The database can be accessed at http://npd.hgu.mrc.ac.uk and is updated monthly. PMID:12520015
Identification of Immunity-Related Genes in Dialeurodes citri against Entomopathogenic Fungus Lecanicillium attenuatum by RNA-Seq Analysis.

PubMed

Yu, Shijiang; Ding, Lili; Luo, Ren; Li, Xiaojiao; Yang, Juan; Liu, Haoqiang; Cong, Lin; Ran, Chun

2016-01-01

Dialeurodes citri is a major pest in citrus producing areas, and large-scale outbreaks have occurred increasingly often in recent years. Lecanicillium attenuatum is an important entomopathogenic fungus that can parasitize and kill D. citri. We separated the fungus from corpses of D. citri larvae. However, the sound immune defense system of pests makes infection by an entomopathogenic fungus difficult. Here we used RNA sequencing technology (RNA-Seq) to build a transcriptome database for D. citri and performed digital gene expression profiling to screen genes that act in the immune defense of D. citri larvae infected with a pathogenic fungus. De novo assembly generated 84,733 unigenes with mean length of 772 nt. All unigenes were searched against GO, Nr, Swiss-Prot, COG, and KEGG databases and a total of 28,190 (33.3%) unigenes were annotated. We identified 129 immunity-related unigenes in transcriptome database that were related to pattern recognition receptors, information transduction factors and response factors. From the digital gene expression profile, we identified 441 unigenes that were differentially expressed in D. citri infected with L. attenuatum. Through calculated Log2Ratio values, we identified genes for which fold changes in expression were obvious, including cuticle protein, vitellogenin, cathepsin, prophenoloxidase, clip-domain serine protease, lysozyme, and others. Subsequent quantitative real-time polymerase chain reaction analysis verified the results. The identified genes may serve as target genes for microbial control of D. citri.
Identification of Immunity-Related Genes in Dialeurodes citri against Entomopathogenic Fungus Lecanicillium attenuatum by RNA-Seq Analysis

PubMed Central

Yu, Shijiang; Ding, Lili; Luo, Ren; Li, Xiaojiao; Yang, Juan; Liu, Haoqiang; Cong, Lin; Ran, Chun

2016-01-01

Dialeurodes citri is a major pest in citrus producing areas, and large-scale outbreaks have occurred increasingly often in recent years. Lecanicillium attenuatum is an important entomopathogenic fungus that can parasitize and kill D. citri. We separated the fungus from corpses of D. citri larvae. However, the sound immune defense system of pests makes infection by an entomopathogenic fungus difficult. Here we used RNA sequencing technology (RNA-Seq) to build a transcriptome database for D. citri and performed digital gene expression profiling to screen genes that act in the immune defense of D. citri larvae infected with a pathogenic fungus. De novo assembly generated 84,733 unigenes with mean length of 772 nt. All unigenes were searched against GO, Nr, Swiss-Prot, COG, and KEGG databases and a total of 28,190 (33.3%) unigenes were annotated. We identified 129 immunity-related unigenes in transcriptome database that were related to pattern recognition receptors, information transduction factors and response factors. From the digital gene expression profile, we identified 441 unigenes that were differentially expressed in D. citri infected with L. attenuatum. Through calculated Log2Ratio values, we identified genes for which fold changes in expression were obvious, including cuticle protein, vitellogenin, cathepsin, prophenoloxidase, clip-domain serine protease, lysozyme, and others. Subsequent quantitative real-time polymerase chain reaction analysis verified the results. The identified genes may serve as target genes for microbial control of D. citri. PMID:27644092
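The Log2Ratio screening step mentioned in this abstract can be illustrated with a short Python sketch. Everything numeric here is an assumption: the abstract does not state the normalization, the pseudocount, or the cutoff, so the pseudocount of 1, the threshold of |log2 ratio| ≥ 1 (a two-fold change), and the expression values are hypothetical. Only the gene names are borrowed from the abstract.

```python
import math


def log2_ratio(treated, control, pseudocount=1.0):
    """Log2 fold change; the pseudocount (an assumption) avoids division by zero."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


def differentially_expressed(expression, threshold=1.0):
    """Keep genes whose absolute log2 ratio meets the threshold.

    expression maps gene name -> (control value, treated value).
    """
    hits = {}
    for gene, (control, treated) in expression.items():
        ratio = log2_ratio(treated, control)
        if abs(ratio) >= threshold:
            hits[gene] = ratio
    return hits


# Hypothetical normalized expression values: (uninfected, infected).
example = {
    "lysozyme": (7.0, 31.0),       # (31 + 1) / (7 + 1) = 4    -> log2 = 2
    "vitellogenin": (63.0, 15.0),  # (15 + 1) / (63 + 1) = 1/4 -> log2 = -2
    "actin": (99.0, 109.0),        # about 1.1-fold, below the cutoff
}

hits = differentially_expressed(example)
print(hits)
```

A real digital gene expression analysis would also attach a significance test and multiple-testing correction to each gene; the sketch covers only the fold-change filter.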
Database of cattle candidate genes and genetic markers for milk production and mastitis

PubMed Central

Ogorevc, J; Kunej, T; Razpet, A; Dovc, P

2009-01-01

A cattle database of candidate genes and genetic markers for milk production and mastitis has been developed to provide an integrated research tool incorporating different types of information supporting a genomic approach to study lactation, udder development and health. The database contains 943 genes and genetic markers involved in mammary gland development and function, representing candidates for further functional studies. The candidate loci were drawn on a genetic map to reveal positional overlaps. For identification of candidate loci, data from seven different research approaches were exploited: (i) gene knockouts or transgenes in mice that result in specific phenotypes associated with mammary gland (143 loci); (ii) cattle QTL for milk production (344) and mastitis related traits (71); (iii) loci with sequence variations that show specific allele-phenotype interactions associated with milk production (24) or mastitis (10) in cattle; (iv) genes with expression profiles associated with milk production (207) or mastitis (107) in cattle or mouse; (v) cattle milk protein genes that exist in different genetic variants (9); (vi) miRNAs expressed in bovine mammary gland (32) and (vii) epigenetically regulated cattle genes associated with mammary gland function (1). Forty-four genes found by multiple independent analyses were suggested as the most promising candidates and were further in silico analysed for expression levels in lactating mammary gland, genetic variability and top biological functions in functional networks. A miRNA target search for mammary gland expressed miRNAs identified 359 putative binding sites in 3′UTRs of candidate genes. PMID:19508288
UTRdb and UTRsite: a collection of sequences and regulatory motifs of the untranslated regions of eukaryotic mRNAs

PubMed Central

Mignone, Flavio; Grillo, Giorgio; Licciulli, Flavio; Iacono, Michele; Liuni, Sabino; Kersey, Paul J.; Duarte, Jorge; Saccone, Cecilia; Pesole, Graziano

2005-01-01

The 5′ and 3′ untranslated regions of eukaryotic mRNAs play crucial roles in the post-transcriptional regulation of gene expression through the modulation of nucleo-cytoplasmic mRNA transport, translation efficiency, subcellular localization and message stability. UTRdb is a curated database of 5′ and 3′ untranslated sequences of eukaryotic mRNAs, derived from several sources of primary data. Experimentally validated functional motifs are annotated (and also collated as the UTRsite database) and cross-links to genomic and protein data are provided. The integration of UTRdb with genomic and protein data has allowed the implementation of a powerful retrieval resource for the selection and extraction of UTR subsets based on their genomic coordinates and/or features of the protein encoded by the relevant mRNA (e.g. GO term, PFAM domain, etc.). All internet resources implemented for retrieval and functional analysis of 5′ and 3′ untranslated regions of eukaryotic mRNAs are accessible at http://www.ba.itb.cnr.it/UTR/. PMID:15608165
Bioinformatics analysis of transcriptome dynamics during growth in angus cattle longissimus muscle.

PubMed

Moisá, Sonia J; Shike, Daniel W; Graugnard, Daniel E; Rodriguez-Zas, Sandra L; Everts, Robin E; Lewin, Harris A; Faulkner, Dan B; Berger, Larry L; Loor, Juan J

2013-01-01

Transcriptome dynamics in the longissimus muscle (LM) of young Angus cattle were evaluated at 0, 60, 120, and 220 days from early-weaning. Bioinformatic analysis was performed using the dynamic impact approach (DIA) by means of Kyoto Encyclopedia of Genes and Genomes (KEGG) and Database for Annotation, Visualization and Integrated Discovery (DAVID) databases. Between 0 and 120 days (growing phase) most of the highly-impacted pathways (eg, ascorbate and aldarate metabolism, drug metabolism, cytochrome P450 and Retinol metabolism) were inhibited. The phase between 120 and 220 days (finishing phase) was characterized by the most striking differences with 3,784 differentially expressed genes (DEGs). Analysis of those DEGs revealed that the most impacted KEGG canonical pathway was glycosylphosphatidylinositol (GPI)-anchor biosynthesis, which was inhibited. Furthermore, inhibition of calpastatin and activation of tyrosine aminotransferase ubiquitination at 220 days promotes proteasomal degradation, while the concurrent activation of ribosomal proteins promotes protein synthesis. Therefore, the balance of these processes likely results in a steady-state of protein turnover during the finishing phase. Results underscore the importance of transcriptome dynamics in LM during growth.
Genome-Wide Identification and Expression Profiling Analysis of the Xyloglucan Endotransglucosylase/Hydrolase Gene Family in Tobacco (Nicotiana tabacum L.).

PubMed

Wang, Meng; Xu, Zongchang; Ding, Anming; Kong, Yingzhen

2018-05-24

Xyloglucan endotransglucosylase/hydrolase genes (XTHs) encode enzymes required for the reconstruction and modification of xyloglucan backbones, which results in changes in cell wall extensibility during growth. A total of 56 NtXTH genes were identified from common tobacco, and 50 cDNA fragments were verified by PCR amplification. The 56 NtXTH genes could be classified into two subfamilies: Group I/II and Group III according to their phylogenetic relationships. The gene structure, chromosomal localization, conserved protein domains prediction, sub-cellular localization of NtXTH proteins and evolutionary relationships among Nicotiana tabacum, Nicotiana sylvestris, Nicotiana tomentosiformis, Arabidopsis, and rice were also analyzed. The NtXTHs expression profiles analyzed by the TobEA database and qRT-PCR revealed that NtXTHs display different expression patterns in different tissues. Notably, the expression patterns of 12 NtXTHs responding to environment stresses, including salinity, alkali, heat, chilling, and plant hormones, including IAA and brassinolide, were characterized. These results should be useful for functional studies of NtXTHs during different growth cycles and stresses.
Vascular Gene Expression in Nonneoplastic and Malignant Brain

PubMed Central

Madden, Stephen L.; Cook, Brian P.; Nacht, Mariana; Weber, William D.; Callahan, Michelle R.; Jiang, Yide; Dufault, Michael R.; Zhang, Xiaoming; Zhang, Wen; Walter-Yohrling, Jennifer; Rouleau, Cecile; Akmaev, Viatcheslav R.; Wang, Clarence J.; Cao, Xiaohong; St. Martin, Thia B.; Roberts, Bruce L.; Teicher, Beverly A.; Klinger, Katherine W.; Stan, Radu-Virgil; Lucey, Brenden; Carson-Walter, Eleanor B.; Laterra, John; Walter, Kevin A.

2004-01-01

Malignant gliomas are uniformly lethal tumors whose morbidity is mediated in large part by the angiogenic response of the brain to the invading tumor. This profound angiogenic response leads to aggressive tumor invasion and destruction of surrounding brain tissue as well as blood-brain barrier breakdown and life-threatening cerebral edema. To investigate the molecular mechanisms governing the proliferation of abnormal microvasculature in malignant brain tumor patients, we have undertaken a cell-specific transcriptome analysis from surgically harvested nonneoplastic and tumor-associated endothelial cells. SAGE-derived endothelial cell gene expression patterns from glioma and nonneoplastic brain tissue reveal distinct gene expression patterns and consistent up-regulation of certain glioma endothelial marker genes across patient samples. We define the G-protein-coupled receptor RDC1 as a tumor endothelial marker whose expression is distinctly induced in tumor endothelial cells of both brain and peripheral vasculature. Further, we demonstrate that the glioma-induced gene, PV1, shows expression both restricted to endothelial cells and coincident with endothelial cell tube formation. As PV1 provides a framework for endothelial cell caveolar diaphragms, this protein may serve to enhance glioma-induced disruption of the blood-brain barrier and transendothelial exchange. Additional characterization of this extensive brain endothelial cell gene expression database will provide unique molecular insights into vascular gene expression. PMID:15277233
Identification of differentially expressed proteins during human urinary bladder cancer progression.

PubMed

Memon, Ashfaque A; Chang, Jong W; Oh, Bong R; Yoo, Yung J

2005-01-01

Comparative proteome analysis was performed between RT4 (grade-1) and T24 (grade-3) bladder cancer cell lines, in an attempt to identify differentially expressed proteins during bladder cancer progression. Among the relatively abundant proteins, seven spots changed reproducibly by more than two-fold and were identified by peptide mass fingerprinting using mass spectrometry and database search. We found the most extensive and reproducible down-regulation for cytoplasmic NADP-dependent isocitrate dehydrogenase (IDPc) and peroxiredoxin-II (Prx-II) in the poorly differentiated T24 cell line compared to the well-differentiated RT4 bladder cancer cell line. Subsequent Western blotting analysis of human biopsy samples from bladder cancer patients revealed significant loss of IDPc and Prx-II in more advanced tumor samples, in agreement with data on cell lines. These results suggest that loss of IDPc and Prx-II during tumor development may be involved in tumor progression and metastasis. However, additional investigations on a large number of human samples are needed to further verify these findings.
TissueWikiMobile: an Integrative Protein Expression Image Browser for Pathological Knowledge Sharing and Annotation on a Mobile Device

PubMed Central

Cheng, Chihwen; Stokes, Todd H.; Hang, Sovandy; Wang, May D.

2016-01-01

Doctors need fast and convenient access to medical data. This motivates the use of mobile devices for knowledge retrieval and sharing. We have developed TissueWikiMobile on the Apple iPhone and iPad to seamlessly access TissueWiki, an enormous repository of medical histology images. TissueWiki is a three terabyte database of antibody information and histology images from the Human Protein Atlas (HPA). Using TissueWikiMobile, users are capable of extracting knowledge from protein expression, adding annotations to highlight regions of interest on images, and sharing their professional insight. By providing an intuitive human computer interface, users can efficiently operate TissueWikiMobile to access important biomedical data without losing mobility. TissueWikiMobile furnishes the health community with a ubiquitous way to collaborate and share their expert opinions not only on the performance of various antibody stains but also on histology image annotation. PMID:27532057
A Novel Gibberellin-Induced Gene from Rice and Its Potential Regulatory Role in Stem Growth

PubMed Central

van der Knaap, Esther; Kim, Jeong Hoe; Kende, Hans

2000-01-01

Os-GRF1 (Oryza sativa-GROWTH-REGULATING FACTOR1) was identified in a search for genes that are differentially expressed in the intercalary meristem of deepwater rice (Oryza sativa L.) internodes in response to gibberellin (GA). Os-GRF1 displays general features of transcription factors, contains a functional nuclear localization signal, and has three regions with similarities to sequences in the database. One of these regions is similar to a protein interaction domain of SWI2/SNF2, which is a subunit of a chromatin-remodeling complex in yeast. The two other domains are novel and found only in plant proteins of unknown function. To study its role in plant growth, Os-GRF1 was expressed in Arabidopsis. Stem elongation of transformed plants was severely inhibited, and normal growth could not be recovered by the application of GA. Our results indicate that Os-GRF1 belongs to a novel class of plant proteins and may play a regulatory role in GA-induced stem elongation. PMID:10712532
System for face recognition under expression variations of neutral-sampled individuals using recognized expression warping and a virtual expression-face database

NASA Astrophysics Data System (ADS)

Petpairote, Chayanut; Madarasmi, Suthep; Chamnongthai, Kosin

2018-01-01

The practical identification of individuals using facial recognition techniques requires the matching of faces with specific expressions to faces from a neutral face database. A method for facial recognition under varied expressions against neutral face samples of individuals via recognition of expression warping and the use of a virtual expression-face database is proposed. In this method, facial expressions are recognized and the input expression faces are classified into facial expression groups. To aid facial recognition, the virtual expression-face database is sorted into average facial-expression shapes and by coarse- and fine-featured facial textures. Wrinkle information is also employed in classification by using a process of masking to adjust input faces to match the expression-face database. We evaluate the performance of the proposed method using the CMU multi-PIE, Cohn-Kanade, and AR expression-face databases, and we find that it provides significantly improved results in terms of face recognition accuracy compared to conventional methods and is acceptable for facial recognition under expression variation.
GermOnline 4.0 is a genomics gateway for germline development, meiosis and the mitotic cell cycle

PubMed Central

Lardenois, Aurélie; Gattiker, Alexandre; Collin, Olivier; Chalmel, Frédéric; Primig, Michael

2010-01-01

GermOnline 4.0 is a cross-species database portal focusing on high-throughput expression data relevant for germline development, the meiotic cell cycle and mitosis in healthy versus malignant cells. It is thus a source of information for life scientists as well as clinicians who are interested in gene expression and regulatory networks. The GermOnline gateway provides unlimited access to information produced with high-density oligonucleotide microarrays (3′-UTR GeneChips), genome-wide protein–DNA binding assays and protein–protein interaction studies in the context of Ensembl genome annotation. Samples used to produce high-throughput expression data and to carry out genome-wide in vivo DNA binding assays are annotated via the MIAME-compliant Multiomics Information Management and Annotation System (MIMAS 3.0). Furthermore, the Saccharomyces Genomics Viewer (SGV) was developed and integrated into the gateway. SGV is a visualization tool that outputs genome annotation and DNA-strand specific expression data produced with high-density oligonucleotide tiling microarrays (Sc_tlg GeneChips) which cover the complete budding yeast genome on both DNA strands. It facilitates the interpretation of expression levels and transcript structures determined for various cell types cultured under different growth and differentiation conditions. Database URL: www.germonline.org/ PMID:21149299
Describing the Diapause-Preparatory Proteome of the Beetle Colaphellus bowringi and Identifying Candidates Affecting Lipid Accumulation Using Isobaric Tags for Mass Spectrometry-Based Proteome Quantification (iTRAQ)

PubMed Central

Tan, Qian-Qian; Liu, Wen; Zhu, Fen; Lei, Chao-Liang; Hahn, Daniel A.; Wang, Xiao-Ping

2017-01-01

Prior to entering diapause, insects must prepare themselves physiologically to withstand the stresses of arresting their development for a lengthy period. While studies describing the biochemical and cellular milieu of the maintenance phase of diapause are accumulating, few studies have taken an “omics” approach to describing molecular events during the diapause preparatory phase. We used isobaric tags and mass spectrometry (iTRAQ) to quantitatively compare the expression profiles of proteins identified during the onset of diapause preparation phase in the heads of adult female cabbage beetles, Colaphellus bowringi. A total of 3,175 proteins were identified, 297 of which were differentially expressed between diapause-destined and non-diapause-destined female adults and could therefore be involved in diapause preparation in this species. Comparison of identified proteins with protein function databases shows that many of these differentially expressed proteins enhanced in diapause destined beetles are involved in energy production and conversion, carbohydrate metabolism and transport, and lipid metabolism. Further hand annotation of differentially abundant peptides nominates several associated with stress hardiness, including HSPs and antioxidants, as well as neural development. In contrast, non-diapause destined beetles show substantial increases in cuticle proteins, suggesting additional post-emergence growth. Using RNA interference to silence a fatty acid-binding protein (FABP) that was highly abundant in the head of diapause-destined females prevented the accumulation of lipids in the fat body, a common product of diapause preparation in this species and others. Surprisingly, RNAi against the FABP also affected the transcript abundance of several heat shock proteins. 
These results suggest that the identified differentially expressed proteins that play vital roles in lipid metabolism may also contribute somehow to enhanced hardiness to environmental stress that is characteristic of diapause. PMID:28491041
MicroRNA networks in mouse lung organogenesis.

PubMed

Dong, Jie; Jiang, Guoqian; Asmann, Yan W; Tomaszek, Sandra; Jen, Jin; Kislinger, Thomas; Wigle, Dennis A

2010-05-26

MicroRNAs (miRNAs) are known to be important regulators of both organ development and tumorigenesis. MiRNA networks and their regulation of messenger RNA (mRNA) translation and protein expression in specific biological processes are poorly understood. We explored the dynamic regulation of miRNAs in mouse lung organogenesis. Comprehensive miRNA and mRNA profiling was performed encompassing all recognized stages of lung development beginning at embryonic day 12 and continuing to adulthood. We analyzed the expression patterns of dynamically regulated miRNAs and mRNAs using a number of statistical and computational approaches, and in an integrated manner with protein levels from an existing mass-spectrometry derived protein database for lung development. In total, 117 statistically significant miRNAs were dynamically regulated during mouse lung organogenesis and clustered into distinct temporal expression patterns. 11,220 mRNA probes were also shown to be dynamically regulated and clustered into distinct temporal expression patterns, with 3 major patterns accounting for 75% of all probes. 3,067 direct miRNA-mRNA correlation pairs were identified involving 37 miRNAs. Two defined correlation patterns were observed upon integration with protein data: 1) increased levels of specific miRNAs directly correlating with downregulation of predicted mRNA targets; and 2) increased levels of specific miRNAs directly correlating with downregulation of translated target proteins without detectable changes in mRNA levels. Of 1345 proteins analyzed, 55% appeared to be regulated in this manner with a direct correlation between miRNA and protein level, but without detectable change in mRNA levels. Systematic analysis of microRNA, mRNA, and protein levels over the time course of lung organogenesis demonstrates dynamic regulation and reveals 2 distinct patterns of miRNA-mRNA interaction. 
The translation of target proteins affected by miRNAs independent of changes in mRNA level appears to be a prominent mechanism of developmental regulation in lung organogenesis.
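The first correlation pattern described in this abstract, rising miRNA levels paired with falling levels of predicted mRNA targets, corresponds to a negative correlation across the developmental time course. The Python sketch below computes a Pearson coefficient for one such pair. The abstract does not name the statistic the authors used, so Pearson correlation is an assumption, and the two time series are hypothetical values, not data from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variance_x = sum((a - mean_x) ** 2 for a in x)
    variance_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(variance_x * variance_y)


# Hypothetical expression levels at five developmental time points.
mirna_levels = [1.0, 2.0, 3.0, 4.0, 5.0]   # miRNA rises over time
mrna_levels = [10.0, 8.0, 6.0, 4.0, 2.0]   # predicted target falls over time

r = pearson(mirna_levels, mrna_levels)
print(f"r = {r:.3f}")
```

A coefficient near -1 would flag the pair as a candidate for the first pattern. The second pattern would instead show the anticorrelation against the protein series while the mRNA series stays flat.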
Identification of three protein disulfide isomerase members from Haemaphysalis longicornis tick.

PubMed

Liao, Min; Hatta, Takeshi; Umemiya, Rika; Huang, Penglong; Jia, Honglin; Gong, Haiyan; Zhou, Jinlin; Nishikawa, Yoshifumi; Xuan, Xuenan; Fujisaki, Kozo

2007-07-01

Three genes encoding putative protein disulfide isomerase (PDI) were isolated from the Haemaphysalis longicornis EST database and designated as HlPDI-1, HlPDI-2, and HlPDI-3. All three PDI genes contain two typical PDI active sites CXXC and encode putative 435, 499, and 488 amino acids, respectively. The recombinant proteins expressed in Escherichia coli all show PDI activities, and the activities were inhibited by a PDI-specific inhibitor, zinc bacitracin. Western blot analysis and real-time PCR revealed that three HlPDIs were present in all the developmental stages of the tick as well as in the midgut, salivary glands, ovary, hemolymph, and fatbody of adult female ticks, but the three genes were expressed at the highest level in the egg stage. HlPDI-1 is expressed primarily in the ovary and secondarily in the salivary glands. HlPDI-2 and HlPDI-3 are expressed primarily in the salivary gland, suggesting that the PDI genes are important for tick biology, especially for egg development, and that they play distinct roles in different tissues. Blood feeding induced significantly increased expression of HlPDI-1 and HlPDI-3 in both partially fed nymphs and adults. Babesia gibsoni-infected larval ticks expressed HlPDI-1 and HlPDI-3 2.0 and 4.0 times higher than uninfected normal larval ticks, respectively. The results indicate that HlPDI-1 and HlPDI-3 might be involved in tick blood feeding and Babesia parasite infection in ticks.
Gene expression patterns combined with network analysis identify hub genes associated with bladder cancer.

PubMed

Bi, Dongbin; Ning, Hao; Liu, Shuai; Que, Xinxiang; Ding, Kejia

2015-06-01

To explore molecular mechanisms of bladder cancer (BC), network strategy was used to find biomarkers for early detection and diagnosis. The differentially expressed genes (DEGs) between bladder carcinoma patients and normal subjects were screened using empirical Bayes method of the linear models for microarray data package. Co-expression networks were constructed by differentially co-expressed genes and links. Regulatory impact factors (RIF) metric was used to identify critical transcription factors (TFs). The protein-protein interaction (PPI) networks were constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) and clusters were obtained through molecular complex detection (MCODE) algorithm. Centralities analyses for complex networks were performed based on degree, stress and betweenness. Enrichment analyses were performed based on Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Co-expression networks and TFs (based on expression data of global DEGs and DEGs in different stages and grades) were identified. Hub genes of complex networks, such as UBE2C, ACTA2, FABP4, CKS2, FN1 and TOP2A, were also obtained according to analysis of degree. In gene enrichment analyses of global DEGs, cell adhesion, proteinaceous extracellular matrix and extracellular matrix structural constituent were top three GO terms. ECM-receptor interaction, focal adhesion, and cell cycle were significant pathways. Our results provide some potential underlying biomarkers of BC. However, further validation is required and deep studies are needed to elucidate the pathogenesis of BC.
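Ranking hub genes "according to analysis of degree" amounts to counting, for each gene, how many interaction partners it has in the network. The Python sketch below shows that counting step on a toy edge list. The gene symbols are taken from the abstract, but the edges themselves are hypothetical and are not interactions reported by the study or retrieved from STRING. The sketch also assumes each undirected pair is listed only once.

```python
from collections import Counter


def degree_ranking(edges):
    """Return (gene, degree) pairs sorted from highest to lowest degree.

    edges is an iterable of undirected (gene_a, gene_b) pairs, each pair
    assumed to appear once; self-loops are ignored.
    """
    degree = Counter()
    for gene_a, gene_b in edges:
        if gene_a != gene_b:
            degree[gene_a] += 1
            degree[gene_b] += 1
    return degree.most_common()


# Hypothetical interaction edges, for illustration only.
edges = [
    ("TOP2A", "UBE2C"),
    ("TOP2A", "CKS2"),
    ("TOP2A", "FN1"),
    ("UBE2C", "CKS2"),
    ("FN1", "ACTA2"),
]

ranking = degree_ranking(edges)
for gene, degree in ranking:
    print(gene, degree)
```

Degree is the simplest of the three centralities named in the abstract. Stress and betweenness both require shortest-path enumeration and are usually computed with a graph library rather than by hand.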
Analysis of Cytoskeletal and Motility Proteins in the Sea Urchin Genome Assembly

PubMed Central

Morris, RL; Hoffman, MP; Obar, RA; McCafferty, SS; Gibbons, IR; Leone, AD; Cool, J; Allgood, EL; Musante, AM; Judkins, KM; Rossetti, BJ; Rawson, AP; Burgess, DR

2007-01-01

The sea urchin embryo is a classical model system for studying the role of the cytoskeleton in such events as fertilization, mitosis, cleavage, cell migration and gastrulation. We have conducted an analysis of gene models derived from the Strongylocentrotus purpuratus genome assembly and have gathered strong evidence for the existence of multiple gene families encoding cytoskeletal proteins and their regulators in sea urchin. While many cytoskeletal genes have been cloned from sea urchin with sequences already existing in public databases, genome analysis reveals a significantly higher degree of diversity within certain gene families. Furthermore, genes are described corresponding to homologs of cytoskeletal proteins not previously documented in sea urchins. To illustrate the varying degree of sequence diversity that exists within cytoskeletal gene families, we conducted an analysis of genes encoding actins, specific actin-binding proteins, myosins, tubulins, kinesins, dyneins, specific microtubule-associated proteins, and intermediate filaments. We conducted ontological analysis of select genes to better understand the relatedness of urchin cytoskeletal genes to those of other deuterostomes. We analyzed developmental expression (EST) data to confirm the existence of select gene models and to understand their differential expression during various stages of early development. PMID:17027957
Functional discovery via a compendium of expression profiles.

PubMed

Hughes, T R; Marton, M J; Jones, A R; Roberts, C J; Stoughton, R; Armour, C D; Bennett, H A; Coffey, E; Dai, H; He, Y D; Kidd, M J; King, A M; Meyer, M R; Slade, D; Lum, P Y; Stepaniants, S B; Shoemaker, D D; Gachotte, D; Chakraburtty, K; Simon, J; Bard, M; Friend, S H

2000-07-07

Ascertaining the impact of uncharacterized perturbations on the cell is a fundamental problem in biology. Here, we describe how a single assay can be used to monitor hundreds of different cellular functions simultaneously. We constructed a reference database or "compendium" of expression profiles corresponding to 300 diverse mutations and chemical treatments in S. cerevisiae, and we show that the cellular pathways affected can be determined by pattern matching, even among very subtle profiles. The utility of this approach is validated by examining profiles caused by deletions of uncharacterized genes: we identify and experimentally confirm that eight uncharacterized open reading frames encode proteins required for sterol metabolism, cell wall function, mitochondrial respiration, or protein synthesis. We also show that the compendium can be used to characterize pharmacological perturbations by identifying a novel target of the commonly used drug dyclonine.

Separation and identification of Musa acuminate Colla (banana) leaf proteins by two-dimensional gel electrophoresis and mass spectrometry.

PubMed

Lu, Y; Qi, Y X; Zhang, H; Zhang, H Q; Pu, J J; Xie, Y X

2013-12-19

To establish a proteomic reference map of Musa acuminate Colla (banana) leaf, we separated and identified leaf proteins using two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) and mass spectrometry (MS). Tryptic digests of 44 spots were subjected to peptide mass fingerprinting (PMF) by matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) MS. Three spots that were not identified by MALDI-TOF MS analysis were identified by searching against the NCBInr, SwissProt, and expressed sequence tag (EST) databases. We identified 41 unique proteins. The majority of the identified leaf proteins were found to be involved in energy metabolism. The results indicate that 2D-PAGE is a sensitive and powerful technique for the separation and identification of Musa leaf proteins. A summary of the identified proteins and their putative functions is discussed.
A Reference Proteomic Database of Lactobacillus plantarum CMCC-P0002

PubMed Central

Tian, Wanhong; Yu, Gang; Liu, Xiankai; Wang, Jie; Feng, Erling; Zhang, Xuemin; Chen, Bei; Zeng, Ming; Wang, Hengliang

2011-01-01

Lactobacillus plantarum is a widespread probiotic bacterium found in many fermented food products. In this study, the whole-cell proteins and secretory proteins of L. plantarum were separated by two-dimensional electrophoresis. A total of 434 proteins were identified by tandem mass spectrometry, including a plasmid-encoded hypothetical protein, pLP9000_05. The 20 most abundant proteins are listed to support further genetic manipulation of L. plantarum, such as construction of a high-level expression system. Furthermore, the first interaction map of L. plantarum was established using the Blue-Native/SDS-PAGE technique. A heterodimeric complex composed of maltose phosphorylase Map3 and Map2, and two homodimeric complexes composed of Map3 and Map2 respectively, were identified at the same time, indicating the important roles of these proteins. These findings provide valuable information for further proteomic research on L. plantarum. PMID:21998671
A reference proteomic database of Lactobacillus plantarum CMCC-P0002.

PubMed

Zhu, Li; Hu, Wei; Liu, Datao; Tian, Wanhong; Yu, Gang; Liu, Xiankai; Wang, Jie; Feng, Erling; Zhang, Xuemin; Chen, Bei; Zeng, Ming; Wang, Hengliang

2011-01-01

Lactobacillus plantarum is a widespread probiotic bacterium found in many fermented food products. In this study, the whole-cell proteins and secretory proteins of L. plantarum were separated by two-dimensional electrophoresis. A total of 434 proteins were identified by tandem mass spectrometry, including a plasmid-encoded hypothetical protein, pLP9000_05. The 20 most abundant proteins are listed to support further genetic manipulation of L. plantarum, such as construction of a high-level expression system. Furthermore, the first interaction map of L. plantarum was established using the Blue-Native/SDS-PAGE technique. A heterodimeric complex composed of maltose phosphorylase Map3 and Map2, and two homodimeric complexes composed of Map3 and Map2 respectively, were identified at the same time, indicating the important roles of these proteins. These findings provide valuable information for further proteomic research on L. plantarum.
Decreased expression of serine protease inhibitor family G1 (SERPING1) in prostate cancer can help distinguish high-risk prostate cancer and predicts malignant progression.

PubMed

Peng, Shengmeng; Du, Tao; Wu, Wanhua; Chen, Xianju; Lai, Yiming; Zhu, Dingjun; Wang, Qiong; Ma, Xiaoming; Lin, Chunhao; Li, Zean; Guo, Zhenghui; Huang, Hai

2018-06-11

The aim of this study was to investigate the association of serine proteinase inhibitor family G1 (SERPING1) down-regulation with poor prognosis in patients with prostate cancer (PCa). Furthermore, we aimed to find more novel and effective PCa molecular markers to provide early screening of PCa, distinguish patients with aggressive PCa, predict the prognosis, or reduce the economic burden of PCa. SERPING1 protein expression in both human PCa and normal prostate tissues was detected by immunohistochemical staining, the intensity of which was analyzed in association with clinical pathological parameters such as Gleason score, pathological grade, clinical stage, tumor stage, lymph node metastasis, and distant metastasis. Moreover, we used The Cancer Genome Atlas (TCGA) Database, Taylor Database, and Oncomine dataset to validate our immunohistochemical results and investigated the value of SERPING1 in PCa at the mRNA level. Kaplan-Meier analysis and Cox regression analysis were performed to evaluate the relationship between SERPING1 and prognosis of patients with PCa. The results showed that SERPING1 was expressed mainly in the cytoplasm of glandular cells of prostate tissue and was expressed at significantly lower levels in PCa (P<0.001). Furthermore, in the tissue microarray of our samples, decreased expression of SERPING1 correlated with a higher Gleason score (P = 0.004), a higher pathological grade (P = 0.01) and an advanced tumor stage (P = 0.005) at the protein level. In the TCGA dataset and Taylor Dataset, low SERPING1 expression correlated with younger patient age (P = 0.02 in TCGA, P = 0.044 in Taylor) and a higher Gleason score (P = 0.019 in TCGA, P<0.001 in Taylor) at the mRNA level. Kaplan-Meier analysis revealed that lower SERPING1 mRNA levels predicted lower overall survival (P = 0.027 in TCGA), lower disease-free survival (P = 0.029) and lower biochemical recurrence-free survival (P = 0.011 in Taylor).
Data from the Oncomine database showed that low SERPING1 expression implies higher malignancy of prostate lesions. Using multivariate analysis, we also found that SERPING1 expression was an independent prognostic marker of poor disease-free survival and biochemical recurrence-free survival. SERPING1 may play an important role in PCa and can serve as a novel marker for diagnosis and prognostic prediction in PCa. In addition, levels of SERPING1 can help identify low-risk prostate cancer, providing a reference for patients with PCa to accept active surveillance and reduce overtreatment.
Looking for Cancer Clues in Publicly Accessible Databases

PubMed Central

Lemkin, Peter F.; Smythers, Gary W.; Munroe, David J.

2004-01-01

What started out as a mere attempt to tentatively identify proteins in experimental cancer-related 2D-PAGE maps developed into VIRTUAL2D, a web-accessible repository for theoretical pI/MW charts for 92 organisms. Using publicly available expression data, we developed a collection of tissue-specific plots based on differential gene expression between normal and diseased states. We use this comparative cancer proteomics knowledge base, known as the tissue molecular anatomy project (TMAP), to uncover threads of cancer markers common to several types of cancer and to relate this information to established biological pathways. PMID:18629065
Looking for cancer clues in publicly accessible databases.

PubMed

Medjahed, Djamel; Lemkin, Peter F; Smythers, Gary W; Munroe, David J

2004-01-01

What started out as a mere attempt to tentatively identify proteins in experimental cancer-related 2D-PAGE maps developed into VIRTUAL2D, a web-accessible repository for theoretical pI/MW charts for 92 organisms. Using publicly available expression data, we developed a collection of tissue-specific plots based on differential gene expression between normal and diseased states. We use this comparative cancer proteomics knowledge base, known as the tissue molecular anatomy project (TMAP), to uncover threads of cancer markers common to several types of cancer and to relate this information to established biological pathways.
Proteome Analysis of the Plasma Membrane of Mycobacterium Tuberculosis

PubMed Central

Arora, Shalini; Kosalai, K.; Namane, Abdelkader; Pym, Alex S.; Cole, Stewart T.

2002-01-01

The plasma membrane of Mycobacterium tuberculosis is likely to contain proteins that could serve as novel drug targets, diagnostic probes or even components of a vaccine against tuberculosis. With this in mind, we have undertaken proteome analysis of the membrane of M. tuberculosis H37Rv. Isolated membrane vesicles were extracted with either a detergent (Triton X114) or an alkaline buffer (carbonate) following two of the protocols recommended for membrane protein enrichment. Proteins were resolved by 2D-GE using immobilized pH gradient (IPG) strips, and identified by peptide mass mapping utilizing the M. tuberculosis genome database. The two extraction procedures yielded patterns with minimal overlap. Only two proteins, both HSPs, showed a common presence. MALDI–MS analysis of 61 spots led to the identification of 32 proteins, 17 of which were new to the M. tuberculosis proteome database. We classified 19 of the identified proteins as ‘membrane-associated’; 14 of these were further classified as ‘membrane-bound’, three of which were lipoproteins. The remaining proteins included four heat-shock proteins and several enzymes involved in energy or lipid metabolism. Extraction with Triton X114 was found to be more effective than carbonate for detecting ‘putative’ M. tuberculosis membrane proteins. The protocol was also found to be suitable for comparing BCG and M. tuberculosis membranes, identifying ESAT-6 as being expressed selectively in M. tuberculosis. While this study demonstrates for the first time some of the membrane proteins of M. tuberculosis, it also underscores the problems associated with proteomic analysis of a complex membrane such as that of a mycobacterium. PMID:18629250
IntegromeDB: an integrated system and biological search engine

PubMed Central

2012-01-01

Background: With the growth of biological data in volume and heterogeneity, web search engines have become key tools for researchers. However, general-purpose search engines are not specialized for the search of biological data. Description: Here, we present an approach to developing a biological web search engine based on Semantic Web technologies and demonstrate its implementation for retrieving gene- and protein-centered knowledge. The engine is available at http://www.integromedb.org. Conclusions: The IntegromeDB search engine allows scanning data on gene regulation, gene expression, protein-protein interactions, pathways, metagenomics, mutations, diseases, and other gene- and protein-related data that are automatically retrieved from publicly available databases and web pages using biological ontologies. To perfect the resource design and usability, we welcome and encourage community feedback. PMID:22260095
The National NeuroAIDS Tissue Consortium (NNTC) Database: an integrated database for HIV-related studies

PubMed Central

Cserhati, Matyas F.; Pandey, Sanjit; Beaudoin, James J.; Baccaglini, Lorena; Guda, Chittibabu; Fox, Howard S.

2015-01-01

We herein present the National NeuroAIDS Tissue Consortium-Data Coordinating Center (NNTC-DCC) database, which is the only available database for neuroAIDS studies that contains data in an integrated, standardized form. This database has been created in conjunction with the NNTC, which provides human tissue and biofluid samples to individual researchers to conduct studies focused on neuroAIDS. The database contains experimental datasets from 1206 subjects for the following categories (which are further broken down into subcategories): gene expression, genotype, proteins, endo-exo-chemicals, morphometrics and other (miscellaneous) data. The database also contains a wide variety of downloadable data and metadata for 95 HIV-related studies covering 170 assays from 61 principal investigators. The data represent 76 tissue types, 25 measurement types, and 38 technology types, and reaches a total of 33 017 407 data points. We used the ISA platform to create the database and develop a searchable web interface for querying the data. A gene search tool is also available, which searches for NCBI GEO datasets associated with selected genes. The database is manually curated with many user-friendly features, and is cross-linked to the NCBI, HUGO and PubMed databases. A free registration is required for qualified users to access the database. Database URL: http://nntc-dcc.unmc.edu PMID:26228431
An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system.

PubMed

AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

2015-11-19

Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. 
This database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
Characterization of a sterol carrier protein 2/3-oxoacyl-CoA thiolase from the cotton leafworm (Spodoptera littoralis): a lepidopteran mechanism closer to that in mammals than that in dipterans

PubMed Central

2004-01-01

Numerous invertebrate species belonging to several phyla cannot synthesize sterols de novo and rely on a dietary source of the compound. SCPx (sterol carrier protein 2/3-oxoacyl-CoA thiolase) is a protein involved in the trafficking of sterols and oxidation of branched-chain fatty acids. We have isolated SCPx protein from Spodoptera littoralis (cotton leafworm) and have subjected it to limited amino acid sequencing. A reverse-transcriptase PCR-based approach has been used to clone the cDNA (1.9 kb), which encodes a 57 kDa protein. Northern blotting detected two mRNA transcripts, one of 1.9 kb, encoding SCPx, and one of 0.95 kb, presumably encoding SCP2 (sterol carrier protein 2). The former mRNA was highly expressed in midgut and Malpighian tubules during the last larval instar. Furthermore, constitutive expression of the gene was detected in the prothoracic glands, which are the main tissue producing the insect moulting hormone. There was no significant change in the 1.9 kb mRNA in midgut throughout development, but slightly higher expression in the early stages. Conceptual translation of the cDNA and a database search revealed that the gene includes the SCP2 sequence and a putative peroxisomal targeting signal in the C-terminal region. Also a cysteine residue at the putative active site for the 3-oxoacyl-CoA thiolase is conserved. Southern blotting showed that SCPx is likely to be encoded by a single-copy gene. The mRNA expression pattern and the gene structure suggest that SCPx from S. littoralis (a lepidopteran) is evolutionarily closer to that of mammals than to that of dipterans. PMID:15149283
Translational analysis of mouse and human placental protein and mRNA reveals distinct molecular pathologies in human preeclampsia.

PubMed

Cox, Brian; Sharma, Parveen; Evangelou, Andreas I; Whiteley, Kathie; Ignatchenko, Vladimir; Ignatchenko, Alex; Baczyk, Dora; Czikk, Marie; Kingdom, John; Rossant, Janet; Gramolini, Anthony O; Adamson, S Lee; Kislinger, Thomas

2011-12-01

Preeclampsia (PE) adversely impacts ~5% of pregnancies. Despite extensive research, no consistent biomarkers or cures have emerged, suggesting that different molecular mechanisms may cause clinically similar disease. To address this, we undertook a proteomics study with three main goals: (1) to identify a panel of cell surface markers that distinguish the trophoblast and endothelial cells of the placenta in the mouse; (2) to translate this marker set to human via the Human Protein Atlas database; and (3) to utilize the validated human trophoblast markers to identify subgroups of human preeclampsia. To achieve these goals, plasma membrane proteins at the blood tissue interfaces were extracted from placentas using intravascular silica-bead perfusion, and then identified using shotgun proteomics. We identified 1181 plasma membrane proteins, of which 171 were enriched at the maternal blood-trophoblast interface and 192 at the fetal endothelial interface with a 70% conservation of expression in humans. Three distinct molecular subgroups of human preeclampsia were identified in existing human microarray data by using expression patterns of trophoblast-enriched proteins. Analysis of all misexpressed genes revealed divergent dysfunctions including angiogenesis (subgroup 1), MAPK signaling (subgroup 2), and hormone biosynthesis and metabolism (subgroup 3). Subgroup 2 lacked expected changes in known preeclampsia markers (sFLT1, sENG) and uniquely overexpressed GNA12. In an independent set of 40 banked placental specimens, GNA12 was overexpressed during preeclampsia when co-incident with chronic hypertension. In the current study we used a novel translational analysis to integrate mouse and human trophoblast protein expression with human microarray data. This strategy identified distinct molecular pathologies in human preeclampsia. 
We conclude that clinically similar preeclampsia patients exhibit divergent placental gene expression profiles thus implicating divergent molecular mechanisms in the origins of this disease.
Mouse cysteine-rich secretory protein 4 (CRISP4): a member of the Crisp family exclusively expressed in the epididymis in an androgen-dependent manner.

PubMed

Jalkanen, Jenni; Huhtaniemi, Ilpo; Poutanen, Matti

2005-05-01

The final maturation of spermatozoa produced in the testis takes place during their passage through the epididymis. In this process, the proteins secreted into the epididymal lumen along with changes in the pH and salt composition of the epididymal fluid cause several biochemical changes and remodeling of the sperm plasma membrane. The Crisp family is a group of cysteine-rich secretory proteins that previously consisted of three members, one of which, CRISP1, is an epididymal protein shown to attach to the sperm surface in the epididymal lumen and to inhibit gamete membrane fusion. In the present paper, we introduce a new member of the Crisp protein family, CRISP4. The new gene was discovered through in silico analysis of the epididymal expressed sequence tag library deposited in the UniGene database. The peptide sequence of CRISP4 has a signal sequence suggesting that it is secreted into the epididymal lumen and might thus interact with sperm. Unlike the other members of the family, Crisp4 is located on chromosome 1 in a cluster of genes encoding for cysteine-rich proteins. Crisp4 is expressed in the mouse exclusively in epithelial cells of the epididymis in an androgen-dependent manner, and the expression of the gene starts at puberty along with the onset of sperm maturation. The identified murine CRISP4 peptide has high homology with human CRISP1, and the homology is higher than that between murine and human CRISP1, suggesting that CRISP4 represents the mouse counterpart of human CRISP1 and could have similar effects on sperm membrane as mouse and human CRISP1.
Enzymatic and biochemical properties of a novel human serine dehydratase isoform.

PubMed

Ogawa, Hirofumi; Gomi, Tomoharu; Nishizawa, Mikio; Hayakawa, Yumiko; Endo, Shunro; Hayashi, Kyoko; Ochiai, Hiroshi; Takusagawa, Fusao; Pitot, Henry C; Mori, Hisashi; Sakurai, Hiroaki; Koizumi, Keiichi; Saiki, Ikuo; Oda, Hirofumi; Fujishita, Takashi; Miwa, Toshiro; Maruyama, Muneharu; Kobayashi, Masashi

2006-05-01

A cDNA clone similar to human serine dehydratase (SDH) is deposited in the GenBank/EMBL databases, but its structural and functional bases remain unknown. Despite the occurrence of mRNA, the expected protein level was found to be low in cultured cells. To learn about the physicochemical properties of the protein, we expressed the cDNA in Escherichia coli, and compared the expressed protein with that of a hepatic SDH. The purified protein showed l-serine and l-threonine dehydratase activity, demonstrating that it is an isoform of SDH. However, the Km and Vmax constants of the two enzymes differed by about two orders of magnitude. Removal of Pro128 from the hepatic SDH consisting of 328 residues, which is missing in the corresponding position of the isoform consisting of 329 residues, significantly changed the Michaelis constants and Kd value for pyridoxal 5'-phosphate, whereas addition of a proline residue to the isoform was without effect. These findings suggest a difference in the structures of the active sites of the two enzymes. Another striking feature was that the expression level of the isoform in E. coli was 7-fold lower than that of the hepatic SDH. Substitution of Val for Leu287 in the isoform dramatically increased the protein level. The high yield of the mutated isoform was also confirmed by the in vitro transcription and translation experiment. The poor expression of the isoform could be explained by the more stable secondary structure of its mRNA compared with that of the hepatic SDH mRNA. The present findings may provide a clue as to why the protein level in cultured cells is low.
EPConDB: a web resource for gene expression related to pancreatic development, beta-cell function and diabetes.

PubMed

Mazzarelli, Joan M; Brestelli, John; Gorski, Regina K; Liu, Junmin; Manduchi, Elisabetta; Pinney, Deborah F; Schug, Jonathan; White, Peter; Kaestner, Klaus H; Stoeckert, Christian J

2007-01-01

EPConDB (http://www.cbil.upenn.edu/EPConDB) is a public web site that supports research in diabetes, pancreatic development and beta-cell function by providing information about genes expressed in cells of the pancreas. EPConDB displays expression profiles for individual genes and information about transcripts, promoter elements and transcription factor binding sites. Gene expression results are obtained from studies examining tissue expression, pancreatic development and growth, differentiation of insulin-producing cells, islet or beta-cell injury, and genetic models of impaired beta-cell function. The expression datasets are derived using different microarray platforms, including the BCBC PancChips and Affymetrix gene expression arrays. Other datasets include semi-quantitative RT-PCR and MPSS expression studies. For selected microarray studies, lists of differentially expressed genes, derived from PaGE analysis, are displayed on the site. EPConDB provides database queries and tools to examine the relationship between a gene, its transcriptional regulation, protein function and expression in pancreatic tissues.
Target gene screening and evaluation of prognostic values in non-small cell lung cancers by bioinformatics analysis.

PubMed

Piao, Junjie; Sun, Jie; Yang, Yang; Jin, Tiefeng; Chen, Liyan; Lin, Zhenhua

2018-03-20

Non-small cell lung cancer (NSCLC) is the leading cause of cancer-related deaths worldwide. This study aims to explore the molecular mechanisms of NSCLC. A microarray dataset was obtained from the Gene Expression Omnibus (GEO) database and analyzed using GEO2R. Functional and pathway enrichment analyses were performed based on the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Then, STRING, Cytoscape and MCODE were applied to construct the protein-protein interaction (PPI) network and screen hub genes. Next, overall survival (OS) analysis of hub genes was performed using the Kaplan-Meier plotter online tool. Moreover, miRecords was applied to predict the targets of the differentially expressed microRNAs (DEMs). A total of 228 DEGs were identified, and they were mainly enriched in the terms of cell adhesion molecules, leukocyte transendothelial migration and ECM-receptor interaction. A PPI network was constructed, and 16 hub genes were identified, including TEK, ANGPT1, MMP9, VWF, CDH5, EDN1, ESAM, CCNE1, CDC45, PRC1, CCNB2, AURKA, MELK, CDC20, TOP2A and PTTG1. Among these genes, expression of 14 hub genes was associated with the prognosis of NSCLC patients. Additionally, a total of 11 DEMs were identified. Our results provide some potential underlying biomarkers for NSCLC. Further studies are required to elucidate the pathogenesis of NSCLC.
Proteomic profiling of human pleural effusion using two-dimensional nano liquid chromatography tandem mass spectrometry.

PubMed

Tyan, Yu-Chang; Wu, Hsin-Yi; Lai, Wu-Wei; Su, Wu-Chou; Liao, Pao-Chi

2005-01-01

Pleural effusion, an accumulation of pleural fluid, contains proteins originating from plasma filtrate and, especially when tissues are damaged, from the parenchymal interstitial spaces of the lungs and/or other organs. This study details protein profiles in human pleural effusion from 43 lung adenocarcinoma patients, obtained with a two-dimensional nano-high performance liquid chromatography electrospray ionization tandem mass spectrometry (2D nano-HPLC-ESI-MS/MS) system. The experiments identified 1415 unique proteins from human pleural effusion. Among the 124 proteins identified with higher confidence levels, some have not been reported in plasma and may represent proteins specifically present in pleural effusion. These proteins are valuable for building proteomics databases, for large-scale identification of differentially expressed proteins, and for screening biomarkers for further study in human lung adenocarcinoma. The significance of proteomic analysis of human pleural fluid for the search for new lung cancer marker proteins, and for their simultaneous display and analysis in patients suffering from lung disorders, has been examined.
MIPS: a database for genomes and protein sequences

PubMed Central

Mewes, H. W.; Frishman, D.; Güldener, U.; Mannhaupt, G.; Mayer, K.; Mokrejs, M.; Morgenstern, B.; Münsterkötter, M.; Rudd, S.; Weil, B.

2002-01-01

The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) continues to provide genome-related information in a systematic way. MIPS supports both national and European sequencing and functional analysis projects, develops and maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences, and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the databases for the comprehensive set of genomes (PEDANT genomes), the database of annotated human EST clusters (HIB), the database of complete cDNAs from the DHGP (German Human Genome Project), as well as the project specific databases for the GABI (Genome Analysis in Plants) and HNB (Helmholtz–Netzwerk Bioinformatik) networks. The Arabidopsis thaliana database (MATDB), the database of mitochondrial proteins (MITOP) and our contribution to the PIR International Protein Sequence Database have been described elsewhere [Schoof et al. (2002) Nucleic Acids Res., 30, 91–93; Scharfe et al. (2000) Nucleic Acids Res., 28, 155–158; Barker et al. (2001) Nucleic Acids Res., 29, 29–32]. All databases described, the protein analysis tools provided and the detailed descriptions of our projects can be accessed through the MIPS World Wide Web server (http://mips.gsf.de). PMID:11752246
MIPS: a database for genomes and protein sequences.

PubMed

Mewes, H W; Frishman, D; Güldener, U; Mannhaupt, G; Mayer, K; Mokrejs, M; Morgenstern, B; Münsterkötter, M; Rudd, S; Weil, B

2002-01-01

The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) continues to provide genome-related information in a systematic way. MIPS supports both national and European sequencing and functional analysis projects, develops and maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences, and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the databases for the comprehensive set of genomes (PEDANT genomes), the database of annotated human EST clusters (HIB), the database of complete cDNAs from the DHGP (German Human Genome Project), as well as the project specific databases for the GABI (Genome Analysis in Plants) and HNB (Helmholtz-Netzwerk Bioinformatik) networks. The Arabidopsis thaliana database (MATDB), the database of mitochondrial proteins (MITOP) and our contribution to the PIR International Protein Sequence Database have been described elsewhere [Schoof et al. (2002) Nucleic Acids Res., 30, 91-93; Scharfe et al. (2000) Nucleic Acids Res., 28, 155-158; Barker et al. (2001) Nucleic Acids Res., 29, 29-32]. All databases described, the protein analysis tools provided and the detailed descriptions of our projects can be accessed through the MIPS World Wide Web server (http://mips.gsf.de).
Alternative Splicing Studies of the Reactive Oxygen Species Gene Network in Populus Reveal Two Isoforms of High-Isoelectric-Point Superoxide Dismutase

PubMed Central

Srivastava, Vaibhav; Srivastava, Manoj Kumar; Chibani, Kamel; Nilsson, Robert; Rouhier, Nicolas; Melzer, Michael; Wingsle, Gunnar

2009-01-01

Recent evidence has shown that alternative splicing (AS) is widely involved in the regulation of gene expression, substantially extending the diversity of numerous proteins. In this study, a subset of expressed sequence tags representing members of the reactive oxygen species gene network was selected from the PopulusDB database to investigate AS mechanisms in Populus. Examples of all known types of AS were detected, but intron retention was the most common. Interestingly, the closest Arabidopsis (Arabidopsis thaliana) homologs of half of the AS genes identified in Populus are not reportedly alternatively spliced. Two genes encoding the protein of most interest in our study (high-isoelectric-point superoxide dismutase [hipI-SOD]) have been found in black cottonwood (Populus trichocarpa), designated PthipI-SODC1 and PthipI-SODC2. Analysis of the expressed sequence tag libraries has indicated the presence of two transcripts of PthipI-SODC1 (hipI-SODC1b and hipI-SODC1s). Alignment of these sequences with the PthipI-SODC1 gene showed that hipI-SODC1b was 69 bp longer than hipI-SODC1s due to an AS event involving the use of an alternative donor splice site in the sixth intron. Transcript analysis showed that the splice variant hipI-SODC1b was differentially expressed, being clearly expressed in cambial and xylem, but not phloem, regions. In addition, immunolocalization and mass spectrometric data confirmed the presence of hipI-SOD proteins in vascular tissue. The functionalities of the spliced gene products were assessed by expressing recombinant hipI-SOD proteins and in vitro SOD activity assays. PMID:19176719

Alternative splicing studies of the reactive oxygen species gene network in Populus reveal two isoforms of high-isoelectric-point superoxide dismutase.

PubMed

Srivastava, Vaibhav; Srivastava, Manoj Kumar; Chibani, Kamel; Nilsson, Robert; Rouhier, Nicolas; Melzer, Michael; Wingsle, Gunnar

2009-04-01

Recent evidence has shown that alternative splicing (AS) is widely involved in the regulation of gene expression, substantially extending the diversity of numerous proteins. In this study, a subset of expressed sequence tags representing members of the reactive oxygen species gene network was selected from the PopulusDB database to investigate AS mechanisms in Populus. Examples of all known types of AS were detected, but intron retention was the most common. Interestingly, the closest Arabidopsis (Arabidopsis thaliana) homologs of half of the AS genes identified in Populus are not reportedly alternatively spliced. Two genes encoding the protein of most interest in our study (high-isoelectric-point superoxide dismutase [hipI-SOD]) have been found in black cottonwood (Populus trichocarpa), designated PthipI-SODC1 and PthipI-SODC2. Analysis of the expressed sequence tag libraries has indicated the presence of two transcripts of PthipI-SODC1 (hipI-SODC1b and hipI-SODC1s). Alignment of these sequences with the PthipI-SODC1 gene showed that hipI-SODC1b was 69 bp longer than hipI-SODC1s due to an AS event involving the use of an alternative donor splice site in the sixth intron. Transcript analysis showed that the splice variant hipI-SODC1b was differentially expressed, being clearly expressed in cambial and xylem, but not phloem, regions. In addition, immunolocalization and mass spectrometric data confirmed the presence of hipI-SOD proteins in vascular tissue. The functionalities of the spliced gene products were assessed by expressing recombinant hipI-SOD proteins and in vitro SOD activity assays.
Identification of Zebrafish Fxyd11a Protein that is Highly Expressed in Ion-Transporting Epithelium of the Gill and Skin and its Possible Role in Ion Homeostasis

PubMed Central

Saito, Kaori; Nakamura, Nobuhiro; Ito, Yusuke; Hoshijima, Kazuyuki; Esaki, Masahiro; Zhao, Boqiang; Hirose, Shigehisa

2010-01-01

FXYD proteins, small single-transmembrane proteins, have been proposed to be auxiliary regulatory subunits of Na+–K+-ATPase and have recently been implicated in ion osmoregulation of teleost fish. In freshwater (FW) fish, numerous ions are actively taken up through mitochondrion-rich cells (MRCs) of the gill and skin epithelia, using the Na+ electrochemical gradient generated by Na+–K+-ATPase. In the present study, to understand the molecular mechanism for the regulation of Na+–K+-ATPase in MRCs of FW fish, we sought to identify FXYD proteins expressed in MRCs of zebrafish. Reverse-transcriptase PCR studies of adult zebrafish tissues revealed that, out of eight fxyd genes found in the zebrafish database, only zebrafish fxyd11 (zfxyd11) mRNA exhibited gill-specific expression. Double immunofluorescence staining showed that zFxyd11 is abundantly expressed in MRCs rich in Na+–K+-ATPase (NaK-MRCs) but not in those rich in vacuolar-type H+-transporting ATPase. An in situ proximity ligation assay demonstrated its close association with Na+–K+-ATPase in NaK-MRCs. The zfxyd11 mRNA expression was detectable at 1 day postfertilization, and its expression levels in the whole larvae and adult gills were regulated in response to changes in environmental ionic concentrations. Furthermore, knockdown of zFxyd11 resulted in a significant increase in the number of Na+–K+-ATPase–positive cells in the larval skin. These results suggest that zFxyd11 may regulate the transport ability of NaK-MRCs by modulating Na+–K+-ATPase activity, and may be involved in the regulation of body fluid and electrolyte homeostasis. PMID:21423371
Evaluation of correlation between CT image features and ERCC1 protein expression in assessing lung cancer prognosis

NASA Astrophysics Data System (ADS)

Tan, Maxine; Emaminejad, Nastaran; Qian, Wei; Sun, Shenshen; Kang, Yan; Guan, Yubao; Lure, Fleming; Zheng, Bin

2014-03-01

Stage I non-small-cell lung cancers (NSCLC) usually have a favorable prognosis. However, a high percentage of NSCLC patients have cancer relapse after surgery. Accurately predicting cancer prognosis is important to optimally treat and manage the patients to minimize the risk of cancer relapse. Studies have shown that the excision repair cross-complementing 1 (ERCC1) gene was a potentially useful genetic biomarker to predict prognosis of NSCLC patients. Meanwhile, studies also found that chronic obstructive pulmonary disease (COPD) was highly associated with lung cancer prognosis. In this study, we investigated and evaluated the correlations between COPD image features and ERCC1 gene expression. A database involving 106 NSCLC patients was used. Each patient had a thoracic CT examination and ERCC1 genetic test. We applied a computer-aided detection scheme to segment and quantify COPD image features. A logistic regression method and a multilayer perceptron network were applied to analyze the correlation between the computed COPD image features and ERCC1 protein expression. A multilayer perceptron network (MPN) was also developed to test performance of using COPD-related image features to predict ERCC1 protein expression. A nine-feature-based logistic regression analysis showed that the average COPD feature values in the low and high ERCC1 protein expression groups are significantly different (p < 0.01). Using a five-fold cross-validation method, the MPN yielded an area under the ROC curve (AUC) of 0.669±0.053 in classifying between the low and high ERCC1 expression cases. The study indicates that CT phenotype features are associated with the genetic tests, which may provide supplementary information to help improve accuracy in assessing prognosis of NSCLC patients.
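The evaluation reported above (an AUC summarized as mean ± spread over five cross-validation folds) can be illustrated with a minimal sketch. It is not the authors' code: the classifier is omitted, the function names are hypothetical, and the contiguous fold split and the use of the sample standard deviation are assumptions, since the abstract does not say how folds were drawn or what the ± value denotes.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def summarize_auc(per_fold_auc):
    """Return (mean, sample standard deviation) of the per-fold AUC values."""
    return mean(per_fold_auc), stdev(per_fold_auc)
```

In use, each fold would be held out in turn, a model trained on the remaining cases, `roc_auc` computed on the held-out scores, and the five values passed to `summarize_auc`.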
Exploring the cross talk between ER stress and inflammation in age-related macular degeneration.

PubMed

Kheitan, Samira; Minuchehr, Zarrin; Soheili, Zahra-Soheila

2017-01-01

Increasing evidence demonstrates that inflammation and endoplasmic reticulum (ER) stress are implicated in the development and progression of age-related macular degeneration (AMD), a multifactorial neurodegenerative disease. However, the cross talk between these cellular mechanisms has not been clearly and fully understood. The present study investigates a possible intersection between ER stress and inflammation in AMD. In this study, we recruited two collections of involved protein markers to retrieve their interaction information from IMEx-curated databases, which are the most well-known protein-protein interaction collections, allowing us to design an intersection network for AMD that is unprecedented. In order to find expression activated subnetworks, we utilized AMD expression profiles in our network. In addition, we studied topological characteristics of the most expressed active subnetworks to identify the hubs. With regard to topological quantifications and expressional activity, we reported a list of the most pivotal hubs which are potentially applicable as probable therapeutic targets. Furthermore, we introduced MAPK signaling pathway as a significantly involved pathway in the association between ER stress and inflammation, leading to promising new directions in discovering AMD formation mechanisms and possible treatments.
Exploring the cross talk between ER stress and inflammation in age-related macular degeneration

PubMed Central

Kheitan, Samira; Soheili, Zahra-Soheila

2017-01-01

Increasing evidence demonstrates that inflammation and endoplasmic reticulum (ER) stress are implicated in the development and progression of age-related macular degeneration (AMD), a multifactorial neurodegenerative disease. However, the cross talk between these cellular mechanisms has not been clearly and fully understood. The present study investigates a possible intersection between ER stress and inflammation in AMD. In this study, we recruited two collections of involved protein markers to retrieve their interaction information from IMEx-curated databases, which are the most well-known protein-protein interaction collections, allowing us to design an intersection network for AMD that is unprecedented. In order to find expression activated subnetworks, we utilized AMD expression profiles in our network. In addition, we studied topological characteristics of the most expressed active subnetworks to identify the hubs. With regard to topological quantifications and expressional activity, we reported a list of the most pivotal hubs which are potentially applicable as probable therapeutic targets. Furthermore, we introduced MAPK signaling pathway as a significantly involved pathway in the association between ER stress and inflammation, leading to promising new directions in discovering AMD formation mechanisms and possible treatments. PMID:28742151
Database resources of the National Center for Biotechnology Information

PubMed Central

Wheeler, David L.; Barrett, Tanya; Benson, Dennis A.; Bryant, Stephen H.; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M.; DiCuccio, Michael; Edgar, Ron; Federhen, Scott; Feolo, Michael; Geer, Lewis Y.; Helmberg, Wolfgang; Kapustin, Yuri; Khovayko, Oleg; Landsman, David; Lipman, David J.; Madden, Thomas L.; Maglott, Donna R.; Miller, Vadim; Ostell, James; Pruitt, Kim D.; Schuler, Gregory D.; Shumway, Martin; Sequeira, Edwin; Sherry, Steven T.; Sirotkin, Karl; Souvorov, Alexandre; Starchenko, Grigory; Tatusov, Roman L.; Tatusova, Tatiana A.; Wagner, Lukas; Yaschenko, Eugene

2008-01-01

In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data available through NCBI's web site. NCBI resources include Entrez, the Entrez Programming Utilities, My NCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link, Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genome, Genome Project and related tools, the Trace, Assembly, and Short Read Archives, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups, Influenza Viral Resources, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Entrez Probe, GENSAT, Database of Genotype and Phenotype, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool and the PubChem suite of small molecule databases. Augmenting the web applications are custom implementations of the BLAST program optimized to search specialized data sets. These resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. PMID:18045790
Altered Protein Expression of Streptococcus oralis Cultured at Low pH Revealed by Two-Dimensional Gel Electrophoresis

PubMed Central

Wilkins, Joanna C.; Homer, Karen A.; Beighton, David

2001-01-01

Streptococcus oralis is the predominant aciduric nonmutans streptococcus isolated from the human dentition, but the role of this organism in the initiation and progression of dental caries has yet to be established. To identify proteins that are differentially expressed by S. oralis growing under conditions of low pH, soluble cellular proteins extracted from bacteria grown in batch culture at pH 5.2 or 7.0 were analyzed by two-dimensional (2-D) gel electrophoresis. Thirty-nine proteins had altered expression at low pH; these were excised, digested with trypsin using an in-gel protocol, and further analyzed by peptide mass fingerprinting using matrix-assisted laser desorption ionization mass spectrometry. The resulting fingerprints were compared with the genomic database for Streptococcus pneumoniae, an organism that is phylogenetically closely related to S. oralis, and putative functions for the majority of these proteins were determined on the basis of functional homology. Twenty-eight proteins were up-regulated following growth at pH 5.2; these included enzymes of the glycolytic pathway (glyceraldehyde-3-phosphate dehydrogenase and lactate dehydrogenase), the polypeptide chains comprising ATP synthase, and proteins that are considered to play a role in the general stress response of bacteria, including the 60-kDa chaperone, Hsp33, and superoxide dismutase, and three distinct ABC transporters. These data identify, for the first time, gene products that may be important in the survival and proliferation of nonmutans aciduric S. oralis under conditions of low pH that are likely to be encountered by this organism in vivo. PMID:11472910
DAPK1 as an independent prognostic marker in liver cancer.

PubMed

Li, Ling; Guo, Libin; Wang, Qingshui; Liu, Xiaolong; Zeng, Yongyi; Wen, Qing; Zhang, Shudong; Kwok, Hang Fai; Lin, Yao; Liu, Jingfeng

2017-01-01

The death-associated protein kinase 1 (DAPK1) can act as an oncogene or a tumor suppressor gene depending on the cellular context as well as external stimuli. Our study aims to investigate the prognostic significance of DAPK1 in liver cancer at both the mRNA and protein levels. The mRNA expression of DAPK1 was extracted from the Gene Expression Omnibus database in three independent liver cancer datasets while protein expression of DAPK1 was detected by immunohistochemistry in our Chinese liver cancer patient cohort. The associations between DAPK1 expression and clinical characteristics were tested. DAPK1 mRNA expression was down-regulated in liver cancer. Low levels of DAPK1 mRNA were associated with shorter survival in a liver cancer patient cohort (n = 115; p = 0.041), while negative staining of DAPK1 protein was significantly correlated with shorter time to progression (p = 0.002) and overall survival (p = 0.02). DAPK1 was an independent prognostic marker for both time to progression and overall survival by multivariate analysis. Liver cancer with the β-catenin mutation has a lower DAPK1 expression, suggesting that DAPK1 may be regulated under the β-catenin pathway. In addition, we also identified genes that are co-regulated with DAPK1. DAPK1 expression was positively correlated with IRF2, IL7R, PCOLCE and ZBTB16, and negatively correlated with SLC16A3 in both liver cancer datasets. Among these genes, PCOLCE and ZBTB16 were significantly down-regulated, while SLC16A3 was significantly upregulated in liver cancer. By using connectivity mapping of these co-regulated genes, we have identified amcinonide and sulpiride as small molecules that could potentially reverse DAPK1/PCOLCE/ZBTB16/SLC16A3 expression. Our study demonstrated for the first time that both DAPK1 mRNA and protein expression levels are important prognostic markers in liver cancer, and have identified genes that may contribute to DAPK1-mediated liver carcinogenesis.
Integration of gel-based and gel-free proteomic data for functional analysis of proteins through Soybean Proteome Database.

PubMed

Komatsu, Setsuko; Wang, Xin; Yin, Xiaojian; Nanjo, Yohei; Ohyanagi, Hajime; Sakata, Katsumi

2017-06-23

The Soybean Proteome Database (SPD) stores data on soybean proteins obtained with gel-based and gel-free proteomic techniques. The database was constructed to provide information on proteins for functional analyses. The majority of the data is focused on soybean (Glycine max 'Enrei'). The growth and yield of soybean are strongly affected by environmental stresses such as flooding. The database was originally constructed using data on soybean proteins separated by two-dimensional polyacrylamide gel electrophoresis, which is a gel-based proteomic technique. Since 2015, the database has been expanded to incorporate data obtained by label-free mass spectrometry-based quantitative proteomics, which is a gel-free proteomic technique. Here, the portions of the database consisting of gel-free proteomic data are described. The gel-free proteomic database contains 39,212 proteins identified in 63 sample sets, such as temporal and organ-specific samples of soybean plants grown under flooding stress or non-stressed conditions. In addition, data on organellar proteins identified in mitochondria, nuclei, and endoplasmic reticulum are stored. Furthermore, the database integrates multiple omics data such as genomics, transcriptomics, metabolomics, and proteomics. The SPD database is accessible at http://proteome.dc.affrc.go.jp/Soybean/. The Soybean Proteome Database stores data obtained from both gel-based and gel-free proteomic techniques. The gel-free proteomic database comprises 39,212 proteins identified in 63 sample sets, such as different organs of soybean plants grown under flooding stress or non-stressed conditions in a time-dependent manner. In addition, organellar proteins identified in mitochondria, nuclei, and endoplasmic reticulum are stored in the gel-free proteomics database. A total of 44,704 proteins, including 5490 proteins identified using a gel-based proteomic technique, are stored in the SPD. 
It accounts for approximately 80% of all predicted proteins from genome sequences, though there are overlapping proteins. Based on the demonstrated application of data stored in the database for functional analyses, it is suggested that these data will be useful for analyses of biological mechanisms in soybean. Furthermore, coupled with recent advances in information and communication technology, the usefulness of this database would increase in the analyses of biological mechanisms. Copyright © 2017 Elsevier B.V. All rights reserved.
PlantTribes: a gene and gene family resource for comparative genomics in plants

PubMed Central

Wall, P. Kerr; Leebens-Mack, Jim; Müller, Kai F.; Field, Dawn; Altman, Naomi S.; dePamphilis, Claude W.

2008-01-01

The PlantTribes database (http://fgp.huck.psu.edu/tribe.html) is a plant gene family database based on the inferred proteomes of five sequenced plant species: Arabidopsis thaliana, Carica papaya, Medicago truncatula, Oryza sativa and Populus trichocarpa. We used the graph-based clustering algorithm MCL [Van Dongen (Technical Report INS-R0010 2000) and Enright et al. (Nucleic Acids Res. 2002; 30: 1575–1584)] to classify all of these species’ protein-coding genes into putative gene families, called tribes, using three clustering stringencies (low, medium and high). For all tribes, we have generated protein and DNA alignments and maximum-likelihood phylogenetic trees. A parallel database of microarray experimental results is linked to the genes, which lets researchers identify groups of related genes and their expression patterns. Unified nomenclatures were developed, and tribes can be related to traditional gene families and conserved domain identifiers. SuperTribes, constructed through a second iteration of MCL clustering, connect distant, but potentially related gene clusters. The global classification of nearly 200 000 plant proteins was used as a scaffold for sorting ∼4 million additional cDNA sequences from over 200 plant species. All data and analyses are accessible through a flexible interface allowing users to explore the classification, to place query sequences within the classification, and to download results for further study. PMID:18073194
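The graph-based clustering step named above (MCL) alternates two operations on a column-stochastic similarity matrix: expansion (squaring the matrix) and inflation (raising entries to a power and renormalizing columns). The sketch below is a minimal pure-Python illustration under stated assumptions, not the PlantTribes pipeline: the default inflation value, tolerance, pruning threshold and dense-matrix representation are illustrative choices, and the three clustering stringencies mentioned in the abstract presumably correspond to different inflation settings, which the abstract does not specify.

```python
def normalize_columns(m):
    """Scale every column of a square matrix so that it sums to 1."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def expand(m):
    """Expansion step: multiply the matrix by itself."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, power):
    """Inflation step: element-wise power followed by column normalization."""
    return normalize_columns([[x ** power for x in row] for row in m])


def mcl(adjacency, inflation=2.0, max_iter=50, tol=1e-9):
    """Cluster the nodes of a weighted, undirected graph with basic MCL."""
    n = len(adjacency)
    # Add self-loops, a common MCL preprocessing step, then normalize.
    m = normalize_columns(
        [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)])
    for _ in range(max_iter):
        nxt = inflate(expand(m), inflation)
        delta = max(abs(nxt[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:
            break
    # Each non-empty row of the converged matrix lists the members of a cluster.
    clusters = {frozenset(j for j, x in enumerate(row) if x > 1e-6)
                for row in m}
    return sorted(sorted(c) for c in clusters if c)
```

For protein families, the adjacency weights would typically come from pairwise sequence-similarity scores; higher inflation values tend to yield smaller, tighter clusters.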
Ligand- and structure-based in silico studies to identify kinesin spindle protein (KSP) inhibitors as potential anticancer agents.

PubMed

Balakumar, Chandrasekaran; Ramesh, Muthusamy; Tham, Chuin Lean; Khathi, Samukelisiwe Pretty; Kozielski, Frank; Srinivasulu, Cherukupalli; Hampannavar, Girish A; Sayyad, Nisar; Soliman, Mahmoud E; Karpoormath, Rajshekhar

2017-11-29

Kinesin spindle protein (KSP) belongs to the kinesin superfamily of microtubule-based motor proteins. KSP is responsible for the establishment of the bipolar mitotic spindle which mediates cell division. Inhibition of KSP expedites the blockade of the normal cell cycle during mitosis through the generation of monoastral MT arrays that finally cause apoptotic cell death. As KSP is highly expressed in proliferating/cancer cells, it has gained considerable attention as a potential drug target for cancer chemotherapy. Therefore, this study aimed to design novel KSP inhibitors by employing computational techniques/tools such as pharmacophore modelling, virtual database screening, molecular docking and molecular dynamics. Initially, the pharmacophore models were generated from the data-set of highly potent KSP inhibitors and the pharmacophore models were validated against in-house test set ligands. The validated pharmacophore model was then taken for database screening (Maybridge and ChemBridge) to yield hits, which were further filtered for their drug-likeness. The potential hits retrieved from virtual database screening were docked using CDOCKER to identify the ligand binding landscape. The top-ranked hits obtained from molecular docking were progressed to molecular dynamics (AMBER) simulations to deduce the ligand binding affinity. This study identified MB-41570 and CB-10358 as potential hits and evaluated these experimentally using in vitro KSP ATPase inhibition assays.
Navigating through the Jungle of Allergens: Features and Applications of Allergen Databases.

PubMed

Radauer, Christian

2017-01-01

The increasing number of available data on allergenic proteins demanded the establishment of structured, freely accessible allergen databases. In this review article, features and applications of 6 of the most widely used allergen databases are discussed. The WHO/IUIS Allergen Nomenclature Database is the official resource of allergen designations. Allergome is the most comprehensive collection of data on allergens and allergen sources. AllergenOnline is aimed at providing a peer-reviewed database of allergen sequences for prediction of allergenicity of proteins, such as those planned to be inserted into genetically modified crops. The Structural Database of Allergenic Proteins (SDAP) provides a database of allergen sequences, structures, and epitopes linked to bioinformatics tools for sequence analysis and comparison. The Immune Epitope Database (IEDB) is the largest repository of T-cell, B-cell, and major histocompatibility complex protein epitopes including epitopes of allergens. AllFam classifies allergens into families of evolutionarily related proteins using definitions from the Pfam protein family database. These databases contain mostly overlapping data, but also show differences in terms of their targeted users, the criteria for including allergens, data shown for each allergen, and the availability of bioinformatics tools. © 2017 S. Karger AG, Basel.
Suppression of Aurora-A-FLJ10540 signaling axis prohibits the malignant state of head and neck cancer.

PubMed

Chen, Chang-Han; Chang, Alice Y W; Li, Shau-Hsuan; Tsai, Hsin-Ting; Shiu, Li-Yen; Su, Li-Jen; Wang, Wen-Lung; Chiu, Tai-Jen; Luo, Sheng-Dean; Huang, Tai-Lin; Chien, Chih-Yen

2015-04-12

Head and neck cancer (HNC) is a highly invasive cancer. Aurora-A has been reported for a number of malignancies. However, the identity of downstream effectors responsible for its aggressive phenotype in HNC remains underinvestigated. The mRNA and protein expression levels of Aurora-A and FLJ10540 were assessed in HNC specimens and cell lines using RT-qPCR, western blot, Oncomine, and microarray database analysis. The downstream molecular mechanisms of Aurora-A were confirmed by RT-qPCR, western blot, luciferase reporter, confocal microscopy analyses, immunoprecipitation, colony formation, cell viability, and xenograft model. Cellular functions in response to Aurora-A-modulated downstream targets such as FLJ10540 and MMPs were examined in vitro and in vivo, including cell growth, motility and chemosensitivity. Aurora-A/FLJ10540/MMPs expression was determined in cancer and adjacent normal tissues from HNC patients by immunohistochemistry. In the current study, Aurora-A exhibited gene expression profiles similar to those of FLJ10540 in publicly accessible microarray and Oncomine database analyses, raising the possibility that these molecules might coordinately participate in cancer progression and metastasis of HNC. The connection between these two molecules was also examined in cell lines and tissues of HNC. Aurora-A overexpression could not only bind to the promoter of FLJ10540 to induce FLJ10540 expression, but also increase both mRNA and protein levels of MMP-7 and MMP-10 in HNC cells. Conversely, depletion of Aurora-A expression by using siRNA or the Aurora-A kinase inhibitor MLN8237 suppressed FLJ10540, MMP-7 and MMP-10 mRNA and protein expression in vitro and in vivo. In addition, the FLJ10540-PI3K complex was destroyed by inhibition of Aurora-A kinase activity. Forced overexpression of FLJ10540 in Aurora-A-depleted or in MLN8237-treated HNC cells attenuated the effect on cytotoxicity to cisplatin. Elevated Aurora-A expression in HNC cells led to the characteristics of more aggressive malignancy, including enhanced chemoresistance and increased abilities of proliferation, migration and invasion, which required FLJ10540/MMP-7 or FLJ10540/MMP-10 expression. Finally, immunohistochemical analysis of human HNC specimens showed a significant positive correlation among Aurora-A, FLJ10540, MMP-7 and MMP-10 expression. Together, our findings define a novel mechanism by which Aurora-A promotes cell malignancy, with potential implications for understanding the clinical action of Aurora-A.
The BRENDA enzyme information system-From a database to an expert system.

PubMed

Schomburg, I; Jeske, L; Ulbrich, M; Placzek, S; Chang, A; Schomburg, D

2017-11-10

Enzymes, representing the largest and by far most complex group of proteins, play an essential role in all processes of life, including metabolism, gene expression, cell division, the immune system, and others. Their function, also connected to most diseases or stress control, makes them interesting targets for research and applications in biotechnology, medical treatments, or diagnosis. Their functional parameters and other properties are collected, integrated, and made available to the scientific community in the BRaunschweig ENzyme DAtabase (BRENDA). In the last 30 years BRENDA has developed into one of the most highly used biological databases worldwide. The data contents, the process of data acquisition, data integration and control, the ways to access the data, and visualizations provided by the website are described and discussed. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
The characterisation of novel secreted Ly-6 proteins from rat urine by the combined use of two-dimensional gel electrophoresis, microbore high performance liquid chromatography and expressed sequence tag data.

PubMed

Southan, Christopher; Cutler, Paul; Birrell, Helen; Connell, John; Fantom, Kenneth G M; Sims, Matthew; Shaikh, Narjis; Schneider, Klaus

2002-02-01

A proteomic study of rat urine was undertaken using two-dimensional gel electrophoresis, microbore high performance liquid chromatography, mass spectrometry and N-terminal sequencing. Five known urinary proteins were identified but two novel peptide fragments matched a large number of rat expressed sequence tags (ESTs) from a liver library. By combining protein chemical and nucleotide data, two 101-residue open reading frames with 90% amino acid identity were determined, rat urinary protein 1 (RUP-1) and RUP-2. The data established signal peptide removal and provided evidence for N-glycosylation. A third related sequence, rat spleen protein (RSP-1) was confirmed from EST searches. These three proteins have been submitted to SWISS-PROT as P81827, P81828 and Q9QXN2, respectively. A fourth novel homologue was found in porcine and bovine ESTs from embryo libraries. Alignment with known homologues showed conserved cysteine positions characteristic of a secreted subfamily of Ly-6 proteins. In two cases, antineoplastic urinary protein and caltrin, these homologues have unverified functional annotations. The RUP sequences showed high scoring matches to three unrelated rat mRNAs subsequently established to be chimeric. Two of these share extended sectional identity to RUP-1 but the third may represent another novel Ly-6 homologue. These chimeras have caused serious annotation errors in secondary databases.
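The "90% amino acid identity" quoted for the two 101-residue open reading frames is a pairwise measure that can be computed from an alignment. The sketch below shows one common convention (identical columns divided by compared columns, skipping columns that are gaps in both sequences). The abstract does not state which convention the authors used, so the function name and the gap handling are assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of compared alignment columns holding identical residues.

    Both inputs must already be aligned (equal length). Columns that are
    gaps in both sequences are skipped; a gap against a residue counts as
    a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared
```

Under this convention, two ungapped 101-residue sequences differing at 10 positions share 91/101 identical positions, or about 90.1% identity.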
The Protein Identifier Cross-Referencing (PICR) service: reconciling protein identifiers across multiple source databases.

PubMed

Côté, Richard G; Jones, Philip; Martens, Lennart; Kerrien, Samuel; Reisinger, Florian; Lin, Quan; Leinonen, Rasko; Apweiler, Rolf; Hermjakob, Henning

2007-10-18

Each major protein database uses its own conventions when assigning protein identifiers. Resolving the various, potentially unstable, identifiers that refer to identical proteins is a major challenge. This is a common problem when attempting to unify datasets that have been annotated with proteins from multiple data sources or querying data providers with one flavour of protein identifiers when the source database uses another. Partial solutions for protein identifier mapping exist but they are limited to specific species or techniques and to a very small number of databases. As a result, we have not found a solution that is generic enough and broad enough in mapping scope to suit our needs. We have created the Protein Identifier Cross-Reference (PICR) service, a web application that provides interactive and programmatic (SOAP and REST) access to a mapping algorithm that uses the UniProt Archive (UniParc) as a data warehouse to offer protein cross-references based on 100% sequence identity to proteins from over 70 distinct source databases loaded into UniParc. Mappings can be limited by source database, taxonomic ID and activity status in the source database. Users can copy/paste or upload files containing protein identifiers or sequences in FASTA format to obtain mappings using the interactive interface. Search results can be viewed in simple or detailed HTML tables or downloaded as comma-separated values (CSV) or Microsoft Excel (XLS) files suitable for use in a local database or a spreadsheet. Alternatively, a SOAP interface is available to integrate PICR functionality in other applications, as is a lightweight REST interface. We offer a publicly available service that can interactively map protein identifiers and protein sequences to the majority of commonly used protein databases. Programmatic access is available through a standards-compliant SOAP interface or a lightweight REST interface. 
The PICR interface, documentation and code examples are available at http://www.ebi.ac.uk/Tools/picr.
The Protein Identifier Cross-Referencing (PICR) service: reconciling protein identifiers across multiple source databases

PubMed Central

Côté, Richard G; Jones, Philip; Martens, Lennart; Kerrien, Samuel; Reisinger, Florian; Lin, Quan; Leinonen, Rasko; Apweiler, Rolf; Hermjakob, Henning

2007-01-01

Background Each major protein database uses its own conventions when assigning protein identifiers. Resolving the various, potentially unstable, identifiers that refer to identical proteins is a major challenge. This is a common problem when attempting to unify datasets that have been annotated with proteins from multiple data sources or querying data providers with one flavour of protein identifiers when the source database uses another. Partial solutions for protein identifier mapping exist but they are limited to specific species or techniques and to a very small number of databases. As a result, we have not found a solution that is generic enough and broad enough in mapping scope to suit our needs. Results We have created the Protein Identifier Cross-Reference (PICR) service, a web application that provides interactive and programmatic (SOAP and REST) access to a mapping algorithm that uses the UniProt Archive (UniParc) as a data warehouse to offer protein cross-references based on 100% sequence identity to proteins from over 70 distinct source databases loaded into UniParc. Mappings can be limited by source database, taxonomic ID and activity status in the source database. Users can copy/paste or upload files containing protein identifiers or sequences in FASTA format to obtain mappings using the interactive interface. Search results can be viewed in simple or detailed HTML tables or downloaded as comma-separated values (CSV) or Microsoft Excel (XLS) files suitable for use in a local database or a spreadsheet. Alternatively, a SOAP interface is available to integrate PICR functionality in other applications, as is a lightweight REST interface. Conclusion We offer a publicly available service that can interactively map protein identifiers and protein sequences to the majority of commonly used protein databases. Programmatic access is available through a standards-compliant SOAP interface or a lightweight REST interface. 
The PICR interface, documentation and code examples are available at . PMID:17945017
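The mapping idea in the abstract above, cross-referencing identifiers by 100% sequence identity, can be sketched in a few lines of Python. This is a minimal illustration and not PICR's or UniParc's actual implementation: the database names, identifiers, sequences, and the function names `sequence_key`, `build_cross_references` and `map_identifier` are all hypothetical, and SHA-256 is used here only as a convenient exact-match key.

```python
import hashlib
from collections import defaultdict


def sequence_key(sequence: str) -> str:
    """Return a digest of the normalized sequence (whitespace removed, upper-cased)."""
    normalized = "".join(sequence.split()).upper()
    return hashlib.sha256(normalized.encode("ascii")).hexdigest()


def build_cross_references(records):
    """Group (database, identifier) pairs that share an identical sequence."""
    groups = defaultdict(set)
    for database, identifier, sequence in records:
        groups[sequence_key(sequence)].add((database, identifier))
    return groups


def map_identifier(records, database, identifier, target_databases=None):
    """Return identifiers from other entries whose sequence is 100% identical.

    target_databases optionally restricts the result to selected source
    databases, loosely mirroring the "limit by source database" option.
    """
    groups = build_cross_references(records)
    query = (database, identifier)
    for members in groups.values():
        if query in members:
            return sorted(
                (db, acc)
                for db, acc in members
                if (db, acc) != query
                and (target_databases is None or db in target_databases)
            )
    return []


# Hypothetical toy records: (source database, identifier, sequence).
RECORDS = [
    ("DB_A", "A0001", "MKTAYIAKQR"),
    ("DB_B", "B-17", "mktayiakqr"),   # same sequence, different letter case
    ("DB_C", "C_9", "MKTAYIAKQK"),    # differs in the last residue
]

if __name__ == "__main__":
    print(map_identifier(RECORDS, "DB_A", "A0001"))
```

Keying on a digest of the full sequence means that a single differing residue produces a different group, which is what "100% sequence identity" requires; anything looser (for example, isoform or fragment matching) would need a different approach.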
An Ambystoma mexicanum EST sequencing project: analysis of 17,352 expressed sequence tags from embryonic and regenerating blastema cDNA libraries

PubMed Central

Habermann, Bianca; Bebin, Anne-Gaelle; Herklotz, Stephan; Volkmer, Michael; Eckelt, Kay; Pehlke, Kerstin; Epperlein, Hans Henning; Schackert, Hans Konrad; Wiebe, Glenis; Tanaka, Elly M

2004-01-01

Background The ambystomatid salamander, Ambystoma mexicanum (axolotl), is an important model organism in evolutionary and regeneration research but relatively little sequence information has so far been available. This is a major limitation for molecular studies on caudate development, regeneration and evolution. To address this lack of sequence information we have generated an expressed sequence tag (EST) database for A. mexicanum. Results Two cDNA libraries, one made from stage 18-22 embryos and the other from day-6 regenerating tail blastemas, generated 17,352 sequences. From the sequenced ESTs, 6,377 contigs were assembled that probably represent 25% of the expressed genes in this organism. Sequence comparison revealed significant homology to entries in the NCBI non-redundant database. Further examination of this gene set revealed the presence of genes involved in important cell and developmental processes, including cell proliferation, cell differentiation and cell-cell communication. On the basis of these data, we have performed phylogenetic analysis of key cell-cycle regulators. Interestingly, while cell-cycle proteins such as the cyclin B family display expected evolutionary relationships, the cyclin-dependent kinase inhibitor 1 gene family shows an unusual evolutionary behavior among the amphibians. Conclusions Our analysis reveals the importance of a comprehensive sequence set from a representative of the Caudata and illustrates that the EST sequence database is a rich source for molecular, developmental and regeneration studies. To aid in data mining, the ESTs have been organized into an easily searchable database that is freely available online. PMID:15345051
The National NeuroAIDS Tissue Consortium (NNTC) Database: an integrated database for HIV-related studies.

PubMed

Cserhati, Matyas F; Pandey, Sanjit; Beaudoin, James J; Baccaglini, Lorena; Guda, Chittibabu; Fox, Howard S

2015-01-01

We herein present the National NeuroAIDS Tissue Consortium-Data Coordinating Center (NNTC-DCC) database, which is the only available database for neuroAIDS studies that contains data in an integrated, standardized form. This database has been created in conjunction with the NNTC, which provides human tissue and biofluid samples to individual researchers to conduct studies focused on neuroAIDS. The database contains experimental datasets from 1206 subjects for the following categories (which are further broken down into subcategories): gene expression, genotype, proteins, endo-exo-chemicals, morphometrics and other (miscellaneous) data. The database also contains a wide variety of downloadable data and metadata for 95 HIV-related studies covering 170 assays from 61 principal investigators. The data represent 76 tissue types, 25 measurement types, and 38 technology types, and reach a total of 33,017,407 data points. We used the ISA platform to create the database and develop a searchable web interface for querying the data. A gene search tool is also available, which searches for NCBI GEO datasets associated with selected genes. The database is manually curated with many user-friendly features, and is cross-linked to the NCBI, HUGO and PubMed databases. A free registration is required for qualified users to access the database. © The Author(s) 2015. Published by Oxford University Press.
Cellular responses during morphological transformation in Azospirillum brasilense and its flcA knockout mutant.

PubMed

Hou, Xingsheng; McMillan, Mary; Coumans, Joëlle V F; Poljak, Anne; Raftery, Mark J; Pereg, Lily

2014-01-01

FlcA is a response regulator controlling flocculation and the morphological transformation of Azospirillum cells from vegetative to cyst-like forms. To understand the cellular responses of Azospirillum to conditions that cause morphological transformation, proteins differentially expressed under flocculation conditions in A. brasilense Sp7 and its flcA knockout mutant were investigated. Comparison of 2-DE protein profiles of wild-type (Sp7) and a flcA deletion mutant (Sp7-flcAΔ) revealed a total of 33 differentially expressed 2-DE gel spots, with 22 of these spots confidently separated to allow protein identification. Analysis of these spots by liquid chromatography-tandem mass spectrometry (LC-MS/MS) and MASCOT database searching identified 48 proteins (≥10% emPAI in each spot). The functional characteristics of these proteins included carbon metabolism (beta-ketothiolase and citrate synthase), nitrogen metabolism (Glutamine synthetase and nitric oxide synthase), stress tolerance (superoxide dismutase, Alkyl hydroperoxidase and ATP-dependent Clp protease proteolytic subunit) and morphological transformation (transducer coupling protein). The observed differences between Sp7 wild-type and flcA- strains enhance our understanding of the morphological transformation process and help to explain previous phenotypical observations. This work is a step forward in connecting the Azospirillum phenome and genome.

Cellular Responses during Morphological Transformation in Azospirillum brasilense and Its flcA Knockout Mutant

PubMed Central

Coumans, Joëlle V. F.; Poljak, Anne; Raftery, Mark J.; Pereg, Lily

2014-01-01

FlcA is a response regulator controlling flocculation and the morphological transformation of Azospirillum cells from vegetative to cyst-like forms. To understand the cellular responses of Azospirillum to conditions that cause morphological transformation, proteins differentially expressed under flocculation conditions in A. brasilense Sp7 and its flcA knockout mutant were investigated. Comparison of 2-DE protein profiles of wild-type (Sp7) and a flcA deletion mutant (Sp7-flcAΔ) revealed a total of 33 differentially expressed 2-DE gel spots, with 22 of these spots confidently separated to allow protein identification. Analysis of these spots by liquid chromatography-tandem mass spectrometry (LC-MS/MS) and MASCOT database searching identified 48 proteins (≥10% emPAI in each spot). The functional characteristics of these proteins included carbon metabolism (beta-ketothiolase and citrate synthase), nitrogen metabolism (Glutamine synthetase and nitric oxide synthase), stress tolerance (superoxide dismutase, Alkyl hydroperoxidase and ATP-dependent Clp protease proteolytic subunit) and morphological transformation (transducer coupling protein). The observed differences between Sp7 wild-type and flcA− strains enhance our understanding of the morphological transformation process and help to explain previous phenotypical observations. This work is a step forward in connecting the Azospirillum phenome and genome. PMID:25502569
Proteomic analysis of 'Zaosu' pear (Pyrus bretschneideri Rehd.) and its early-maturing bud sport.

PubMed

Liu, Xueting; Zhai, Rui; Feng, Wenting; Zhang, Shiwei; Wang, Zhigang; Qiu, Zonghao; Zhang, Junke; Ma, Fengwang; Xu, Lingfei

2014-07-01

Maturation of fruits involves a series of physiological, biochemical, and organoleptic changes that eventually make fleshy fruits attractive, palatable, and nutritional. To understand the maturation mechanism of the early-maturing bud sport of 'Zaosu' pear, we analyzed the differences in proteome expression between the two pears at different stages of maturity using two-dimensional electrophoresis (2-DE) combined with matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS). Seventy-five differentially expressed protein spots (p<0.05) were obtained between 'Zaosu' pear and its early-maturing bud sport, but only sixty-eight were conclusively identified in the NCBI and UniProt databases. The majority of proteins were linked to metabolism, energy, stress response/defense and cell structure. Additionally, our data confirmed an increase of proteins related to cell-wall modification, oxidative stress and pentose phosphate metabolism and a decrease of proteins related to photosynthesis and glycolysis during the development of both pears, but all these proteins increased or decreased faster in the early-maturing bud sport. This comparative analysis showed that these proteins were closely associated with maturation and could provide more detailed characteristics of the maturation process of both pears. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Proteomic profiling of liver from Atlantic salmon (Salmo salar) fed genetically modified soy compared to the near-isogenic non-GM line.

PubMed

Sissener, Nini H; Martin, Samuel A M; Cash, Phillip; Hevrøy, Ernst M; Sanden, Monica; Hemre, Gro-Ingunn

2010-06-01

The aim of this study was to investigate potential differences in liver protein expression of Atlantic salmon fed genetically modified (GM) Roundup Ready soy at a high inclusion level (25% inclusion, constituting 21% of crude protein in the diet) for 7 months or a compositionally similar non-GM diet. The liver was selected as the target organ due to its importance in general metabolism, and 2D gel electrophoresis was used as a screening tool. Samples from 12 individual fish from each diet group were evaluated. Of a total of 781 analysed protein spots, only 36 were significantly different by ANOVA (p < 0.05) in abundance between the diet groups. All these spots had low fold differences (1.2-1.6) and a high false discovery rate (q = 0.44), indicating minor differences in liver protein synthesis between fish fed GM and non-GM soy. Four protein spots were analyzed by liquid chromatography tandem mass spectrometry and identified using a combination of online searches in NCBI and searches in an in-house database containing salmonid expressed sequence tags and contigs. Follow-up on these proteins by real-time polymerase chain reaction did not identify differences at the transcriptional level.
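The relationship between the number of tests, the p-value cutoff and a high false discovery rate in the abstract above can be made concrete with a short Python sketch. With 781 spots tested at p < 0.05, roughly 39 spots would be expected to pass by chance alone, which is close to the 36 that were observed. The abstract does not say how its q-value was computed; the Benjamini-Hochberg procedure shown here is an assumption chosen for illustration, and the function names and the toy p-values are hypothetical.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of tests passing the cutoff if every null hypothesis is true."""
    return n_tests * alpha


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic q-values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


if __name__ == "__main__":
    # Figures taken from the abstract: 781 spots, cutoff p < 0.05.
    print(round(expected_false_positives(781, 0.05), 2))
    # Hypothetical toy p-values, not data from the study.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

When the count of nominally significant tests is about what chance alone would produce, adjusted q-values stay high, which matches the abstract's reading that the diet groups differ only slightly.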
The spectrum of low molecular weight alpha-amylase/protease inhibitor genes expressed in the US bread wheat cultivar Butte 86

PubMed Central

2011-01-01

Background Wheat grains accumulate a variety of low molecular weight proteins that are inhibitors of alpha-amylases and proteases and play an important protective role in the grain. These proteins have more balanced amino acid compositions than the major wheat gluten proteins and contribute important reserves for both seedling growth and human nutrition. The alpha-amylase/protease inhibitors also are of interest because they cause IgE-mediated occupational and food allergies and thereby impact human health. Results The complement of genes encoding alpha-amylase/protease inhibitors expressed in the US bread wheat Butte 86 was characterized by analysis of expressed sequence tags (ESTs). Coding sequences for 19 distinct proteins were identified. These included two monomeric (WMAI), four dimeric (WDAI), and six tetrameric (WTAI) inhibitors of exogenous alpha-amylases, two inhibitors of endogenous alpha-amylases (WASI), four putative trypsin inhibitors (CMx and WTI), and one putative chymotrypsin inhibitor (WCI). A number of the encoded proteins were identical or very similar to proteins in the NCBI database. Sequences not reported previously included variants of WTAI-CM3, three CMx inhibitors and WTI. Within the WDAI group, two different genes encoded the same mature protein. Based on numbers of ESTs, transcripts for WTAI-CM3 Bu-1, WMAI Bu-1 and WTAI-CM16 Bu-1 were most abundant in Butte 86 developing grain. Coding sequences for 16 of the inhibitors were unequivocally associated with specific proteins identified by tandem mass spectrometry (MS/MS) in a previous proteomic analysis of milled white flour from Butte 86. Proteins corresponding to WDAI Bu-1/Bu-2, WMAI Bu-1 and the WTAI subunits CM2 Bu-1, CM3 Bu-1 and CM16 Bu-1 were accumulated to the highest levels in flour. 
Conclusions Information on the spectrum of alpha-amylase/protease inhibitor genes and proteins expressed in a single wheat cultivar is central to understanding the importance of these proteins in both plant defense mechanisms and human allergies and facilitates both breeding and biotechnology approaches for manipulating the composition of these proteins in plants. PMID:21774824
UniPROBE, update 2011: expanded content and search tools in the online database of protein-binding microarray data on protein-DNA interactions.

PubMed

Robasky, Kimberly; Bulyk, Martha L

2011-01-01

The Universal PBM Resource for Oligonucleotide-Binding Evaluation (UniPROBE) database is a centralized repository of information on the DNA-binding preferences of proteins as determined by universal protein-binding microarray (PBM) technology. Each entry for a protein (or protein complex) in UniPROBE provides the quantitative preferences for all possible nucleotide sequence variants ('words') of length k ('k-mers'), as well as position weight matrix (PWM) and graphical sequence logo representations of the k-mer data. In this update, we describe >130% expansion of the database content, incorporation of a protein BLAST (blastp) tool for finding protein sequence matches in UniPROBE, the introduction of UniPROBE accession numbers and additional database enhancements. The UniPROBE database is available at http://uniprobe.org.
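The two representations named in the abstract above, the full set of k-mers and a position weight matrix (PWM), can be illustrated with a small Python sketch. It only shows what the objects look like: it enumerates all 4^k DNA words of length k and builds a frequency-based PWM from a handful of aligned sites. It is not the procedure UniPROBE uses to derive PWMs from PBM data, and the function names, the pseudocount parameter and the example sites are hypothetical.

```python
from itertools import product

ALPHABET = "ACGT"


def all_kmers(k):
    """Enumerate every DNA word of length k (4**k words, in lexicographic order)."""
    return ["".join(letters) for letters in product(ALPHABET, repeat=k)]


def position_weight_matrix(sites, pseudocount=1.0):
    """Build per-position base frequencies from equal-length aligned sites.

    Returns a list with one dict per position, mapping base -> frequency.
    """
    if not sites:
        raise ValueError("at least one site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for position in range(length):
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[position]] += 1
        total = sum(counts.values())
        matrix.append({base: counts[base] / total for base in ALPHABET})
    return matrix


if __name__ == "__main__":
    print(len(all_kmers(8)))  # number of distinct 8-mers
    # Hypothetical aligned binding sites, not taken from UniPROBE.
    for column in position_weight_matrix(["ACGT", "ACGA", "ACTT"]):
        print(column)
```

The pseudocount keeps every frequency above zero, which matters if the matrix is later converted to log-odds scores.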
The transcriptome of Lutzomyia longipalpis (Diptera: Psychodidae) male reproductive organs.

PubMed

Azevedo, Renata V D M; Dias, Denise B S; Bretãs, Jorge A C; Mazzoni, Camila J; Souza, Nataly A; Albano, Rodolpho M; Wagner, Glauber; Davila, Alberto M R; Peixoto, Alexandre A

2012-01-01

It has been suggested that genes involved in the reproductive biology of insect disease vectors are potential targets for future alternative methods of control. Little is known about the molecular biology of reproduction in phlebotomine sand flies and there is no information available concerning genes that are expressed in male reproductive organs of Lutzomyia longipalpis, the main vector of American visceral leishmaniasis and a species complex. We generated 2678 high quality ESTs ("Expressed Sequence Tags") of L. longipalpis male reproductive organs that were grouped in 1391 non-redundant sequences (1136 singlets and 255 clusters). BLAST analysis revealed that only 57% of these sequences share similarity with a L. longipalpis female EST database. Although no more than 36% of the non-redundant sequences showed similarity to protein sequences deposited in databases, more than half of them presented the best-match hits with mosquito genes. Gene ontology analysis identified subsets of genes involved in biological processes such as protein biosynthesis and DNA replication, which are probably associated with spermatogenesis. A number of non-redundant sequences were also identified as putative male reproductive gland proteins (mRGPs), also known as male accessory gland protein genes (Acps). The transcriptome analysis of L. longipalpis male reproductive organs is one step further in the study of the molecular basis of the reproductive biology of this important species complex. It has allowed the identification of genes potentially involved in spermatogenesis as well as putative mRGPs sequences, which have been studied in many insect species because of their effects on female post-mating behavior and physiology and their potential role in sexual selection and speciation. These data open a number of new avenues for further research in the molecular and evolutionary reproductive biology of sand flies.
The Transcriptome of Lutzomyia longipalpis (Diptera: Psychodidae) Male Reproductive Organs

PubMed Central

Bretãs, Jorge A. C.; Mazzoni, Camila J.; Souza, Nataly A.; Albano, Rodolpho M.; Wagner, Glauber; Davila, Alberto M. R.; Peixoto, Alexandre A.

2012-01-01

Background It has been suggested that genes involved in the reproductive biology of insect disease vectors are potential targets for future alternative methods of control. Little is known about the molecular biology of reproduction in phlebotomine sand flies and there is no information available concerning genes that are expressed in male reproductive organs of Lutzomyia longipalpis, the main vector of American visceral leishmaniasis and a species complex. Methods/Principal Findings We generated 2678 high quality ESTs (“Expressed Sequence Tags”) of L. longipalpis male reproductive organs that were grouped in 1391 non-redundant sequences (1136 singlets and 255 clusters). BLAST analysis revealed that only 57% of these sequences share similarity with a L. longipalpis female EST database. Although no more than 36% of the non-redundant sequences showed similarity to protein sequences deposited in databases, more than half of them presented the best-match hits with mosquito genes. Gene ontology analysis identified subsets of genes involved in biological processes such as protein biosynthesis and DNA replication, which are probably associated with spermatogenesis. A number of non-redundant sequences were also identified as putative male reproductive gland proteins (mRGPs), also known as male accessory gland protein genes (Acps). Conclusions The transcriptome analysis of L. longipalpis male reproductive organs is one step further in the study of the molecular basis of the reproductive biology of this important species complex. It has allowed the identification of genes potentially involved in spermatogenesis as well as putative mRGPs sequences, which have been studied in many insect species because of their effects on female post-mating behavior and physiology and their potential role in sexual selection and speciation. These data open a number of new avenues for further research in the molecular and evolutionary reproductive biology of sand flies. PMID:22496818
Acclimation to different depths by the marine angiosperm Posidonia oceanica: transcriptomic and proteomic profiles

PubMed Central

Dattolo, Emanuela; Gu, Jenny; Bayer, Philipp E.; Mazzuca, Silvia; Serra, Ilia A.; Spadafora, Antonia; Bernardo, Letizia; Natali, Lucia; Cavallini, Andrea; Procaccini, Gabriele

2013-01-01

For seagrasses, seasonal and daily variations in light and temperature represent the main factors driving their distribution along the bathymetric cline. Changes in these environmental factors, due to climatic and anthropogenic effects, can compromise their survival. In a framework of conservation and restoration, it becomes crucial to improve our knowledge about the physiological plasticity of seagrass species along environmental gradients. Here, we aimed to identify differences in transcriptomic and proteomic profiles involved in acclimation along the depth gradient in the seagrass Posidonia oceanica, and to improve the available molecular resources in this species, which is an important requisite for the application of eco-genomic approaches. To do that, from plants growing in shallow (−5 m) and deep (−25 m) portions of a single meadow, (i) we generated two reciprocal Expressed Sequence Tag (EST) libraries using a Suppressive Subtractive Hybridization (SSH) approach, to obtain depth-specific transcriptional profiles, and (ii) we identified differentially expressed proteins, using the highly innovative USIS mass spectrometry methodology, coupled with 1D-SDS electrophoresis and a label-free approach. Mass spectra were searched in the open source Global Proteome Machine (GPM) engine against plant databases and with the X!Tandem algorithm against a local database. Transcriptional analysis showed both quantitative and qualitative differences between depths. EST libraries had only 3% of transcripts in common. A total of 315 peptides belonging to 64 proteins were identified by mass spectrometry. ATP synthase subunits were among the most abundant proteins in both conditions. Both approaches identified genes and proteins in pathways related to energy metabolism, transport and genetic information processing, which appear to be the most involved in depth acclimation in P. oceanica. Their putative roles in acclimation to depth were discussed. PMID:23785376
Acclimation to different depths by the marine angiosperm Posidonia oceanica: transcriptomic and proteomic profiles.

PubMed

Dattolo, Emanuela; Gu, Jenny; Bayer, Philipp E; Mazzuca, Silvia; Serra, Ilia A; Spadafora, Antonia; Bernardo, Letizia; Natali, Lucia; Cavallini, Andrea; Procaccini, Gabriele

2013-01-01

For seagrasses, seasonal and daily variations in light and temperature represent the main factors driving their distribution along the bathymetric cline. Changes in these environmental factors, due to climatic and anthropogenic effects, can compromise their survival. In a framework of conservation and restoration, it becomes crucial to improve our knowledge about the physiological plasticity of seagrass species along environmental gradients. Here, we aimed to identify differences in transcriptomic and proteomic profiles involved in acclimation along the depth gradient in the seagrass Posidonia oceanica, and to improve the available molecular resources in this species, which is an important requisite for the application of eco-genomic approaches. To do that, from plants growing in shallow (-5 m) and deep (-25 m) portions of a single meadow, (i) we generated two reciprocal Expressed Sequence Tag (EST) libraries using a Suppressive Subtractive Hybridization (SSH) approach, to obtain depth-specific transcriptional profiles, and (ii) we identified differentially expressed proteins, using the highly innovative USIS mass spectrometry methodology, coupled with 1D-SDS electrophoresis and a label-free approach. Mass spectra were searched in the open source Global Proteome Machine (GPM) engine against plant databases and with the X!Tandem algorithm against a local database. Transcriptional analysis showed both quantitative and qualitative differences between depths. EST libraries had only 3% of transcripts in common. A total of 315 peptides belonging to 64 proteins were identified by mass spectrometry. ATP synthase subunits were among the most abundant proteins in both conditions. Both approaches identified genes and proteins in pathways related to energy metabolism, transport and genetic information processing, which appear to be the most involved in depth acclimation in P. oceanica. Their putative roles in acclimation to depth were discussed.
Genomic atlas of the human plasma proteome.

PubMed

Sun, Benjamin B; Maranville, Joseph C; Peters, James E; Stacey, David; Staley, James R; Blackshaw, James; Burgess, Stephen; Jiang, Tao; Paige, Ellie; Surendran, Praveen; Oliver-Williams, Clare; Kamat, Mihir A; Prins, Bram P; Wilcox, Sheri K; Zimmerman, Erik S; Chi, An; Bansal, Narinder; Spain, Sarah L; Wood, Angela M; Morrell, Nicholas W; Bradley, John R; Janjic, Nebojsa; Roberts, David J; Ouwehand, Willem H; Todd, John A; Soranzo, Nicole; Suhre, Karsten; Paul, Dirk S; Fox, Caroline S; Plenge, Robert M; Danesh, John; Runz, Heiko; Butterworth, Adam S

2018-06-01

Although plasma proteins have important roles in biological processes and are the direct targets of many drugs, the genetic factors that control inter-individual variation in plasma protein levels are not well understood. Here we characterize the genetic architecture of the human plasma proteome in healthy blood donors from the INTERVAL study. We identify 1,927 genetic associations with 1,478 proteins, a fourfold increase on existing knowledge, including trans associations for 1,104 proteins. To understand the consequences of perturbations in plasma protein levels, we apply an integrated approach that links genetic variation with biological pathway, disease, and drug databases. We show that protein quantitative trait loci overlap with gene expression quantitative trait loci, as well as with disease-associated loci, and find evidence that protein biomarkers have causal roles in disease using Mendelian randomization analysis. By linking genetic factors to diseases via specific proteins, our analyses highlight potential therapeutic targets, opportunities for matching existing drugs with new disease indications, and potential safety concerns for drugs under development.
Immunogenic Salivary Proteins of Triatoma infestans: Development of a Recombinant Antigen for the Detection of Low-Level Infestation of Triatomines

PubMed Central

Schwarz, Alexandra; Helling, Stefan; Collin, Nicolas; Teixeira, Clarissa R.; Medrano-Mercado, Nora; Hume, Jen C. C.; Assumpção, Teresa C.; Marcus, Katrin; Stephan, Christian; Meyer, Helmut E.; Ribeiro, José M. C.; Billingsley, Peter F.; Valenzuela, Jesus G.; Sternberg, Jeremy M.; Schaub, Günter A.

2009-01-01

Background Triatomines are vectors of Trypanosoma cruzi, the etiological agent of Chagas disease in Latin America. The most effective vector, Triatoma infestans, has been controlled successfully in much of Latin America using insecticide spraying. Though rarely undertaken, surveillance programs are necessary in order to identify new infestations and estimate the intensity of triatomine bug infestations in domestic and peridomestic habitats. Since hosts exposed to triatomines develop immune responses to salivary antigens, these responses can be evaluated for their usefulness as epidemiological markers to detect infestations of T. infestans. Methodology/Principal Findings T. infestans salivary proteins were separated by 2D-gel electrophoresis and tested for their immunogenicity by Western blotting using sera from chickens and guinea pigs experimentally exposed to T. infestans. From five highly immunogenic protein spots, eight salivary proteins were identified by nano liquid chromatography-electrospray ionization-tandem mass spectrometry (nanoLC-ESI-MS/MS) and comparison to the protein sequences of the National Center for Biotechnology Information (NCBI) database and expressed sequence tags of a unidirectionally cloned salivary gland cDNA library from T. infestans combined with the NCBI yeast protein sub-database. The 14.6 kDa salivary protein [gi|149689094] was produced as recombinant protein (rTiSP14.6) in a mammalian cell expression system and recognized by all animal sera. The specificity of rTiSP14.6 was confirmed by the lack of reactivity to anti-mosquito and anti-sand fly saliva antibodies. However, rTiSP14.6 was recognized by sera from chickens exposed to four other triatomine species, Triatoma brasiliensis, T. sordida, Rhodnius prolixus, and Panstrongylus megistus and by sera of chickens from an endemic area of T. infestans and Chagas disease in Bolivia. 
Conclusions/Significance The recombinant rTiSP14.6 is a suitable and promising epidemiological marker for detecting the presence of small numbers of different species of triatomines and could be developed for use as a new tool in surveillance programs, especially to corroborate vector elimination in Chagas disease vector control campaigns. PMID:19841746
Unbiased compound screening with a reporter gene assay highlights the role of p13 in the cardiac cellular stress response.

PubMed

Inoue, Naoki; Hirouchi, Taisei; Kasai, Atsushi; Higashi, Shintaro; Hiraki, Natsumi; Tanaka, Shota; Nakazawa, Takanobu; Nunomura, Kazuto; Lin, Bangzhong; Omori, Akiko; Hayata-Takano, Atsuko; Kim, Yoon-Jeong; Doi, Takefumi; Baba, Akemichi; Hashimoto, Hitoshi; Shintani, Norihito

2018-01-08

We recently showed that a 13-kDa protein (p13), the homolog protein of formation of mitochondrial complex V assembly factor 1 in yeast, acts as a potential protective factor in pancreatic islets under diabetes. Here, we aimed to identify known compounds regulating p13 mRNA expression to obtain therapeutic insight into the cellular stress response. A luciferase reporter system was developed using the putative promoter region of the human p13 gene. Overexpression of peroxisome proliferator-activated receptor gamma coactivator 1α, a master player regulating mitochondrial metabolism, increased both reporter activity and p13 expression. Following unbiased screening with 2320 known compounds in HeLa cells, 12 pharmacological agents (including 8 cardiotonics and 2 anthracyclines) that elicited >2-fold changes in p13 mRNA expression were identified. Among them, four cardiac glycosides decreased p13 expression and concomitantly elevated cellular oxidative stress. Additional database analyses showed highest p13 expression in heart, with typically decreased expression in cardiac disease. Accordingly, our results illustrate the usefulness of unbiased compound screening as a method for identifying novel functional roles of unfamiliar genes. Our findings also highlight the importance of p13 in the cellular stress response in heart. Copyright © 2017. Published by Elsevier Inc.
Identification of potential crucial genes associated with steroid-induced necrosis of femoral head based on gene expression profile.

PubMed

Lin, Zhe; Lin, Yongsheng

2017-09-05

The aim of this study was to explore potential crucial genes associated with steroid-induced necrosis of the femoral head (SINFH) and to provide valid biological information for further investigation of SINFH. The gene expression profile GSE26316, generated from 3 SINFH rat samples and 3 normal rat samples, was downloaded from the Gene Expression Omnibus (GEO) database. The differentially expressed genes (DEGs) were identified using the LIMMA package. After functional enrichment analyses of DEGs, protein-protein interaction (PPI) network and sub-PPI network analyses were conducted based on the STRING database and Cytoscape. In total, 59 up-regulated DEGs and 156 down-regulated DEGs were identified. The up-regulated DEGs were mainly involved in immunity-related functions (e.g. Fcer1A and Il7R), and the down-regulated DEGs were mainly enriched in muscle system process (e.g. Tnni2, Mylpf and Myl1). The PPI network of DEGs consisted of 123 nodes and 300 interactions. Tnni2, Mylpf, and Myl1 were the top 3 genes based on both subgraph centrality and degree centrality evaluation. These three genes interacted with each other in the network. Furthermore, the significant network module was composed of 22 down-regulated genes (e.g. Tnni2, Mylpf and Myl1). These genes were mainly enriched in functions such as muscle system process. The DEGs related to the regulation of immune system process (e.g. Fcer1A and Il7R) and the DEGs correlated with muscle system process (e.g. Tnni2, Mylpf and Myl1) may be closely associated with the progression of SINFH, which still needs to be confirmed experimentally. Copyright © 2017 Elsevier B.V. All rights reserved.
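Degree centrality, one of the two measures the abstract above uses to rank genes in the PPI network, is simple enough to sketch directly in Python. The example below is a toy illustration only: the three mutual interactions among Tnni2, Mylpf and Myl1 follow the abstract's statement that these genes interact with each other, while the extra node `NodeX`, its edge, and the function name `degree_centrality` are hypothetical. Normalizing by n - 1 is one common convention and is assumed here; the study's own tooling may differ.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return the normalized degree centrality for an undirected network.

    Each node's score is its number of distinct neighbors divided by (n - 1),
    where n is the number of nodes. Self-loops are ignored.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adjacent) / (n - 1) for node, adjacent in neighbors.items()}


# Toy edge list; the NodeX edge is hypothetical and not from the study.
EDGES = [
    ("Tnni2", "Mylpf"),
    ("Tnni2", "Myl1"),
    ("Mylpf", "Myl1"),
    ("Tnni2", "NodeX"),
]

if __name__ == "__main__":
    scores = degree_centrality(EDGES)
    for gene, score in sorted(scores.items(), key=lambda item: -item[1]):
        print(gene, round(score, 3))
```

Subgraph centrality, the other measure mentioned, weights closed walks of all lengths and needs a matrix exponential, so it is left out of this sketch.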
Transcriptome profiling identified differentially expressed genes and pathways associated with tamoxifen resistance in human breast cancer

PubMed Central

Men, Xin; Ma, Jun; Wu, Tong; Pu, Junyi; Wen, Shaojia; Shen, Jianfeng; Wang, Xun; Wang, Yamin; Chen, Chao; Dai, Penggao

2018-01-01

Tamoxifen (TAM) resistance is an important clinical problem in the treatment of breast cancer. To identify the mechanism of TAM resistance in estrogen receptor (ER)-positive breast cancer, we screened the transcriptome using RNA-seq and compared the gene expression profiles of the MCF-7 mammary carcinoma cell line and the TAM-resistant cell line TAMR/MCF-7. In total, 52 significantly differentially expressed genes (DEGs) were identified, including SLIT2, ROBO, LHX, KLF, VEGFC, BAMBI, LAMA1, FLT4, PNMT, DHRS2, MAOA and ALDH. The DEGs were annotated in the GO, COG and KEGG databases. Annotation of the function of the DEGs in the KEGG database revealed the top three pathways enriched with the most DEGs: pathways in cancer, the PI3K-AKT pathway, and focal adhesion. We then compared the gene expression profiles of clinical progressive disease (PD) and complete response (CR) cases from The Cancer Genome Atlas (TCGA). Ten common DEGs were identified by combining the clinical and cellular analysis results. A protein-protein interaction network was applied to analyze the association of the ER signal pathway with the 10 DEGs. Three significant genes (GFRA3, NPY1R and PTPRN2) were closely related to the ER-related pathway. These significant DEGs regulated many biological activities such as cell proliferation and survival, motility and migration, and tumor cell invasion. The interactions between these DEGs and the drug resistance phenomenon need to be elucidated at a functional level in further studies. Based on our findings, we believe that these DEGs could be therapeutic targets, which can be explored to develop new treatment options. PMID:29423105
MIPS: a database for protein sequences, homology data and yeast genome information.

PubMed Central

Mewes, H W; Albermann, K; Heumann, K; Liebl, S; Pfeiffer, F

1997-01-01

The MIPS group (Martinsried Institute for Protein Sequences) at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, collects, processes and distributes protein sequence data within the framework of the tripartite association of the PIR-International Protein Sequence Database. MIPS contributes nearly 50% of the data input to the PIR-International Protein Sequence Database. The database is distributed on CD-ROM together with PATCHX, an exhaustive supplement of unique, unverified protein sequences from external sources compiled by MIPS. Through its WWW server (http://www.mips.biochem.mpg.de/) MIPS permits internet access to sequence databases, homology data and to yeast genome information. (i) Sequence similarity results from the FASTA program are stored in the FASTA database for all proteins from PIR-International and PATCHX. The database is dynamically maintained and permits instant access to FASTA results. (ii) Starting with FASTA database queries, proteins have been classified into families and superfamilies (PROT-FAM). (iii) The HPT (hashed position tree) data structure developed at MIPS is a new approach for rapid sequence and pattern searching. (iv) MIPS provides access to the sequence and annotation of the complete yeast genome, the functional classification of yeast genes (FunCat) and its graphical display, the 'Genome Browser'. A CD-ROM based on the JAVA programming language providing dynamic interactive access to the yeast genome and the related protein sequences has been compiled and is available on request. PMID:9016498
TRANSFAC: an integrated system for gene expression regulation.

PubMed

Wingender, E; Chen, X; Hehl, R; Karas, H; Liebich, I; Matys, V; Meinhardt, T; Prüss, M; Reuter, I; Schacherer, F

2000-01-01

TRANSFAC is a database on transcription factors, their genomic binding sites and DNA-binding profiles (http://transfac.gbf.de/TRANSFAC/). Its content has been enhanced, in particular by information about training sequences used for the construction of nucleotide matrices as well as by data on plant sites and factors. Moreover, TRANSFAC has been extended by two new modules: PathoDB provides data on pathologically relevant mutations in regulatory regions and transcription factor genes, whereas S/MARt DB compiles features of scaffold/matrix attached regions (S/MARs) and the proteins binding to them. Additionally, the databases TRANSPATH, about signal transduction, and CYTOMER, about organs and cell types, have been extended and are increasingly integrated with the TRANSFAC data sources.
NPIDB: Nucleic acid-Protein Interaction DataBase.

PubMed

Kirsanov, Dmitry D; Zanegina, Olga N; Aksianov, Evgeniy A; Spirin, Sergei A; Karyagina, Anna S; Alexeevski, Andrei V

2013-01-01

The Nucleic acid-Protein Interaction DataBase (http://npidb.belozersky.msu.ru/) contains information derived from structures of DNA-protein and RNA-protein complexes extracted from the Protein Data Bank (3846 complexes in October 2012). It provides a web interface and a set of tools for extracting biologically meaningful characteristics of nucleoprotein complexes. The content of the database is updated weekly. The current version of the Nucleic acid-Protein Interaction DataBase is an upgrade of the version published in 2007. The improvements include a new web interface, new tools for calculation of intermolecular interactions, a classification of SCOP families that contains DNA-binding protein domains and data on conserved water molecules on the DNA-protein interface.
The PMDB Protein Model Database

PubMed Central

Castrignanò, Tiziana; De Meo, Paolo D'Onorio; Cozzetto, Domenico; Talamo, Ivano Giuseppe; Tramontano, Anna

2006-01-01

The Protein Model Database (PMDB) is a public resource aimed at storing manually built 3D models of proteins. The database is designed to provide access to models published in the scientific literature, together with validating experimental data. It is a relational database and it currently contains >74 000 models for ∼240 proteins. The system is accessible at and allows predictors to submit models along with related supporting evidence and users to download them through a simple and intuitive interface. Users can navigate in the database and retrieve models referring to the same target protein or to different regions of the same protein. Each model is assigned a unique identifier that allows interested users to directly access the data. PMID:16381873
PDB_TM: selection and membrane localization of transmembrane proteins in the protein data bank.

PubMed

Tusnády, Gábor E; Dosztányi, Zsuzsanna; Simon, István

2005-01-01

PDB_TM is a database for transmembrane proteins with known structures. It aims to collect all transmembrane proteins that are deposited in the protein structure database (PDB) and to determine their membrane-spanning regions. These assignments are based on the TMDET algorithm, which uses only structural information to locate the most likely position of the lipid bilayer and to distinguish between transmembrane and globular proteins. This algorithm was applied to all PDB entries and the results were collected in the PDB_TM database. By using the TMDET algorithm, the PDB_TM database can be automatically updated every week, keeping it synchronized with the latest PDB updates. The PDB_TM database is available at http://www.enzim.hu/PDB_TM.
Mango (Mangifera indica L.) cv. Kent fruit mesocarp de novo transcriptome assembly identifies gene families important for ripening

PubMed Central

Dautt-Castro, Mitzuko; Ochoa-Leyva, Adrian; Contreras-Vergara, Carmen A.; Pacheco-Sanchez, Magda A.; Casas-Flores, Sergio; Sanchez-Flores, Alejandro; Kuhn, David N.; Islas-Osuna, Maria A.

2015-01-01

Fruit ripening is a physiological and biochemical process genetically programmed to regulate fruit quality parameters like firmness, flavor, odor and color, as well as production of ethylene in climacteric fruit. In this study, a transcriptomic analysis of mango (Mangifera indica L.) mesocarp cv. “Kent” was done to identify key genes associated with fruit ripening. Using the Illumina sequencing platform, 67,682,269 clean reads and a transcriptome of 4.8 Gb were obtained. A total of 33,142 coding sequences were predicted and, after functional annotation, 25,154 protein sequences were assigned a product according to the Swiss-Prot database and 32,560 according to the non-redundant database. Differential expression analysis identified 2,306 genes with significant differences in expression between mature-green and ripe mango [1,178 up-regulated and 1,128 down-regulated (FDR ≤ 0.05)]. The expression of 10 genes evaluated by both qRT-PCR and RNA-seq data was highly correlated (R = 0.97), validating the differential expression data from RNA-seq alone. Gene Ontology enrichment analysis showed significantly represented terms associated with fruit ripening, such as “cell wall,” “carbohydrate catabolic process” and “starch and sucrose metabolic process,” among others. Mango genes were assigned to 327 metabolic pathways according to the Kyoto Encyclopedia of Genes and Genomes database, among them those involved in fruit ripening such as plant hormone signal transduction, starch and sucrose metabolism, galactose metabolism, terpenoid backbone, and carotenoid biosynthesis. This study provides a mango transcriptome that will be very helpful to identify genes for expression studies in early and late flowering mangos during fruit ripening. PMID:25741352
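The abstract above reports up- and down-regulated genes at FDR ≤ 0.05 but does not name the correction procedure. As a minimal sketch, assuming the widely used Benjamini-Hochberg adjustment (an assumption, not something the record states), the thresholding step could look like the Python below. The gene names, fold changes and p-values are invented for illustration, and both function names are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_degs(records, alpha=0.05):
    """Split (gene, log2_fold_change, p_value) records into up/down lists."""
    q = benjamini_hochberg([p for _, _, p in records])
    up = [g for (g, lfc, _), qi in zip(records, q) if qi <= alpha and lfc > 0]
    down = [g for (g, lfc, _), qi in zip(records, q) if qi <= alpha and lfc < 0]
    return up, down


# Invented example values; not data from the mango study.
toy_records = [
    ("geneA", 2.1, 0.001),
    ("geneB", -1.7, 0.008),
    ("geneC", 0.9, 0.039),
    ("geneD", -0.4, 0.041),
    ("geneE", 0.1, 0.6),
]

up, down = call_degs(toy_records)
print(up, down)
```

In the toy data, geneC and geneD have raw p-values below 0.05 but adjusted values just above it, which is the case the FDR threshold is meant to filter out.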

Transcriptome Analysis of the Differentially Expressed Genes in the Male and Female Shrub Willows (Salix suchowensis)

PubMed Central

Liu, Jingjing; Yin, Tongming; Ye, Ning; Chen, Yingnan; Yin, Tingting; Liu, Min; Hassani, Danial

2013-01-01

Background: The dioecious system is relatively rare in plants. Shrub willow is an annual flowering dioecious woody plant, and possesses many characteristics that make it a good model for tracking the missing pieces of sex determination evolution. To gain a global view of the genes differentially expressed in male and female shrub willows and to develop a database for further studies, we performed large-scale transcriptome sequencing of flower buds collected separately from the two sexes. Results: In total, 1,201,931 high quality reads were obtained, with an average length of 389 bp and a total length of 467.96 Mb. The ESTs were assembled into 29,048 contigs and 132,709 singletons. These unigenes were further functionally annotated by comparing their sequences to different protein and functional domain databases and were assigned Gene Ontology (GO) terms. A biochemical pathway database containing 291 predicted pathways was also created based on the annotations of the unigenes. Digital expression analysis identified 806 differentially expressed genes between the male and female flower buds. Of these, 33 were located on the incipient sex chromosome of Salicaceae, among which 12 genes might be involved in plant sex determination. These genes warrant special attention in future studies. Conclusions: In this study, a large number of EST sequences were generated from the flower buds of a male and a female shrub willow. We also reported the differentially expressed genes between the two sex-type flowers. This work provides valuable information and sequence resources for uncovering the sex determining genes and for future functional genomics analysis of Salicaceae spp. PMID:23560075
Identification and lateral membrane localization of cyclin M3, likely to be involved in renal Mg2+ handling in seawater fish

PubMed Central

Islam, Zinia; Hayashi, Naoko; Inoue, Hana; Umezawa, Takahiro; Kimura, Yuuri; Doi, Hiroyuki; Romero, Michael F.; Hirose, Shigehisa

2014-01-01

The kidney of marine teleosts is the major site of Mg2+ excretion and produces urine with a high Mg2+ concentration. However, the transporters involved in Mg2+ excretion are poorly understood. The cyclin M (Cnnm; also known as ancient conserved domain protein) family comprises membrane proteins homologous to the bacterial Mg2+ and Co2+ efflux protein, CorC. To understand the molecular mechanism of Mg2+ homeostasis in marine teleosts, we analyzed the expression of the Cnnm family genes in the seawater (SW) pufferfish, torafugu (Takifugu rubripes), and the closely related euryhaline species, mefugu (Takifugu obscurus). Database mining and phylogenetic analysis indicated that the Takifugu genome contains six members of the Cnnm family: two orthologs of Cnnm1, one of Cnnm2, one of Cnnm3, and two of Cnnm4. RT-PCR analyses indicated that Cnnm2, Cnnm3, and Cnnm4a are expressed in the kidney, whereas other members are mainly expressed in the brain. Renal expression of Cnnm3 was upregulated in SW mefugu, whereas renal expression of Cnnm2 was upregulated in freshwater (FW) mefugu. No significant difference was observed in renal expression of Cnnm4a between SW and FW mefugu. In situ hybridization and immunohistochemical analyses of the SW mefugu kidney revealed that Cnnm3 is expressed in the proximal tubule, and its product localizes to the lateral membrane. When Cnnm3 was expressed in Xenopus laevis oocytes, whole cellular Mg2+ content and free intracellular Mg2+ activity significantly decreased. These results suggest that Cnnm3 is involved in body fluid Mg2+ homeostasis in marine teleosts. PMID:24965791
Novel APC gene mutations associated with protein alteration in diffuse type gastric cancer.

PubMed

Ghatak, Souvik; Chakraborty, Payel; Sarkar, Sandeep Roy; Chowdhury, Biswajit; Bhaumik, Arup; Kumar, Nachimuthu Senthil

2017-06-02

The role of the adenomatous polyposis coli (APC) gene in mitosis might be critical for regulation of genomic stability and chromosome segregation. APC gene mutations have been associated with colon cancer, and since gastric and colon tumors share some common genetic lesions, it is relevant to investigate the role of the APC tumor suppressor gene in gastric cancer. We screened for somatic mutations in exons 14 and 15 of the APC gene in 40 diffuse type gastric cancer samples. A rabbit polyclonal anti-APC antibody was used, which detects the wild-type APC protein and was recommended for detection of the respective protein in human tissues. Cell cycle analysis was done on tumor and adjacent normal tissue. APC immunoreactivity showed positive expression of the protein in stages I, II, III and negative expression in stages III and IV. Two novel deleterious variations (g.127576C > A, g.127583C > T) in the exon 14 sequence were found to generate stop codons (Y622* and Q625*) in the tumor samples. Due to the generation of a stop codon, the APC protein might be truncated and all the regulatory features could be lost, which has led to the down-regulation of protein expression. Our results indicate that aneuploidy might occur due to the codon 622 and 625 APC-driven gastric tumorigenesis, in agreement with our cell cycle analysis. The APC gene function in mitosis and chromosomal stability might be lost and G1 might be arrested with a high quantity of DNA in the S phase. Six missense somatic mutations in tumor samples were detected in exon 15 A-B, two of which showed pathological and disease causing effects based on SIFT, Polyphen2 and SNPs & GO score and were not previously reported in the literature or the public mutation databases. The two novel pathological somatic mutations (g.127576C > A, g.127583C > T) in exon 14 might be altering the protein expression, leading to development of gastric cancer in the study population.
Our study showed that mutations in the APC gene alter the protein expression and cell cycle regulation in diffuse type gastric adenocarcinoma.
Kinase Pathway Database: An Integrated Protein-Kinase and NLP-Based Protein-Interaction Resource

PubMed Central

Koike, Asako; Kobayashi, Yoshiyuki; Takagi, Toshihisa

2003-01-01

Protein kinases play a crucial role in the regulation of cellular functions. Various kinds of information about these molecules are important for understanding signaling pathways and organism characteristics. We have developed the Kinase Pathway Database, an integrated database involving major completely sequenced eukaryotes. It contains the classification of protein kinases and their functional conservation, ortholog tables among species, protein–protein, protein–gene, and protein–compound interaction data, domain information, and structural information. It also provides an automatic pathway graphic image interface. The protein, gene, and compound interactions are automatically extracted from abstracts for all genes and proteins by natural-language processing (NLP). The method of automatic extraction uses phrase patterns and the GENA protein, gene, and compound name dictionary, which was developed by our group. With this database, pathways are easily compared among species using data with more than 47,000 protein interactions and protein kinase ortholog tables. The database is available for querying and browsing at http://kinasedb.ontology.ims.u-tokyo.ac.jp/. PMID:12799355
An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system

DOE PAGES

AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

2015-11-19

Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis.
Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
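The record above summarizes binding affinities as position weight matrices. For readers unfamiliar with the term, here is a minimal Python sketch of one common way to build a log-odds position weight matrix from aligned binding sites and to score a candidate sequence with it. The uniform background, the pseudocount, the four example sites and the function names are all assumptions made for illustration. They are not taken from the database, and the database's own matrices may be derived differently.

```python
import math

BASES = "ACGT"


def position_weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds matrix (one dict per position) from aligned sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def score(pwm, sequence):
    """Sum the per-position log-odds for a sequence of matching length."""
    if len(sequence) != len(pwm):
        raise ValueError("sequence length must match the matrix length")
    return sum(column[base] for column, base in zip(pwm, sequence))


# Invented aligned sites for illustration only.
toy_sites = ["TTGACA", "TTGACA", "TTTACA", "TAGACA"]
toy_pwm = position_weight_matrix(toy_sites)

print(score(toy_pwm, "TTGACA") > score(toy_pwm, "CCCCCC"))
```

A higher score means the sequence looks more like the aligned sites than like the background, which is how a matrix of this kind gives a per-basepair summary of binding preference.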
Cloning and analysis of DnaJ family members in the silkworm, Bombyx mori.

PubMed

Li, Yinü; Bu, Cuiyu; Li, Tiantian; Wang, Shibao; Jiang, Feng; Yi, Yongzhu; Yang, Huipeng; Zhang, Zhifang

2016-01-15

Heat shock proteins (Hsps) are involved in a variety of critical biological functions, including protein folding, degradation, and translocation and macromolecule assembly, and act as molecular chaperones during periods of stress by binding to other proteins. Using expressed sequence tag (EST) and silkworm (Bombyx mori) transcriptome databases, we identified 27 cDNA sequences encoding the conserved J domain, which is found in DnaJ-type Hsps. Of the 27 J domain-containing sequences, 25 were complete cDNA sequences. We divided them into three types according to the number and presence of conserved domains. By analyzing the gene structures, intron numbers, and conserved domains and constructing a phylogenetic tree, we found that the DnaJ family had undergone convergent evolution, obtaining new domains to expand the diversity of its family members. The acquisition of the new DnaJ domains most likely occurred prior to the evolutionary divergence of prokaryotes and eukaryotes. The expression of DnaJ genes in the silkworm was generally higher in the fat body. The tissue distribution of DnaJ1 proteins was detected by western blotting, demonstrating that in the fifth-instar larvae, the DnaJ1 proteins were expressed at their highest levels in hemocytes, followed by the fat body and head. We also found that the DnaJ1 transcripts were likely differentially translated in different tissues. Using immunofluorescence cytochemistry, we revealed that in the blood cells, DnaJ1 was mainly localized in the cytoplasm. Copyright © 2015 Elsevier B.V. All rights reserved.
Database-independent Protein Sequencing (DiPS) Enables Full-length de Novo Protein and Antibody Sequence Determination.

PubMed

Savidor, Alon; Barzilay, Rotem; Elinger, Dalia; Yarden, Yosef; Lindzen, Moshit; Gabashvili, Alexandra; Adiv Tal, Ophir; Levin, Yishai

2017-06-01

Traditional "bottom-up" proteomic approaches use proteolytic digestion, LC-MS/MS, and database searching to elucidate peptide identities and their parent proteins. Protein sequences absent from the database cannot be identified, and even if present in the database, complete sequence coverage is rarely achieved even for the most abundant proteins in the sample. Thus, sequencing of unknown proteins such as antibodies or constituents of metaproteomes remains a challenging problem. To date, there is no available method for full-length protein sequencing, independent of a reference database, in high throughput. Here, we present Database-independent Protein Sequencing, a method for unambiguous, rapid, database-independent, full-length protein sequencing. The method is a novel combination of non-enzymatic, semi-random cleavage of the protein, LC-MS/MS analysis, peptide de novo sequencing, extraction of peptide tags, and their assembly into a consensus sequence using an algorithm named "Peptide Tag Assembler." As proof-of-concept, the method was applied to samples of three known proteins representing three size classes and to a previously un-sequenced, clinically relevant monoclonal antibody. Excluding leucine/isoleucine and glutamic acid/deamidated glutamine ambiguities, end-to-end full-length de novo sequencing was achieved with 99-100% accuracy for all benchmarking proteins and the antibody light chain. Accuracy of the sequenced antibody heavy chain, including the entire variable region, was also 100%, but there was a 23-residue gap in the constant region sequence. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
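The abstract above describes assembling overlapping peptide tags into a consensus sequence. The published "Peptide Tag Assembler" is not specified in this record, so the Python below is not that algorithm. It is a generic greedy suffix-prefix overlap merge, included only to show the basic idea of stitching overlapping fragments together. The tags, the example sequence, the minimum overlap and the function names are all hypothetical, and the sketch ignores sequencing errors and the leucine/isoleucine ambiguity mentioned in the abstract.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that is a prefix of b, or 0 if below min_len."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(tags, min_overlap=3):
    """Repeatedly merge the pair of fragments with the longest overlap."""
    unique = set(tags)
    # Drop tags that are fully contained in another tag.
    frags = sorted(t for t in unique if not any(t != u and t in u for u in unique))
    while len(frags) > 1:
        best = (0, None, None)
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_overlap)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            break  # no remaining overlaps; return the separate contigs
        merged = frags[i] + frags[j][k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (i, j)] + [merged]
    return frags


# Hypothetical tags covering a made-up 16-residue sequence.
toy_tags = ["MKTAYIA", "AYIAKQR", "KQRQISF", "QISFVK", "TAYI"]

print(greedy_assemble(toy_tags))
```

When coverage has a gap, the function returns more than one contig rather than guessing, which loosely mirrors the 23-residue gap reported for the antibody heavy chain.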
pseudoMap: an innovative and comprehensive resource for identification of siRNA-mediated mechanisms in human transcribed pseudogenes.

PubMed

Chan, Wen-Ling; Yang, Wen-Kuang; Huang, Hsien-Da; Chang, Jan-Gowth

2013-01-01

RNA interference (RNAi) is a gene silencing process within living cells, which is controlled by the RNA-induced silencing complex in a sequence-specific manner. In flies and mice, pseudogene transcripts can be processed into short interfering RNAs (siRNAs) that regulate protein-coding genes through the RNAi pathway. Following these findings, we constructed an innovative and comprehensive database to elucidate siRNA-mediated mechanisms in human transcribed pseudogenes (TPGs). To investigate TPGs producing siRNAs that regulate protein-coding genes, we mapped the TPGs to small RNAs (sRNAs) that were supported by publicly available deep sequencing data from various sRNA libraries and constructed the TPG-derived siRNA-target interactions. In addition, we also show that TPGs can act as a target for miRNAs that actually regulate the parental gene. To enable the systematic compilation and updating of these results and additional information, we have developed a database, pseudoMap, capturing various types of information, including sequence data, TPG and cognate annotation, deep sequencing data, RNA-folding structure, gene expression profiles, miRNA annotation and target prediction. To our knowledge, pseudoMap is the first database to demonstrate two mechanisms of human TPGs: encoding siRNAs and decoying miRNAs that target the parental gene. pseudoMap is freely accessible at http://pseudomap.mbc.nctu.edu.tw/. Database URL: http://pseudomap.mbc.nctu.edu.tw/
Screening key candidate genes and pathways involved in insulinoma by microarray analysis.

PubMed

Zhou, Wuhua; Gong, Li; Li, Xuefeng; Wan, Yunyan; Wang, Xiangfei; Li, Huili; Jiang, Bin

2018-06-01

Insulinoma is a rare type of tumor and its genetic features remain largely unknown. This study aimed to search for potential key genes and relevant enriched pathways of insulinoma. The gene expression data from GSE73338 were downloaded from the Gene Expression Omnibus database. Differentially expressed genes (DEGs) were identified between insulinoma tissues and normal pancreas tissues, followed by pathway enrichment analysis, protein-protein interaction (PPI) network construction, and module analysis. The expression of candidate key genes was validated by quantitative real-time polymerase chain reaction (RT-PCR) in insulinoma tissues. A total of 1632 DEGs were obtained, including 1117 upregulated genes and 514 downregulated genes. Pathway enrichment results showed that upregulated DEGs were significantly implicated in insulin secretion, and downregulated DEGs were mainly enriched in pancreatic secretion. PPI network analysis revealed 7 hub genes with degrees more than 10, including GCG (glucagon), GCGR (glucagon receptor), PLCB1 (phospholipase C, beta 1), CASR (calcium sensing receptor), F2R (coagulation factor II thrombin receptor), GRM1 (glutamate metabotropic receptor 1), and GRM5 (glutamate metabotropic receptor 5). DEGs involved in the significant modules were enriched in calcium signaling pathway, protein ubiquitination, and platelet degranulation. Quantitative RT-PCR data confirmed that the expression trends of these hub genes were similar to the results of bioinformatic analysis. The present study demonstrated that candidate DEGs and enriched pathways were the potential critical molecular events involved in the development of insulinoma, and these findings are useful for better understanding of insulinoma genesis.
Merging in-silico and in vitro salivary protein complex partners using the STRING database: A tutorial.

PubMed

Crosara, Karla Tonelli Bicalho; Moffa, Eduardo Buozi; Xiao, Yizhi; Siqueira, Walter Luiz

2018-01-16

Protein-protein interaction is a common physiological mechanism for protection and actions of proteins in an organism. The identification and characterization of protein-protein interactions in different organisms is necessary to better understand their physiology and to determine their efficacy. In a previous in vitro study using mass spectrometry, we identified 43 proteins that interact with histatin 1. Six previously documented interactors were confirmed and 37 novel partners were identified. In this tutorial, we aimed to demonstrate the usefulness of the STRING database for studying protein-protein interactions. We used an in-silico approach along with the STRING database (http://string-db.org/) and successfully performed a fast simulation of a novel constructed histatin 1 protein-protein network, including both the previously known and the predicted interactors, along with our newly identified interactors. Our study highlights the advantages and importance of applying bioinformatics tools to merge in-silico tactics with experimental in vitro findings for rapid advancement of our knowledge about protein-protein interactions. Our findings also indicate that bioinformatics tools such as the STRING protein network database can help predict potential interactions between proteins and thus serve as a guide for future steps in our exploration of the Human Interactome. Our study highlights the usefulness of the STRING protein database for studying protein-protein interactions. The STRING database can collect and integrate data about known and predicted protein-protein associations from many organisms, including both direct (physical) and indirect (functional) interactions, in an easy-to-use interface. Copyright © 2017 Elsevier B.V. All rights reserved.
Proteomic analysis of Pinus radiata needles: 2-DE map and protein identification by LC/MS/MS and substitution-tolerant database searching.

PubMed

Valledor, Luis; Castillejo, Maria A; Lenz, Christof; Rodríguez, Roberto; Cañal, Maria J; Jorrín, Jesús

2008-07-01

Pinus radiata is one of the most economically important forest tree species, with a worldwide production of around 370 million m³ of wood per year. Current selection of elite trees to be used in conservation and breeding programs requires the physiological and molecular characterization of available populations. To identify key proteins related to tree growth, productivity and responses to environmental factors, a proteomic approach is being utilized. In this paper, we present the first report of the 2-DE protein reference map of physiologically mature P. radiata needles, as a basis for subsequent differential expression proteomic studies related to growth, development, biomass production and responses to stresses. After TCA/acetone protein extraction of needle tissue, 549 ± 21 well-resolved spots were detected in Coomassie-stained gels within the 5-8 pH and 10-100 kDa M(r) ranges. The analytical and biological variances determined for 450 spots were 31 and 42%, respectively. After LC/MS/MS analysis of in-gel tryptic digested spots, proteins were identified by using the novel Paragon algorithm, which tolerates amino acid substitution in the first-pass search. It allowed the confident identification of 115 out of the 150 protein spots subjected to MS, an unusually high percentage for a species with a poor sequence database, as is the case for P. radiata. Proteins were classified into 12 or 18 groups based on their corresponding cell component or biological process/pathway categories, respectively. Carbohydrate metabolism and photosynthetic enzymes predominate in the 2-DE protein profile of P. radiata needles.
Tsetse Salivary Gland Proteins 1 and 2 Are High Affinity Nucleic Acid Binding Proteins with Residual Nuclease Activity

PubMed Central

Caljon, Guy; Ridder, Karin De; Stijlemans, Benoît; Coosemans, Marc; Magez, Stefan; De Baetselier, Patrick; Van Den Abbeele, Jan

2012-01-01

Analysis of the tsetse fly salivary gland EST database revealed the presence of a highly enriched cluster of putative endonuclease genes, including tsal1 and tsal2. Tsal proteins are the major components of tsetse fly (G. morsitans morsitans) saliva where they are present as monomers as well as high molecular weight complexes with other saliva proteins. We demonstrate that the recombinant tsetse salivary gland proteins 1&2 (Tsal1&2) display DNA/RNA non-specific, high affinity nucleic acid binding with KD values in the low nanomolar range and a non-exclusive preference for duplex. These Tsal proteins exert only a residual nuclease activity with a preference for dsDNA in a broad pH range. Knockdown of Tsal expression by in vivo RNA interference in the tsetse fly revealed a partially impaired blood digestion phenotype as evidenced by higher gut nucleic acid, hematin and protein contents. PMID:23110062
Genome scale enzyme–metabolite and drug–target interaction predictions using the signature molecular descriptor

DOE PAGES

Faulon, Jean-Loup; Misra, Milind; Martin, Shawn; ...

2007-11-23

Motivation: Identifying protein enzymatic or pharmacological activities are important areas of research in biology and chemistry. Biological and chemical databases are increasingly being populated with linkages between protein sequences and chemical structures. Additionally, there is now sufficient information to apply machine-learning techniques to predict interactions between chemicals and proteins at a genome scale. Current machine-learning techniques use as input either protein sequences and structures or chemical information. We propose here a method to infer protein–chemical interactions using heterogeneous input consisting of both protein sequence and chemical information. Results: Our method relies on expressing proteins and chemicals with a common cheminformatics representation. We demonstrate our approach by predicting whether proteins can catalyze reactions not present in training sets. We also predict whether a given drug can bind a target, in the absence of prior binding information for that drug and target. Lastly, such predictions cannot be made with current machine-learning techniques requiring binding information for individual reactions or individual targets.
Promoter activity of polypyrimidine tract-binding protein genes of potato responds to environmental cues.

PubMed

Butler, Nathaniel M; Hannapel, David J

2012-12-01

Polypyrimidine tract-binding (PTB) proteins are RNA-binding proteins that target specific RNAs for post-transcriptional processing by binding cytosine/uracil motifs. PTBs have established functions in a range of RNA processes including splicing, translation, stability and long-distance transport. Six PTB-like genes identified in potato have been grouped into two clades based on homology to other known plant PTBs. StPTB1 and StPTB6 are closely related to a PTB protein discovered in pumpkin, designated CmRBP50, and contain four canonical RNA-recognition motifs. CmRBP50 is expressed in phloem tissues and functions as the core protein of a phloem-mobile RNA/protein complex. Sequence from the potato genome database was used to clone the upstream sequence of these two PTB genes and analyzed to identify conserved cis-elements. The promoter of StPTB6 was enriched for regulatory elements for light and sucrose induction and defense. Upstream sequence of both PTB genes was fused to β-glucuronidase and monitored in transgenic potato lines. In whole plants, the StPTB1 promoter was most active in leaf veins and petioles, whereas StPTB6 was most active in leaf mesophyll. Both genes are active in new tubers and tuber sprouts. StPTB6 expression was induced in stems and stolon sections in response to sucrose and in leaves or petioles in response to light, heat, drought and mechanical wounding. These results show that CmRBP50-like genes of potato exhibit distinct expression patterns and respond to both developmental and environmental cues.
Differential expression proteins associated with bud dormancy release during chilling treatment of tree peony (Paeonia suffruticosa).

PubMed

Zhang, Y X; Yu, D; Tian, X L; Liu, C Y; Gai, S P; Zheng, G S

2015-01-01

Endo-dormant flower buds of tree peony must have sufficient chilling duration to reinitiate growth, which is a major obstacle to the forcing culture of tree peony in winter. We used a combination of two-dimensional gel electrophoresis (2-DE) and matrix-assisted laser desorption/ionisation time of flight/time of flight mass spectrometry (MALDI-TOF/TOF MS) to identify the differentially expressed proteins of tree peony after three different chilling treatments: endo-dormancy, endo-dormancy release and eco-dormancy stages. More than 200 highly reproducible protein spots were detected, and 31 differentially expressed spots (P < 0.05) were selected for further analysis. Finally, 20 protein spots were confidently identified from databases, which were annotated and classified into seven functional categories: response to abiotic or biotic stimulus (four), metabolic processes (four), other binding (three), transcription or transcription regulation (two), biological processes (one), cell biogenesis (one) and unclassified (five). The results of qPCR of five genes were mainly consistent with that of the protein accumulation analysis as determined by 2-DE. This indicated that most of these genes were mainly regulated at transcriptional level. The activity of nitrate reductase and pyruvate dehydrogenase E1 was consistent with the 2-DE results. The proteomic profiles indicated activation of citrate cycle, amino acid metabolism, lipid metabolism, energy production, calcium signalling and cell growth processes by chilling fulfilment to facilitate dormancy release in tree peony. Analysis of functions of identified proteins will increase our knowledge of endo-dormancy release in tree peony. © 2014 German Botanical Society and The Royal Botanical Society of the Netherlands.
METEORIN-LIKE is a cytokine associated with barrier tissues and alternatively activated macrophages

PubMed Central

Ushach, Irina; Burkhardt, Amanda M.; Martinez, Cynthia; Hevezi, Peter A.; Gerber, Peter Arne; Buhren, Bettina Alexandra; Schrumpf, Holger; Valle-Rios, Ricardo; Vazquez, Monica I.; Homey, Bernhard; Zlotnik, Albert

2014-01-01

Cytokines are involved in many functions of the immune system including initiating, amplifying and resolving immune responses. Through bioinformatics analyses of a comprehensive database of gene expression (BIGE: Body Index of Gene Expression) we observed that a small secreted protein encoded by a poorly characterized gene called meteorin-like (METRNL), is highly expressed in mucosal tissues, skin and activated macrophages. Further studies indicate that Metrnl is produced by Alternatively Activated Macrophages (AAM) and M-CSF cultured bone marrow macrophages (M2-like macrophages). In the skin, METRNL is expressed by resting fibroblasts and IFNγ-treated keratinocytes. A screen of human skin-associated diseases showed significant over-expression of METRNL in psoriasis, prurigo nodularis, actinic keratosis and atopic dermatitis. METRNL is also up-regulated in synovial membranes of human rheumatoid arthritis. Taken together, these results indicate that Metrnl represents a novel cytokine, which is likely involved in both innate and acquired immune responses. PMID:25486603
Multidimensional fractionation is a requirement for quantitation of Golgi-resident glycosylation enzymes from cultured human cells.

PubMed

Lin, Chi-Hung; Chik, Jenny H L; Packer, Nicolle H; Molloy, Mark P

2015-02-06

Glycosylation results from the concerted action of glycosylation enzymes in the secretory pathway. In general, gene expression serves as the primary control mechanism, but post-translational fine-tuning of glycosylation enzyme functions is often necessary for efficient synthesis of specific glycan epitopes. While the field of glycomics has rapidly advanced, there is a lack of routine proteomic methods to measure expression of specific glycosylation enzymes needed to fill the gap between mRNA expression and the glycomic profile in a "reverse genomics" workflow. Toward developing this workflow, we enriched Golgi membranes from two human colon cancer cell lines by sucrose density centrifugation and further mass-based fractionation by SDS-PAGE. We then applied mass spectrometry to demonstrate a doubling in the number of Golgi-resident proteins identified, compared to the unenriched, low-speed centrifuged supernatant of lysed cells. A total of 35 Golgi-resident glycosylation enzymes, of which 23 were glycosyltransferases, were identified, making this the largest protein database so far of Golgi-resident glycosylation enzymes experimentally identified in cultured human cells. We developed targeted mass spectrometry assays for specific quantitation of many of these glycosylation enzymes. Our results show that alterations in abundance of glycosylation enzymes at the protein level were generally consistent with the resultant glycomic profiles, but not necessarily with the corresponding glycosyltransferase mRNA expression, as exemplified by the case of O-glycan core 1 T synthase.
Novel Tonoplast Transporters Identified Using a Proteomic Approach with Vacuoles Isolated from Cauliflower Buds

PubMed Central

Schmidt, Ulrike G.; Endler, Anne; Schelbert, Silvia; Brunner, Arco; Schnell, Magali; Neuhaus, H. Ekkehard; Marty-Mazars, Daniéle; Marty, Francis; Baginsky, Sacha; Martinoia, Enrico

2007-01-01

Young meristematic plant cells contain a large number of small vacuoles, while the largest part of the vacuome in mature cells is composed of a large central vacuole, occupying 80% to 90% of the cell volume. Thus far, only a limited number of vacuolar membrane proteins have been identified and characterized. The proteomic approach is a powerful tool to identify new vacuolar membrane proteins. To analyze vacuoles from growing tissues, we isolated vacuoles from cauliflower (Brassica oleracea) buds, which consist of a large number of small cells but also contain cells in expansion as well as fully expanded cells. Here we show that, using purified cauliflower vacuoles and different extraction procedures such as saline, NaOH, acetone, and chloroform/methanol, and analyzing the data against the Arabidopsis (Arabidopsis thaliana) database, 102 cauliflower integral proteins and 214 peripheral proteins could be identified. The vacuolar pyrophosphatase was the most prominent protein. Of the 102 identified proteins, 45 had already been described. Nine of these, corresponding to 46% of peptides detected, are known vacuolar proteins. We identified 57 proteins (55.9%) containing at least one membrane-spanning domain with unknown subcellular localization. A comparison of the newly identified proteins with expression profiles from in silico data revealed that most of them are highly expressed in young, developing tissues. To verify whether the newly identified proteins were indeed localized in the vacuole, we constructed and expressed green fluorescent protein fusion proteins for five putative vacuolar membrane proteins exhibiting three to 11 transmembrane domains. Four of them, a putative organic cation transporter, a nodulin N21 family protein, a membrane protein of unknown function, and a senescence-related membrane protein, were localized in the vacuolar membrane, while a white-brown ATP-binding cassette transporter homolog was shown to reside in the plasma membrane. These results demonstrate that proteomic analysis of highly purified vacuoles from specific tissues allows the identification of new vacuolar proteins and provides an additional view of tonoplastic proteins. PMID:17660356
